[Intel-gfx] NULL pointer dereferences in drm_mode_copy() and drm_crtc_index()

Michael Kaminsky kaminsky at cs.cmu.edu
Mon Jul 6 19:25:16 PDT 2015


On 07/06/2015 11:24 AM, Daniel Vetter wrote:
> On Fri, Jul 03, 2015 at 02:11:37PM -0400, Michael Kaminsky wrote:
>> A few days ago I built a kernel from git (commit 6aaf0da872), and
>> noticed a couple of NULL pointer dereferences.  These seem to be
>> regressions as they aren't present in v4.1.
>>
>> I did a bisect between v4.1 and 6aaf0da872, and came up with the
>> following commit as the first bad one:
>>
>>   d5432a9d  drm/i915: Stage new modeset state straight into atomic state
>>
>> My laptop is a Thinkpad T540p.  The bug manifests itself specifically
>> when I'm connected to my dock.  Starting with this commit, when I plug
>> an external monitor into the dock and then unplug it, I get the NULL
>> pointer dereference in drm_mode_copy (see kernel trace #1 below).  The
>> bug happens during unplug.
>>
>> Plugging/unplugging the same monitor directly into my laptop doesn't
>> seem to tickle the bug.  It also doesn't seem to matter which connector I
>> plug/unplug into on the dock (VGA, DP, etc.).
>>
>> This laptop/dock uses DP MST, so I wonder if that's the problem.  An
>> external VGA monitor connected directly to my laptop shows up as output
>> VGA1, but when that same monitor is hooked up to the dock's VGA port, it
>> shows up as output DP2-3 (for example).
>>
>> That commit is the first place where things seem to go wrong, but later
>> commits actually show a different, but possibly related NULL pointer
>> dereference in drm_crtc_index (see kernel trace #2 below).  In these
>> kernels, I don't even get to the point where I can unplug the monitor.
>> Instead, as soon as I connect two external monitors to my dock, a
>> NULL dereference occurs.  My initial tests show that it seems to
>> happen specifically with 2 external monitors, not 1, and when they are
>> connected to the dock, not the laptop itself.  This bug occurs in commit
>> 6aaf0da872 (my starting point), and I noticed it during my bisect in at
>> least commit 27a1b688, though it might first start occurring earlier.
>> I know that 0f63cca already has the first bug above (unplugging
>> monitor problem).  I suspect that the new problem probably starts
>> between those two commits, but I haven't had the chance to pinpoint
>> it--perhaps this info will be enough to identify the source of both
>> problems, but if not, I can try to dig deeper.
> 
> Yeah mst dp hotplugs connectors, and we've changed a few things in there.
> Can you please boot with drm.debug=0xe added to your kernel cmdline,
> reproduce each issue and then grab the complete kernel log for each case?
> It'll be really big but should help figuring out what's amiss.
> 
> Also please retest with latest drm-next or upstream linus, we've just
> merged a few patches to close some dp mst races.
> 
> Thanks, Daniel

Daniel,

I was able to do some quick testing with a recent upstream linus kernel
(commit 1c4c7159 -- basically one commit after v4.2-rc1).  To keep
things simple, I just tested with this one kernel for now.  This kernel
basically exhibits the second case I described above, but does so even
after attaching a single monitor.

I can trigger the bug as follows:  I boot with my laptop docked, but no
monitors attached to the dock.  Once the machine has booted fully, I
switch to a console.  (Switching to the console is just a convenience
so that I can see the kernel messages immediately; it doesn't seem to
affect the results.)  Then, I plug an external monitor into the VGA port
on the dock.  (I also tried plugging a DP monitor into the dock and got
similar results.)

As soon as I plug in that single external monitor, the NULL pointer
dereference occurs.  It seems repeatable with this kernel.  The log is
here:   http://pastebin.com/dKgxfz4y

Thanks!

Michael

