[Intel-gfx] [PATCH 1/5] drm: Kernel Crash in drm_unlock

Mon May 4 23:37:51 PDT 2015

On Mon, 2015-05-04 at 15:52 +0200, Daniel Vetter wrote:
> On Tue, Apr 28, 2015 at 10:52:32AM +0100, chris at chris-wilson.co.uk wrote:
> > On Tue, Apr 28, 2015 at 10:21:49AM +0100, Dave Gordon wrote:
> > > On 24/04/15 06:52, Antoine, Peter wrote:
> > > > I picked up this work due to the following Jira ticket created by the
> > > > security team (on Android) and was asked to give it a second look and
> > > > found a few more issues with the hw lock code.
> > > > 
> > > > https://jira01.devtools.intel.com/browse/GMINL-5388
> > > > I/O control on /dev/dri/card0 crashes the kernel (0x4008642b)
> > > > 
> > > > It also stops Linux, as it kills the driver; I guess it might be possible
> > > > to reload the gfx driver. On an unpatched system, the test that is
> > > > included in the issue, or the igt test that has been posted for the issue,
> > > > will show the problem.
> > > > 
> > > > I ran the test on an unpatched system here and the gui stopped and the
> > > > keyboard stopped responding, so I rebooted. With the patched system I
> > > > did not need to reboot.
> > > > 
> > > > Should I change the SIGTERM to SIGSEGV? It is not quite the same thing, but
> > > > tooling is better at handling a segfault than a SIGTERM, and the
> > > > application that calls this IOCTL is using an uninitialised hw lock, so
> > > > it is kind of the same as dereferencing an uninitialised pointer (kind
> > > > of). Or I could just remove it, but the bug has been in the code for at
> > > > least two years (and known about), and I would guess that any code that
> > > > is calling this is fuzzing the IOCTLs (as this is how the security team
> > > > found it), and we should reward them with an application exit.
> > > > 
> > > > Peter. 
> > > 
> > > SIGSEGV would be a better choice.
> > > 
> > > SIGTERM is normally sent by a user -- it's the default signal sent by
> > > kill(1). It's also commonly used to tell a long-running daemon process
> > > to tidy up and exit cleanly.
> > > 
> > > SIGSEGV commonly means "you accessed something that doesn't exist/isn't
> > > mapped/you don't have permissions for". There are specific subcases that
> > > can be indicated via the siginfo data; this is from the sigaction(2)
> > > manpage:
> > > 
> > >     The following values can be placed in si_code for a SIGSEGV signal:
> > > 
> > >         SEGV_MAPERR    address not mapped to object
> > > 
> > >         SEGV_ACCERR    invalid permissions for mapped object
> > > 
> > > SIGBUS would also be a possibility but that's generally taken to mean
> > > that an access got all the way to some physical bus and then faulted,
> > > whereas SIGSEGV suggests the access was rejected during the
> > > virtual-to-physical mapping process.
> > 
> > None of the above. Just return -EINVAL, -EPERM, -EACCES as appropriate.
> 
> Seconded, we really don't want to be in the business of fixing up the drm
> design mistakes of the past 15 years. As long as we can fully lock out
> this particular dragon when running i915 we're imo good enough. The dri1
> design of a kernel shim driver cooperating with the ums driver for hw
> ownership is fundamentally unfixable.
> 
> Also we can't change any of it for drivers actually using it since it'll
> break them, which is a big no-go.
> -Daniel

I will remove it. But if you are using this code path, the driver/kernel
will have crashed. It covers a NULL pointer dereference, so we are not
changing the API that anyone is actually using.

Peter.
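
For illustration only, the approach agreed on in the thread (return an error
code instead of raising a signal when the hw lock was never set up) might look
roughly like the userspace sketch below. This is not the actual patch or the
real drm code: struct sketch_master, struct sketch_hw_lock, SKETCH_LOCK_HELD
and sketch_unlock() are hypothetical names invented for this example, and the
choice of -EINVAL for the uninitialised case is an assumption.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the legacy lock state; NOT the real drm types. */
struct sketch_hw_lock {
    unsigned int lock;              /* top bit = held, low bits = context id */
};

struct sketch_master {
    struct sketch_hw_lock *hw_lock; /* stays NULL until the lock is set up */
};

#define SKETCH_LOCK_HELD 0x80000000u

/*
 * Unlock handler sketch: reject the call with an error code when the lock
 * was never initialised, instead of dereferencing a NULL pointer or sending
 * a signal to the caller.
 */
int sketch_unlock(struct sketch_master *master)
{
    if (master == NULL || master->hw_lock == NULL)
        return -EINVAL;             /* assumed error code for "no hw lock" */

    /* Normal path: clear the held bit, keep the context id. */
    master->hw_lock->lock &= ~SKETCH_LOCK_HELD;
    return 0;
}
```

A caller fuzzing the ioctl with no lock set up would then simply see the call
fail with an error, and a caller that did initialise the lock would see no
change in behaviour.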