[Intel-gfx] [PATCH] drm/i915: Upgrade execbuffer fail after resume failure to EIO

Chris Wilson chris at chris-wilson.co.uk
Tue Mar 25 09:15:54 CET 2014


On Tue, Mar 25, 2014 at 09:07:00AM +0100, Daniel Vetter wrote:
> On Tue, Mar 25, 2014 at 9:03 AM, Chris Wilson <chris at chris-wilson.co.uk> wrote:
> > If we try to execute on a known ring, but it has failed to be
> > initialised correctly, report that the GPU is hung rather than the
> > command invalid. This leaves us reporting EINVAL only if the user
> > requests execution on a ring that is not supported by the device.
> >
> > This should prevent UXA from getting stuck in a null render loop after a
> > failed resume.
> >
> > Reported-by: Jiri Kosina <jikos at jikos.cz>
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=76554
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> 
> Feels a bit like duct-tape ... Shouldn't we instead stop tearing down
> ringbuffer structures over suspend/resume? And maybe the same for init
> with your patch applied.

Even if we did, we would still end up with g45 failing to restart
the ring at random, so we need some w/a.
 
> Or we simply check for wedged first thing in execbuf ...

See the first 2 patches ;-) The first is actually essential as we have
no other guard against writing into the non-existent ring.

I thought about doing that. However, I prefer doing arg validation
first i.e. get all (or as much as is feasible) of the EINVAL checks out
of the way so that we avoid touching hardware or leaking any information
to a malicious client. The problem in this case is where is not actually
an invalid arg.

Note that we will detect the EIO later before touching hardware (so long
as the first two patches are in place).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre



More information about the Intel-gfx mailing list