[Intel-gfx] [PATCH] drm/i915: Upgrade execbuffer fail after resume failure to EIO
Daniel Vetter
daniel at ffwll.ch
Tue Mar 25 16:36:05 CET 2014
On Tue, Mar 25, 2014 at 08:15:54AM +0000, Chris Wilson wrote:
> On Tue, Mar 25, 2014 at 09:07:00AM +0100, Daniel Vetter wrote:
> > On Tue, Mar 25, 2014 at 9:03 AM, Chris Wilson <chris at chris-wilson.co.uk> wrote:
> > > If we try to execute on a known ring, but it has failed to be
> > > initialised correctly, report that the GPU is hung rather than the
> > > command invalid. This leaves us reporting EINVAL only if the user
> > > requests execution on a ring that is not supported by the device.
> > >
> > > This should prevent UXA from getting stuck in a null render loop after a
> > > failed resume.
> > >
> > > Reported-by: Jiri Kosina <jikos at jikos.cz>
> > > References: https://bugs.freedesktop.org/show_bug.cgi?id=76554
> > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >
> > Feels a bit like duct-tape ... Shouldn't we instead stop tearing down
> > ringbuffer structures over suspend/resume? And maybe the same for init
> > with your patch applied.
>
> Even if we did, we would still end up with g45 failing to restart
> the ring at random, so we need some w/a.
>
> > Or we simply check for wedged first thing in execbuf ...
>
> See the first 2 patches ;-) The first is actually essential as we have
> no other guard against writing into the non-existent ring.
>
> I thought about doing that. However, I prefer doing arg validation
> first i.e. get all (or as much as is feasible) of the EINVAL checks out
> of the way so that we avoid touching hardware or leaking any information
> to a malicious client. The problem in this case is where is not actually
> an invalid arg.
>
> Note that we will detect the EIO later before touching hardware (so long
> as the first two patches are in place).
Yeah I've seen the other patches. I think we should try to keep all the
ring structures around even when the hw init failed. I've made some feeble
attempts a while ago to split the structure init from the hw init stuff,
but kinda never fully materialized ...
Imo if our set of valid rings semi-randomly changes at runtime even,
that's not good.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
More information about the Intel-gfx
mailing list