[Intel-gfx] [PATCH] drm/i915: don't bail out of intel_wait_ring_buffer too early
Chris Wilson
chris at chris-wilson.co.uk
Tue Oct 18 17:24:40 CEST 2011
On Tue, 11 Oct 2011 19:25:57 +0200, Daniel Vetter <daniel.vetter at ffwll.ch> wrote:
> In the pre-gem days with non-existing hangcheck and gpu reset code,
> this timeout of 3 seconds was pretty important to avoid stuck
> processes.
>
> But now we have the hangcheck code in gem that goes to great length
> to ensure that the gpu is really dead before declaring it wedged.
>
> So there's no need for this timeout anymore. Actually it's even harmful
> because we can bail out too early (e.g. with xscreensaver slip)
> when running giant batchbuffers. And our code isn't robust enough
> to properly unroll any state-changes, we pretty much rely on the gpu
> reset code cleaning up the mess (like cache tracking, fencing state,
> active list/request tracking, ...).
>
> With this change intel_begin_ring can only fail when the gpu is
> wedged, and it will return -EAGAIN (like wait_request in case the
> gpu reset is still outstanding).
>
> v2: Chris Wilson noted that on resume timers aren't running and hence
> we won't ever get kicked out of this loop by the hangcheck code. Use
> an insanely large timeout instead for the HAS_GEM case to prevent
> resume bugs from totally hanging the machine.
>
> Signed-off-by: Daniel Vetter <daniel.vetter at ffwll.ch>
Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
Personally I would have done:
end = jiffies;
if (has_gem)
end += infinity;
else
end += 3s;
but that is one nitpick too far ;-)
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
More information about the Intel-gfx
mailing list