[Intel-gfx] [PATCH 3/6] drm/i915: don't bail out of intel_wait_ring_buffer too early

Chris Wilson chris at chris-wilson.co.uk
Tue Oct 11 17:53:41 CEST 2011


On Tue, 11 Oct 2011 16:39:11 +0200, Daniel Vetter <daniel.vetter at ffwll.ch> wrote:
> In the pre-gem days with non-existent hangcheck and gpu reset code,
> this timeout of 3 seconds was pretty important to avoid stuck
> processes.
> 
> But now we have the hangcheck code in gem that goes to great lengths
> to ensure that the gpu is really dead before declaring it wedged.
> 
> So there's no need for this timeout anymore. Actually it's even harmful
> because we can bail out too early (e.g. with xscreensaver slip)
> when running giant batchbuffers. And our code isn't robust enough
> to properly unroll any state changes; we pretty much rely on the gpu
> reset code cleaning up the mess (like cache tracking, fencing state,
> active list/request tracking, ...).
> 
> With this change intel_begin_ring can only fail when the gpu is
> wedged, and it will return -EAGAIN (like wait_request in case the
> gpu reset is still outstanding).
> 
> Signed-off-by: Daniel Vetter <daniel.vetter at ffwll.ch>

This makes me nervous, as it wasn't very long ago that we hit this
timeout during resume. (A bug nevertheless, but promoting such to an
infinite loop is worse...)

Can we just make the timeout insanely large in the HAS_GEM case?
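
Roughly what I have in mind, as a user-space sketch rather than a patch
against the driver. Everything here is a hypothetical stand-in: struct
fake_ring, wait_ring_space(), the tick counts and the has_gem flag are
made up for illustration, and -EBUSY on timeout is an assumption. The
only behaviour taken from the patch description is that a wedged gpu
yields -EAGAIN.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the ring state; not the real i915 struct. */
struct fake_ring {
	int space;            /* bytes currently free in the ring */
	int refill_per_tick;  /* bytes the simulated gpu retires per poll */
	bool wedged;          /* set once (simulated) hangcheck gives up */
};

/* Arbitrary illustration values, counted in polling ticks. */
#define LEGACY_TIMEOUT_TICKS 3   /* stands in for the 3 second timeout */
#define GEM_TIMEOUT_TICKS    60  /* "insanely large", but still finite */

/*
 * Wait until at least n bytes are free. Returns 0 on success, -EAGAIN
 * if the gpu is wedged, and -EBUSY (an assumed error code) if the
 * timeout expires first. The loop is always bounded.
 */
static int wait_ring_space(struct fake_ring *ring, int n, bool has_gem)
{
	int limit = has_gem ? GEM_TIMEOUT_TICKS : LEGACY_TIMEOUT_TICKS;
	int tick;

	for (tick = 0; tick < limit; tick++) {
		if (ring->space >= n)
			return 0;
		if (ring->wedged)
			return -EAGAIN;
		/* Simulate the gpu making progress between polls. */
		ring->space += ring->refill_per_tick;
	}

	return ring->space >= n ? 0 : -EBUSY;
}
```

With the larger limit a slow but live gpu gets time to drain a giant
batch, while a gpu that is stuck without ever being declared wedged
still lets the caller return instead of spinning forever.
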
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


