<div class="gmail_quote">On Wed, Dec 14, 2011 at 10:56, Daniel Vetter <span dir="ltr">&lt;<a href="mailto:daniel.vetter@ffwll.ch">daniel.vetter@ffwll.ch</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
In the pre-gem days, with no hangcheck or gpu reset code,<br>
this timeout of 3 seconds was pretty important to avoid stuck<br>
processes.<br>
<br>
But now we have the hangcheck code in gem that goes to great lengths<br>
to ensure that the gpu is really dead before declaring it wedged.<br>
<br>
So there's no need for this timeout anymore. It's actually even harmful,<br>
because we can bail out too early (e.g. with xscreensaver slip)<br>
when running giant batchbuffers. And our code isn't robust enough<br>
to properly unroll any state changes; we pretty much rely on the gpu<br>
reset code to clean up the mess (like cache tracking, fencing state,<br>
active list/request tracking, ...).<br>
<br>
With this change intel_begin_ring can only fail when the gpu is<br>
wedged, and it will return -EAGAIN (like wait_request in case the<br>
gpu reset is still outstanding).<br>
<br>
v2: Chris Wilson noted that on resume timers aren't running and hence<br>
we won't ever get kicked out of this loop by the hangcheck code. Use<br>
an insanely large timeout instead for the HAS_GEM case to prevent<br>
resume bugs from totally hanging the machine.<br>
<br>
Signed-off-by: Daniel Vetter &lt;<a href="mailto:daniel.vetter@ffwll.ch">daniel.vetter@ffwll.ch</a>&gt;<br>
Reviewed-by: Chris Wilson &lt;<a href="mailto:chris@chris-wilson.co.uk">chris@chris-wilson.co.uk</a>&gt;<br>
Acked-by: Ben Widawsky &lt;<a href="mailto:ben@bwidawsk.net">ben@bwidawsk.net</a>&gt;<br></blockquote><div><br><div><br>
Reviewed-by: Eugeni Dodonov &lt;<a href="mailto:eugeni.dodonov@intel.com">eugeni.dodonov@intel.com</a>&gt;<br>
<br>
</div></div></div>-- <br>Eugeni Dodonov<a href="http://eugeni.dodonov.net/" target="_blank"><br></a><br>