[Intel-gfx] [PATCH igt] igt/drv_hangman: Use manual error-state generation

Chris Wilson chris at chris-wilson.co.uk
Thu Oct 20 10:05:13 UTC 2016


On Thu, Oct 20, 2016 at 10:46:01AM +0100, Chris Wilson wrote:
> On Thu, Oct 20, 2016 at 11:29:05AM +0200, Daniel Vetter wrote:
> > On Thu, Oct 20, 2016 at 10:07:39AM +0100, Chris Wilson wrote:
> > > For the basic error state, we only desire that an error state be created
> > > following a hang. For that purpose, we do not need a real hang (slow
> > > 6-12s) but can inject one instead (fast <1s).
> > > 
> > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > 
> > Should we instead speed up hangcheck? I think there's lots of value in
> > making sure not just error dumping, but also hang detection works somewhat
> > in BAT. Since if it doesn't any attempt at a full run will lead to pretty
> > serious disasters. And I have this dream that BAT is the gating thing
> > deciding whether a patch series deserves a complete pre-merge run ;-)
> 
> We have full-hang detection in BAT elsewhere as well. This particular
> test was only asking the question "do we generate an error state", hence
> why I felt it was safe to just do that and skip a simulated hang.
>  
> > But since this is a controlled environment we could make hangcheck
> > super-fast at timing out with some debugfs knob. Would probably also help
> > a lot with speeding up the gazillion testcases in gem_reset_stats.
> 
> I have considered i915.hangcheck_interval_ms many a time. It is not just
> the interval but also the hangcheck score threshold that we have to
> consider. If we can trust our activity detection, we would be safe with a
> hangcheck every jiffy (at some overhead, mind you), but we would declare
> a DoS too soon.

Speaking of which, Mika did have some patches to move towards a time
accrued metric...
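
To make the interval/threshold coupling concrete, here is a rough sketch of
the two approaches. Everything in it is hypothetical and for illustration
only: the function names, the score increment and the threshold are made up
and are not taken from the i915 hangcheck code or from Mika's patches.

```c
/*
 * Hypothetical values for illustration; NOT the driver's real constants.
 */
#define SCORE_PER_STALLED_TICK 1u
#define SCORE_THRESHOLD        4u

/*
 * Score-based detection: each hangcheck tick that sees no progress bumps
 * a score, and a hang is declared once the score reaches the threshold.
 * The time to declare a hang therefore scales with the tick interval.
 */
static unsigned int score_based_detect_ms(unsigned int interval_ms)
{
	unsigned int score = 0, elapsed_ms = 0;

	while (score < SCORE_THRESHOLD) {
		elapsed_ms += interval_ms;
		score += SCORE_PER_STALLED_TICK;
	}
	return elapsed_ms;
}

/*
 * Time-accrued detection: each stalled tick adds the elapsed time, and a
 * hang is declared once the accrued stall time reaches a fixed timeout.
 * The tick interval only sets the granularity, not the deadline.
 */
static unsigned int time_accrued_detect_ms(unsigned int interval_ms,
					   unsigned int timeout_ms)
{
	unsigned int stalled_ms = 0;

	while (stalled_ms < timeout_ms)
		stalled_ms += interval_ms;
	return stalled_ms;
}
```

With these made-up numbers, score_based_detect_ms(1500) gives 6000ms but
score_based_detect_ms(1) gives 4ms, i.e. shrinking the interval alone makes
us declare a hang far too early. time_accrued_detect_ms(1, 6000) still gives
6000ms, so the check can run as often as we like without moving the deadline.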
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

