[Intel-gfx] [PATCH] drm/i915: Re-enable per-engine reset for Broxton

Chris Wilson chris at chris-wilson.co.uk
Wed Sep 6 15:56:33 UTC 2017


Quoting Michel Thierry (2017-09-06 16:25:06)
> On 05/09/17 06:57, Chris Wilson wrote:
> > Quoting Chris Wilson (2017-08-21 15:55:34)
> >> Quoting Michel Thierry (2017-08-18 18:23:42)
> >>> The corruption in CSB mmio reads we were seeing has been tracked down to
> >>> incorrectly touching forcewake of all domains, following an engine reset.
> >>> It is still a mystery why we only caught this on Broxton, since it
> >>> could happen on any platform.
> >>>
> >>> With that fix already merged, commit 4055dc75d6b5 ("drm/i915: Stop
> >>> touching forcewake following a gen6+ engine reset"), let's try to enable
> >>> per-engine resets on Broxton one more time.
> >>>
> >>> This reverts commit f188258bde0f ("drm/i915: Disable per-engine reset for
> >>> Broxton").
> >>>
> >>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> >>> Signed-off-by: Michel Thierry <michel.thierry at intel.com>
> >>
> >> My bxt has survived about 72 hours of hang testing, which is far more
> >> than it was able to previously.
> >>
> >> Acked-by: Chris Wilson <chris at chris-wilson.co.uk>
> >> Tested-by: Chris Wilson <chris at chris-wilson.co.uk>
> > 
> > Uh oh, seemingly just hit it again...
> 
> Was it because the CSBs were 0's?
> 
> A couple of times I saw a spurious CSB event (0x12 - preempted & 
> complete), after an already 'complete' event. That was also hitting the 
> assert because the ctx-id would be 'wrong'. I think we could ignore the 
> 0x12 event and it will continue.

Hmm, that 0x12 event has never triggered the invalid ctx id for me yet
(but that's probably just a matter of workload); it always hits the
too-many-switches.  Sadly we can't just continue on after that, as the hw
is completely out-of-sync with our submissions, and the only way to
recover appears to be a gpu reset.
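
For reference, a minimal sketch of how a CSB status word such as 0x12
breaks down into flags. It assumes the gen8 execlists bit layout (bit 1 =
preempted, bit 4 = complete, which matches the "preempted & complete"
reading above). The macro names and both helpers are illustrative only,
not the driver's actual code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed CSB status bits (gen8 execlists layout); names are illustrative. */
#define CTX_STATUS_IDLE_ACTIVE    (1u << 0)
#define CTX_STATUS_PREEMPTED      (1u << 1)
#define CTX_STATUS_ELEMENT_SWITCH (1u << 2)
#define CTX_STATUS_ACTIVE_IDLE    (1u << 3)
#define CTX_STATUS_COMPLETE       (1u << 4)

/* Hypothetical helper: true if both "preempted" and "complete" are set. */
static bool csb_is_preempted_complete(uint32_t status)
{
	const uint32_t mask = CTX_STATUS_PREEMPTED | CTX_STATUS_COMPLETE;

	return (status & mask) == mask;
}

/*
 * Hypothetical helper: write the names of the set flags into buf,
 * separated by '|'. Returns the number of characters written,
 * stopping early if buf is too small for the next name.
 */
static size_t csb_describe(uint32_t status, char *buf, size_t len)
{
	static const struct {
		uint32_t bit;
		const char *name;
	} flags[] = {
		{ CTX_STATUS_IDLE_ACTIVE,    "idle-active" },
		{ CTX_STATUS_PREEMPTED,      "preempted" },
		{ CTX_STATUS_ELEMENT_SWITCH, "element-switch" },
		{ CTX_STATUS_ACTIVE_IDLE,    "active-idle" },
		{ CTX_STATUS_COMPLETE,       "complete" },
	};
	size_t used = 0;
	size_t i;

	if (len)
		buf[0] = '\0';

	for (i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
		int n;

		if (!(status & flags[i].bit))
			continue;

		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", flags[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* out of space, keep what fits */
		used += (size_t)n;
	}

	return used;
}
```

Calling csb_describe(0x12, buf, sizeof(buf)) with a 64-byte buffer fills
buf with "preempted|complete", and csb_is_preempted_complete(0x12) is
true, whereas a plain complete event (0x10) is not flagged.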

Anyway, haven't yet dug back into the bang, just reaffirmed that
disabling per-engine resets gives me a
ickle at broxton:~$ uptime
 16:55:31 up 1 day,  2:01,  2 users,  load average: 3.66, 3.38, 3.3
so far of drv_selftest --r live_hangcheck
-Chris

