[Intel-gfx] [PATCH] drm/i915: Fix system hang with EI UP masked on Haswell

Mika Kuoppala mika.kuoppala at linux.intel.com
Thu Apr 13 11:58:23 UTC 2017


Chris Wilson <chris at chris-wilson.co.uk> writes:

> On Thu, Apr 13, 2017 at 02:15:27PM +0300, Mika Kuoppala wrote:
>> Previously, commit a9c1f90c8e17
>> ("drm/i915: Don't mask EI UP interrupt on IVB|SNB") found that a
>> seemingly unrelated bit (GEN6_PM_RP_UP_EI_EXPIRED) had to be
>> unmasked on IVB and SNB in order to prevent a system hang
>> with chained batchbuffers.
>> 
>> Our CI was seeing incomplete results with tests that used
>> chained batches and it was found out that HSW needs to have this
>> same bit unmasked to reliably survive chained batches.
>> 
>> Always unmask GEN6_PM_RP_UP_EI_EXPIRED on Haswell to
>> prevent system hang with batch chaining.
>> 
>> Testcase: igt/gem_exec_fence/nb-await-default
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100672
>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>> Cc: stable at vger.kernel.org
>> Signed-off-by: Mika Kuoppala <mika.kuoppala at intel.com>
>
> * facepalm.
>
> I am amazed that took so long for us to notice.

It could be that we don't exercise chained batches much in CI.
Also, it seems to be more subtle than with IVB. With a plain
spin batch it didn't surface, but with nb-await-default
the store/spin and possibly(?) the CPU-side sleep
lured it out.

> Acked-by: Chris Wilson <chris at chris-wilson.co.uk>

Thanks.

>
> Did we ever get a w/a identifier for this?

Not that I know of. And in retrospect, excluding
HSW in the original patch was not wise. It was in v3
that it got excluded, but I didn't find the trail that
led there. Trusting it not to inherit the peculiarities...

I like to think that we tested and it never hung with
straight up busy chaining. nb-await-default is
more sophisticated.
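
For illustration, the masking change discussed above amounts to
roughly the following. This is only a minimal sketch, not the
actual diff: the platform enum, the function name and the bit
position are assumptions made for the example, and it assumes
that a set bit in the mask means the interrupt is masked.

```c
#include <stdint.h>

/* Assumed bit position, for illustration only; see i915_reg.h for the real one. */
#define GEN6_PM_RP_UP_EI_EXPIRED (1u << 2)

/* Hypothetical platform tag; the driver identifies platforms differently. */
enum platform {
	PLATFORM_SNB,
	PLATFORM_IVB,
	PLATFORM_HSW,
	PLATFORM_OTHER,
};

/*
 * Hypothetical helper: clear (unmask) the EI UP bit on the platforms
 * that were seen to hang with chained batches when it stays masked.
 * HSW is now treated the same way as SNB and IVB.
 */
static uint32_t sanitize_pm_mask(enum platform p, uint32_t mask)
{
	switch (p) {
	case PLATFORM_SNB:
	case PLATFORM_IVB:
	case PLATFORM_HSW:
		mask &= ~GEN6_PM_RP_UP_EI_EXPIRED;
		break;
	default:
		/* Other platforms keep whatever the caller asked for. */
		break;
	}
	return mask;
}
```

The point is just that whatever mask the rest of the code computes,
the EI UP bit never ends up masked on these three platforms.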

-Mika