[Intel-gfx] [PATCH 1/3] drm/i915: Always sanity check engine state upon idling

Chris Wilson chris at chris-wilson.co.uk
Tue Aug 29 13:19:47 UTC 2017


Quoting Joonas Lahtinen (2017-08-29 14:07:40)
> On Sat, 2017-08-26 at 12:09 +0100, Chris Wilson wrote:
> > When we do a locked idle we know that afterwards all requests have been
> > completed and the engines have been cleared of tasks. For whatever
> > reason, this doesn't always happen and we may go into a suspend with
> > ELSP still full, and this causes an issue upon resume as we get very,
> > very confused.
> > 
> > If the engines refuse to idle, mark the device as wedged. In the process
> > we get rid of the maybe unused open-coded version of wait_for_engines
> > reported by Nick Desaulniers and Matthias Kaehlcke.
> > 
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=101891
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> > Cc: Matthias Kaehlcke <mka at chromium.org>
> 
> I assume GEM_WARN_ON -> DRM_ERROR was intentional.

Yes. The first time the unused function was reported, the thread drifted
towards "we probably want to always do the test" rather than only in CI.
Now that we have seen glk actually fail in this way, we need to code
defensively here (it is no longer just a theoretical programming error).
As it is still a hw issue, we want the warning (especially as this will
cause the suspend to fail, and we want a reason why), and as a general
rule all wedging should report an error (because it is a last resort
around driver bugs).
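
A minimal sketch of that policy, for illustration only. It is plain
userspace C, not the i915 code: the mock_* names, the poll-count timeout
and the -EIO return value are all assumptions made for the example. The
point is just that the idle check always runs, and a failure is both
logged as an error and turned into a wedged device.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for driver state; not the real i915 structures. */
struct mock_engine {
	const char *name;
	int busy_polls;	/* polls left until idle; negative means never idles */
};

struct mock_device {
	struct mock_engine *engines;
	int num_engines;
	bool wedged;
};

/* Report whether one engine is idle, consuming one "busy" poll if not. */
static bool mock_engine_is_idle(struct mock_engine *e)
{
	if (e->busy_polls == 0)
		return true;
	if (e->busy_polls > 0)
		e->busy_polls--;
	return false;
}

/* Poll every engine up to max_polls + 1 times; true once all are idle. */
static bool mock_wait_for_engines(struct mock_device *dev, int max_polls)
{
	for (int poll = 0; poll <= max_polls; poll++) {
		bool all_idle = true;

		for (int i = 0; i < dev->num_engines; i++)
			if (!mock_engine_is_idle(&dev->engines[i]))
				all_idle = false;

		if (all_idle)
			return true;
	}
	return false;
}

/*
 * Always perform the check (not only in debug/CI builds). If the engines
 * refuse to idle, say why and mark the device wedged so the caller can
 * fail the suspend with a visible reason.
 */
static int mock_idle_device(struct mock_device *dev, int max_polls)
{
	if (mock_wait_for_engines(dev, max_polls))
		return 0;

	fprintf(stderr, "mock: failed to idle engines, declaring wedged!\n");
	dev->wedged = true;
	return -EIO;	/* assumed error code for the sketch */
}
```

Calling mock_idle_device() on a device whose engines all drain within the
poll budget returns 0 and leaves wedged false; a device with one engine
that never drains gets the error message, wedged set, and a negative
return that the (hypothetical) suspend path can propagate.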
-Chris

