[Intel-gfx] [PATCH 1/3] drm/i915: Always sanity check engine state upon idling

Tue Aug 29 13:55:46 UTC 2017

Quoting Mika Kuoppala (2017-08-29 14:36:57)
> Chris Wilson <chris at chris-wilson.co.uk> writes:
> 
> > When we do a locked idle we know that afterwards all requests have been
> > completed and the engines have been cleared of tasks. For whatever
> > reason, this doesn't always happen and we may go into a suspend with
> > ELSP still full, and this causes an issue upon resume as we get very,
> > very confused.
> >
> > If the engines refuse to idle, mark the device as wedged. In the process
> > we get rid of the maybe unused open-coded version of wait_for_engines
> > reported by Nick Desaulniers and Matthias Kaehlcke.
> >
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=101891
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> > Cc: Matthias Kaehlcke <mka at chromium.org>
> 
> I noticed that when we actually do switch to kernel context, it's
> async. And then we always do wait for idle.
> 
> So as all our usage is sync, why don't we just wait on the request in
> i915_gem_switch_to_kernel_context(i915) to pinpoint the request that
> failed to complete? And in addition have this as further hardening.

They are separate for historical reasons, i.e. they have been used
independently. Note that the switch to kernel context may emit anywhere
between zero and one request per engine to wait upon, and yet we still
want to wait for idle even when it emits none.

However, we can move the wait-for-idle into switch-to-kernel-context as
that is common across all callers at present.
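
To make the suggestion concrete, here is a rough sketch of folding the
wait into the switch. It is a toy model, not i915 code: every toy_*
name, the struct layout and the "stuck" flag are made up for
illustration, and the real driver tracks requests and engine state quite
differently. The points it shows are that the wait runs even for an
engine that needed no switch request, and that an engine refusing to
idle marks the device as wedged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_ENGINES 4

/* Toy stand-in for an engine; hypothetical, not struct intel_engine_cs. */
struct toy_engine {
	bool on_kernel_context;	/* already running the kernel context? */
	int pending;		/* requests still in flight */
	bool stuck;		/* simulates an engine that refuses to idle */
};

/* Toy stand-in for the device; hypothetical. */
struct toy_device {
	struct toy_engine engine[NUM_ENGINES];
	bool wedged;
};

/*
 * Queue the switch for one engine. Returns the number of requests
 * emitted: zero if the engine is already on the kernel context, one
 * otherwise.
 */
static int toy_queue_switch(struct toy_engine *e)
{
	if (e->on_kernel_context)
		return 0;

	e->pending++;
	e->on_kernel_context = true;
	return 1;
}

/* Retire everything in flight; false if the engine is still busy. */
static bool toy_wait_for_idle(struct toy_engine *e)
{
	if (!e->stuck)
		e->pending = 0;

	return e->pending == 0;
}

/*
 * Switch and wait in one call. The wait is unconditional, so earlier
 * work is flushed even on engines that emitted no switch request.
 * Returns 0 on success, -1 after marking the device wedged.
 */
static int toy_switch_to_kernel_context_sync(struct toy_device *dev)
{
	size_t i;

	assert(dev != NULL);

	for (i = 0; i < NUM_ENGINES; i++)
		toy_queue_switch(&dev->engine[i]);

	for (i = 0; i < NUM_ENGINES; i++) {
		if (!toy_wait_for_idle(&dev->engine[i])) {
			dev->wedged = true;
			return -1;
		}
	}

	return 0;
}
```

With this shape a caller such as suspend only has to check one return
value, at the cost of losing the ability to queue the switch without
blocking.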

* spots an open-coded switch to kernel context.
-Chris