[Intel-gfx] [PATCH 2/2] drm/i915: Force reset on unready engine

Mika Kuoppala mika.kuoppala at linux.intel.com
Mon Aug 13 11:02:51 UTC 2018


Chris Wilson <chris at chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2018-08-13 11:42:42)
>> If engine reports that it is not ready for reset, we
>> give up. Evidence shows that forcing a per engine reset
>> on an engine which is not reporting to be ready for reset,
>> can bring it back into a working order. There is risk that
>> we corrupt the context image currently executing on that
>> engine. But that is a risk worth taking as if we unblock
>> the engine, we prevent a whole device wedging in a case
>> of full gpu reset.
>> 
>> Reset individual engine even if it reports that it is not
>> prepared for reset, but only if we aim for full gpu reset
>> and not on first reset attempt.
>> 
>> v2: force reset only on later attempts, readability (Chris)
>> v3: simplify with adequate caffeine levels (Chris)
>> 
>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>> Signed-off-by: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
>
> One last thing, you said you recalled one of the reasons for its
> existence was to prevent machine lockups on kbl. Is the recollection
> true? Do we want to leave a comment in case of fire?

We got machine lockups if we reset an active, non-stopped
engine inside a batchbuffer. i915_stop_engines() arose from that,
and we have a comment in intel_gpu_reset() explaining it. That lockup
apparently happened regardless of the ready-to-reset ack.

How I read it is that we got ready-to-reset acks on
active engines, which then died when we proceeded. So this
patch should not make things worse, as i915_stop_engines()
has held water so far.
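
For reference, the policy from the commit message can be sketched as a
small standalone C model. This is not the i915 code: struct engine_model,
the ALL_ENGINES value and should_reset_engine() are hypothetical names
made up for illustration. It only assumes what the commit message says,
namely that a non-ready engine is reset anyway when we aim for a full
gpu reset and it is not the first attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mask meaning "every engine", i.e. a full gpu reset. */
#define ALL_ENGINES 0xffu

/* Hypothetical stand-in for the per-engine state we care about. */
struct engine_model {
	bool ready_for_reset;	/* did the engine ack ready-to-reset? */
};

/*
 * Decide whether to go ahead with resetting one engine.
 * attempt is 0 on the first reset attempt and grows on retries.
 */
static bool should_reset_engine(const struct engine_model *engine,
				unsigned int engine_mask,
				unsigned int attempt)
{
	assert(engine != NULL);

	/* An engine that acked ready-to-reset is always reset. */
	if (engine->ready_for_reset)
		return true;

	/*
	 * Not ready: force the reset only when we are going for a
	 * full gpu reset and this is not the first attempt.
	 */
	return engine_mask == ALL_ENGINES && attempt > 0;
}
```

In this model a per-engine reset of a non-ready engine still gives up,
as before; only a retried full gpu reset forces it.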

-Mika *knocks on wood*


