[Intel-gfx] [PATCH] drm/i915/selftests: Flush interrupts before disabling tasklets

Mika Kuoppala mika.kuoppala at linux.intel.com
Thu Oct 24 08:06:30 UTC 2019


Chris Wilson <chris at chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2019-10-24 08:21:14)
>> Chris Wilson <chris at chris-wilson.co.uk> writes:
>> 
>> > When setting up the system to perform the atomic reset, we need to
>> > serialise with any ongoing interrupt tasklet or else:
>> >
>> > <0> [472.951428] i915_sel-4442    0d..1 466527056us : __i915_request_submit: rcs0 fence 11659:2, current 0
>> > <0> [472.951554] i915_sel-4442    0d..1 466527059us : __execlists_submission_tasklet: rcs0: queue_priority_hint:-2147483648, submit:yes
>> > <0> [472.951681] i915_sel-4442    0d..1 466527061us : trace_ports: rcs0: submit { 11659:2, 0:0 }
>> > <0> [472.951805] i915_sel-4442    0.... 466527114us : __igt_atomic_reset_engine: i915_reset_engine(rcs0:active) under hardirq
>> > <0> [472.951932] i915_sel-4442    0d... 466527115us : intel_engine_reset: rcs0 flags=11d
>> > <0> [472.952056] i915_sel-4442    0d... 466527117us : execlists_reset_prepare: rcs0: depth<-1
>> > <0> [472.952179] i915_sel-4442    0d... 466527119us : intel_engine_stop_cs: rcs0
>> > <0> [472.952305]   <idle>-0       1..s1 466527119us : process_csb: rcs0 cs-irq head=3, tail=4
>> 
>> Racing and this shows from old world?
>
> We have the same CSB events being seen by process_csb() on two different
> processors. One being issued by the reset in the test, the other by the
> interrupt; this scenario is supposed to be prevented by flushing the
> interrupt tasklet with tasklet_disable() before we enter the atomic
> reset -- but I copied the code to use tasklet_disable_nosync() that is
> meant to only be used from inside the atomic reset after we had serialised
> (or know we are inside the tasklet) with the tasklet. Basically this bug
> is of our own invention because we are bypassing the usual setup in
> order to do engine->reset() from unusual conditions.

After some deep diving into the trace format and into
tasklet_disable_nosync() vs tasklet_disable(), I agree with your reading
of the trace and with the patch.

I don't know where you copied the nosync variant from, but I did look
at preempt_reset() and it can get away with the nosync trick because it
runs inside the submission tasklet.
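
For anyone following along, the difference can be modelled in userspace.
This is only a sketch, not i915 or kernel code: struct model_tasklet, the
model_*() helpers and finished_when_disable_returns() are hypothetical
names, and the assumption (as I read the tasklet code) is that
tasklet_disable() amounts to tasklet_disable_nosync() followed by a wait
for any callback that is already running.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a tasklet; not the kernel struct. */
struct model_tasklet {
	atomic_int count;     /* non-zero: disabled, callback must not start */
	atomic_bool running;  /* stands in for TASKLET_STATE_RUN */
	atomic_bool entered;  /* callback has started */
	atomic_bool release;  /* caller allows the callback to finish */
	atomic_bool finished; /* callback has returned */
};

/* Stand-in for process_csb(): stays in flight until released. */
static void model_callback(struct model_tasklet *t)
{
	atomic_store(&t->entered, true);
	while (!atomic_load(&t->release))
		sched_yield();
	atomic_store(&t->finished, true);
}

/* Models tasklet_disable_nosync(): block future runs, do not wait. */
static void model_disable_nosync(struct model_tasklet *t)
{
	atomic_fetch_add(&t->count, 1);
}

/* Models tasklet_disable(): block future runs, then wait out a current run. */
static void model_disable(struct model_tasklet *t)
{
	model_disable_nosync(t);
	while (atomic_load(&t->running))
		sched_yield();
}

/* Models the softirq side: claim the tasklet, run it only if enabled. */
static bool model_run(struct model_tasklet *t)
{
	bool expected = false;
	bool ran = false;

	if (!atomic_compare_exchange_strong(&t->running, &expected, true))
		return false; /* already running on another thread */
	if (atomic_load(&t->count) == 0) {
		model_callback(t);
		ran = true;
	}
	atomic_store(&t->running, false);
	return ran;
}

static void *softirq_thread(void *arg)
{
	model_run(arg);
	return NULL;
}

/*
 * Disable the tasklet while its callback is in flight on another thread and
 * report whether the callback had finished by the time the disable returned.
 */
static bool finished_when_disable_returns(bool sync)
{
	struct model_tasklet t = {0};
	pthread_t softirq;
	bool finished;

	if (pthread_create(&softirq, NULL, softirq_thread, &t))
		abort();
	while (!atomic_load(&t.entered))
		sched_yield();

	if (sync) {
		atomic_store(&t.release, true); /* callback may finish; we wait */
		model_disable(&t);
	} else {
		model_disable_nosync(&t); /* returns with callback in flight */
	}
	finished = atomic_load(&t.finished);

	atomic_store(&t.release, true);
	pthread_join(softirq, NULL);
	return finished;
}
```

In this model finished_when_disable_returns(false) returns false: the
caller carries on while the callback is still running on the other
thread, which is the shape of the race in the trace. With sync set it
returns true, so a reset issued after that point cannot overlap with the
callback.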

Reviewed-by: Mika Kuoppala <mika.kuoppala at linux.intel.com>

> -Chris

