[Intel-gfx] [PATCH 2/2] drm/i915: Filter out spurious execlists context-switch interrupts
Chris Wilson
chris at chris-wilson.co.uk
Mon Oct 23 21:12:55 UTC 2017
Quoting Chris Wilson (2017-10-23 21:06:16)
> Back in commit a4b2b01523a8 ("drm/i915: Don't mark an execlists
> context-switch when idle") we noticed the presence of late
> context-switch interrupts. We were able to filter those out by looking
> at whether the ELSP remained active, but in commit beecec901790
> ("drm/i915/execlists: Preemption!") that became problematic as we now
> anticipate receiving a context-switch event for preemption while ELSP
> may be empty. To restore the spurious interrupt suppression, add a
> counter for the expected number of pending context-switches and skip if
> we do not need to handle this interrupt to make forward progress.
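For illustration only, the counter idea from the quoted commit message could be modelled in plain userspace C roughly as below. This is a minimal sketch, not the code from the patch: struct engine_model and the model_*() helpers are hypothetical names, and the real driver keeps this state in its execlists bookkeeping, not in a standalone struct.

```c
#include <stdbool.h>

/* Hypothetical, simplified model; these are NOT the i915 structures. */
struct engine_model {
	unsigned int pending_cs; /* context-switch events still expected */
	unsigned int handled;    /* interrupts that led to CSB processing */
	unsigned int spurious;   /* interrupts filtered out as unexpected */
};

/* A submission (or a preemption request) means one more event is due. */
static void model_submit(struct engine_model *e)
{
	e->pending_cs++;
}

/*
 * Interrupt path: only ask for the CSB to be processed if an event is
 * still outstanding, otherwise count the interrupt as spurious and skip.
 */
static bool model_cs_interrupt(struct engine_model *e)
{
	if (e->pending_cs == 0) {
		e->spurious++;
		return false;
	}

	e->handled++;
	return true;
}

/* Called once the expected context-switch event has been parsed. */
static void model_complete(struct engine_model *e)
{
	if (e->pending_cs)
		e->pending_cs--;
}
```

In this model an interrupt that arrives after the last expected event has been consumed is simply dropped, which is the behaviour the ELSP-active check used to provide before preemption made an empty ELSP legitimate.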
Looking at an example from
https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_1299/
the common case is that we still receive the interrupt after having
already parsed the whole CSB:
<6>[ 22.723238] i915 0000:00:02.0: [drm] vecs0
<6>[ 22.723246] i915 0000:00:02.0: [drm] current seqno 8, last 8, hangcheck 0 [-277277 ms], inflight 0
<6>[ 22.723260] i915 0000:00:02.0: [drm] Reset count: 0
<6>[ 22.723269] i915 0000:00:02.0: [drm] Requests:
<6>[ 22.723278] i915 0000:00:02.0: [drm] RING_START: 0x007fb000 [0x00000000]
<6>[ 22.723289] i915 0000:00:02.0: [drm] RING_HEAD: 0x00000278 [0x00000000]
<6>[ 22.723300] i915 0000:00:02.0: [drm] RING_TAIL: 0x00000278 [0x00000000]
<6>[ 22.723311] i915 0000:00:02.0: [drm] RING_CTL: 0x00003001 []
<6>[ 22.723322] i915 0000:00:02.0: [drm] ACTHD: 0x00000000_00000278
<6>[ 22.723333] i915 0000:00:02.0: [drm] BBADDR: 0x00000000_00000004
<6>[ 22.723343] i915 0000:00:02.0: [drm] Execlist status: 0x00000301 00000000
<6>[ 22.723355] i915 0000:00:02.0: [drm] Execlist CSB read 1 [1 cached], write 1 [1 from hws], interrupt posted? no
<6>[ 22.723370] i915 0000:00:02.0: [drm] ELSP[0] idle
<6>[ 22.723378] i915 0000:00:02.0: [drm] ELSP[1] idle
<6>[ 22.723387] i915 0000:00:02.0: [drm] HW active? 0x0
<6>[ 22.723402] i915 0000:00:02.0: [drm]
Those should not lead to hitting BUG_ON(gt.awake), though, as the tasklet
is flushed before we clear gt.awake. Unless, perhaps, the interrupt
arrives after the tasklet_kill...
Given that we wait for the engines to be idle before parking, we should
be safe enough with
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index bb0e85043e01..fa46137d431a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3327,6 +3327,8 @@ i915_gem_idle_work_handler(struct work_struct *work)
 	if (new_requests_since_last_retire(dev_priv))
 		goto out_unlock;
 
+	synchronize_irq(dev_priv->drm.irq);
+
 	/*
 	 * We are committed now to parking the engines, make sure there
 	 * will be no more interrupts arriving later.
to flush a pending irq and not worry about a multi-phase park.
-Chris