[Intel-gfx] [PATCH 2/2] drm/i915: Filter out spurious execlists context-switch interrupts

Chris Wilson chris at chris-wilson.co.uk
Mon Oct 23 21:12:55 UTC 2017


Quoting Chris Wilson (2017-10-23 21:06:16)
> Back in commit a4b2b01523a8 ("drm/i915: Don't mark an execlists
> context-switch when idle") we noticed the presence of late
> context-switch interrupts. We were able to filter those out by looking
> at whether the ELSP remained active, but in commit beecec901790
> ("drm/i915/execlists: Preemption!") that became problematic as we now
> anticipate receiving a context-switch event for preemption while ELSP
> may be empty. To restore the spurious interrupt suppression, add a
> counter for the expected number of pending context-switches and skip if
> we do not need to handle this interrupt to make forward progress.
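
As a rough illustration of the counter idea in the quoted commit message, here is a minimal userspace model. It is only a sketch, not the code from the patch: struct engine_model and the model_* helpers are hypothetical names invented for this example, and it ignores locking and the real CSB format.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-engine state; not the i915 structure. */
struct engine_model {
	unsigned int pending_cs; /* context-switch events still expected */
	unsigned int handled;    /* interrupts passed on for CSB processing */
	unsigned int filtered;   /* interrupts dropped as spurious */
};

/* Submission side: note how many context-switch events should follow. */
static void model_submit(struct engine_model *e, unsigned int events)
{
	e->pending_cs += events;
}

/*
 * Interrupt side: returns true if the (modelled) tasklet should run.
 * With nothing outstanding, the interrupt cannot be needed to make
 * forward progress, so it is counted as spurious and skipped.
 */
static bool model_cs_interrupt(struct engine_model *e)
{
	if (e->pending_cs == 0) {
		e->filtered++;
		return false;
	}
	e->handled++;
	return true;
}

/* Tasklet side: retire the events that were just read from the CSB. */
static void model_process_csb(struct engine_model *e, unsigned int events)
{
	assert(events <= e->pending_cs); /* never consume more than expected */
	e->pending_cs -= events;
}
```

In this model, an interrupt that arrives after the whole CSB has been consumed finds pending_cs at zero and is filtered, while a preemption event remains expected even when nothing is left in ELSP.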

Looking at an example from
https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_1299/
the common case is one where we still get the interrupt after having
already parsed the whole CSB:

<6>[   22.723238] i915 0000:00:02.0: [drm] vecs0
<6>[   22.723246] i915 0000:00:02.0: [drm] 	current seqno 8, last 8, hangcheck 0 [-277277 ms], inflight 0
<6>[   22.723260] i915 0000:00:02.0: [drm] 	Reset count: 0
<6>[   22.723269] i915 0000:00:02.0: [drm] 	Requests:
<6>[   22.723278] i915 0000:00:02.0: [drm] 	RING_START: 0x007fb000 [0x00000000]
<6>[   22.723289] i915 0000:00:02.0: [drm] 	RING_HEAD:  0x00000278 [0x00000000]
<6>[   22.723300] i915 0000:00:02.0: [drm] 	RING_TAIL:  0x00000278 [0x00000000]
<6>[   22.723311] i915 0000:00:02.0: [drm] 	RING_CTL:   0x00003001 []
<6>[   22.723322] i915 0000:00:02.0: [drm] 	ACTHD:  0x00000000_00000278
<6>[   22.723333] i915 0000:00:02.0: [drm] 	BBADDR: 0x00000000_00000004
<6>[   22.723343] i915 0000:00:02.0: [drm] 	Execlist status: 0x00000301 00000000
<6>[   22.723355] i915 0000:00:02.0: [drm] 	Execlist CSB read 1 [1 cached], write 1 [1 from hws], interrupt posted? no
<6>[   22.723370] i915 0000:00:02.0: [drm] 		ELSP[0] idle
<6>[   22.723378] i915 0000:00:02.0: [drm] 		ELSP[1] idle
<6>[   22.723387] i915 0000:00:02.0: [drm] 		HW active? 0x0
<6>[   22.723402] i915 0000:00:02.0: [drm] 


Those should not lead to hitting BUG_ON(gt.awake), though, as the tasklet
is flushed before we clear gt.awake. Unless, perhaps, the interrupt
arrives after the tasklet_kill...

Given that we wait for the engines to be idle before parking, we should
be safe enough with

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index bb0e85043e01..fa46137d431a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3327,6 +3327,8 @@ i915_gem_idle_work_handler(struct work_struct *work)
        if (new_requests_since_last_retire(dev_priv))
                goto out_unlock;
 
+       synchronize_irq(dev_priv->drm.irq);
+
        /*
         * We are committed now to parking the engines, make sure there
         * will be no more interrupts arriving later.

to flush a pending irq and not worry about a multi-phase park.
-Chris

