[Intel-gfx] [PATCH 2/2] drm/i915: Delay disabling the user interrupt for breadcrumbs

Chris Wilson chris at chris-wilson.co.uk
Wed Feb 22 17:50:32 UTC 2017


On Wed, Feb 22, 2017 at 05:26:34PM +0000, Tvrtko Ursulin wrote:
> 
> On 17/02/2017 15:51, Chris Wilson wrote:
> >A significant cost in setting up a wait is the overhead of enabling the
> >interrupt. As we disable the interrupt whenever the queue of waiters is
> >empty, if we are frequently waiting on alternating batches, we end up
> >re-enabling the interrupt on a frequent basis. We do want to disable the
> >interrupt during normal operations as under high load it may add several
> >thousand interrupts/s - we have been known in the past to occupy whole
> >cores with our interrupt handler after accidentally leaving user
> >interrupts enabled. As a compromise, leave the interrupt enabled until
> >the next IRQ, or the system is idle. This gives a small window for a
> >waiter to keep the interrupt active and not be delayed by having to
> >re-enable the interrupt.
> >
> >Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> >---
> > drivers/gpu/drm/i915/i915_gem.c          |  4 +-
> > drivers/gpu/drm/i915/i915_irq.c          |  5 +-
> > drivers/gpu/drm/i915/intel_breadcrumbs.c | 95 +++++++++++++++++++-------------
> > drivers/gpu/drm/i915/intel_ringbuffer.h  |  4 +-
> > 4 files changed, 65 insertions(+), 43 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >index 6745dcbf3799..9c87aacce43b 100644
> >--- a/drivers/gpu/drm/i915/i915_gem.c
> >+++ b/drivers/gpu/drm/i915/i915_gem.c
> >@@ -3003,8 +3003,10 @@ i915_gem_idle_work_handler(struct work_struct *work)
> > 	if (wait_for(intel_execlists_idle(dev_priv), 10))
> > 		DRM_ERROR("Timeout waiting for engines to idle\n");
> >
> >-	for_each_engine(engine, dev_priv, id)
> >+	for_each_engine(engine, dev_priv, id) {
> >+		intel_engine_disarm_breadcrumbs(engine);
> > 		i915_gem_batch_pool_fini(&engine->batch_pool);
> >+	}
> >
> > 	GEM_BUG_ON(!dev_priv->gt.awake);
> > 	dev_priv->gt.awake = false;
> >diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> >index 0c370c687c2a..fa597a29bc1d 100644
> >--- a/drivers/gpu/drm/i915/i915_irq.c
> >+++ b/drivers/gpu/drm/i915/i915_irq.c
> >@@ -1036,9 +1036,6 @@ static void notify_ring(struct intel_engine_cs *engine)
> > 	struct drm_i915_gem_request *rq = NULL;
> > 	struct intel_wait *wait;
> >
> >-	if (!intel_engine_has_waiter(engine))
> >-		return;
> >-
> > 	trace_i915_gem_request_notify(engine);
> > 	atomic_inc(&engine->irq_count);
> > 	set_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted);
> >@@ -1064,6 +1061,8 @@ static void notify_ring(struct intel_engine_cs *engine)
> > 			rq = wait->request;
> >
> > 		wake_up_process(wait->tsk);
> >+	} else {
> >+		__intel_engine_disarm_breadcrumbs(engine);
> 
> So we disarm on the irq following the waiter exiting. But the
> overall effect depends on the timing of the wake-up thread getting
> scheduled, combined with batch duration, which I am not sure I
> quite like.

Correct, and the overhead of each interrupt also depends on the
frequency of batches. Considering this should scale to 2M requests/s,
using a timer is not a good heuristic either. (And don't treat 2M as an
upper bound :)
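
To make the intended flow concrete, here is a rough userspace model of
the policy (illustration only; the struct and function names are made
up, not the driver's). The waiter arms the interrupt, nothing disarms
it when the waiter leaves, and it is only dropped by the next interrupt
that finds nobody waiting, or when the engine parks:

#include <stdbool.h>
#include <stdio.h>

struct breadcrumbs_model {
	bool irq_armed;	/* user interrupt requested by breadcrumbs */
	int waiters;	/* threads currently waiting on a seqno */
};

static void on_wait(struct breadcrumbs_model *b)
{
	b->waiters++;
	b->irq_armed = true;		/* enable the user interrupt */
}

static void on_wait_complete(struct breadcrumbs_model *b)
{
	b->waiters--;			/* deliberately no disarm here */
}

static void on_user_interrupt(struct breadcrumbs_model *b)
{
	if (b->waiters)
		return;			/* wake a waiter, keep the irq */
	b->irq_armed = false;		/* nobody waiting, drop the irq */
}

static void on_idle(struct breadcrumbs_model *b)
{
	b->irq_armed = false;		/* backstop: parking disarms */
}

int main(void)
{
	struct breadcrumbs_model b = { false, 0 };

	on_wait(&b);			/* waiter arrives, irq armed */
	on_user_interrupt(&b);		/* irq fires, waiter still queued */
	on_wait_complete(&b);		/* waiter leaves, irq stays armed */
	on_user_interrupt(&b);		/* next irq finds no waiter, disarms */
	printf("armed: %d\n", b.irq_armed);	/* prints 0 */
	return 0;
}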
 
> >+void __intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine)
> >+{
> >+	struct intel_breadcrumbs *b = &engine->breadcrumbs;
> >+
> >+	assert_spin_locked(&b->lock);
> >+
> >+	if (b->irq_enabled) {
> >+		irq_disable(engine);
> >+		b->irq_enabled = false;
> >+	}
> >+
> >+	b->irq_armed = false;
> 
> Do we need the two-level, armed & enabled scheme? I mean, why not
> just have irq_enabled and disable it from the timer callback? It is
> a bit challenging to figure out how the two-level thing works.

The two-level scheme remains because of the fault injection. The
current timer callback is far too slow imo; it would have the effect of
always keeping the irq alive, and that's not nice for older gen. I'm
continually surprised by how well bdw/skl cope with the excess of
interrupts from execlists.
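
For reference, this is roughly how I think of the two flags (a sketch
only, not the driver source, with abbreviated names): irq_armed means
breadcrumbs is relying on some completion signal at all, be that the
real interrupt or the fake-irq timer, while irq_enabled means the
hardware user interrupt is actually unmasked. With the missed-irq fault
injection the timer stands in for the hardware interrupt, so we can be
armed without being enabled; disarming tears down both.

#include <stdbool.h>

struct breadcrumbs_flags {
	bool irq_armed;		/* breadcrumbs wants a completion signal */
	bool irq_enabled;	/* the hw user interrupt is unmasked */
};

static void arm(struct breadcrumbs_flags *b, bool fault_injection)
{
	b->irq_armed = true;
	if (!fault_injection)
		b->irq_enabled = true;	/* irq_enable(engine) in the driver */
	/* under fault injection the fake-irq timer does the signalling */
}

static void disarm(struct breadcrumbs_flags *b)
{
	if (b->irq_enabled) {
		b->irq_enabled = false;	/* irq_disable(engine) in the driver */
	}
	b->irq_armed = false;	/* and the fake-irq timer is cancelled */
}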
 
> Would a simpler scheme work where we would just bump the irq_enabled
> counter to N on every waiter entering, decrement by one in each user
> interrupt, and finally turn off on reaching zero? Maybe it would be
> too costly in terms of useless wake-ups. Hm, not sure.

Set N to 1... :) (i.e. keep the irq alive for just the one interrupt
after the last waiter, which is more or less what this patch already
does without the explicit counter.)

> >@@ -707,14 +724,16 @@ void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine)
> > 	cancel_fake_irq(engine);
> > 	spin_lock_irq(&b->lock);
> >
> >-	__intel_breadcrumbs_disable_irq(b);
> >+	if (b->irq_enabled)
> >+		irq_enable(engine);
> 
> Can't we hit that GEM_BUG_ON(b->irq_enabled) in
> __intel_breadcrumbs_enable_irq from here?

We bypass __intel_breadcrumbs_enable_irq as the irq is already armed.
Sorry, I'm missing the point you want to raise.

> >+	else
> >+		irq_disable(engine);
> >+
> > 	if (intel_engine_has_waiter(engine)) {
> >-		__intel_breadcrumbs_enable_irq(b);
> > 		if (test_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted))
> > 			wake_up_process(b->first_wait->tsk);
> >-	} else {
> >+		mod_timer(&b->hangcheck, wait_timeout());
> 
> If there are no waiters, why schedule the timer? It won't do anything
> anyway AFAICS.

It's in the if (intel_engine_has_waiter) { } block. Cruel diff.

> I need to give this some more thought tomorrow.

The principal user in my head will be the guc scheduler. Clients that
throttle themselves (such as mesa) won't see the irq kept active
between frames.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

