[Intel-gfx] [PATCH 14/21] drm/i915: Only apply one barrier after a breadcrumb interrupt is posted

Wed Jun 8 09:35:00 UTC 2016

On Mon, Jun 06, 2016 at 04:34:27PM +0100, Tvrtko Ursulin wrote:
> 
> On 03/06/16 17:08, Chris Wilson wrote:
> >If we flag the seqno as potentially stale upon receiving an interrupt,
> >we can use that information to reduce the frequency that we apply the
> >heavyweight coherent seqno read (i.e. if we wake up a chain of waiters).
> >
> >Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >---
> >  drivers/gpu/drm/i915/i915_drv.h          | 15 ++++++++++++++-
> >  drivers/gpu/drm/i915/i915_irq.c          |  1 +
> >  drivers/gpu/drm/i915/intel_breadcrumbs.c | 16 ++++++++++------
> >  drivers/gpu/drm/i915/intel_ringbuffer.h  |  1 +
> >  4 files changed, 26 insertions(+), 7 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >index 4ddb9ff319cb..a71d08199d57 100644
> >--- a/drivers/gpu/drm/i915/i915_drv.h
> >+++ b/drivers/gpu/drm/i915/i915_drv.h
> >@@ -3935,7 +3935,20 @@ static inline bool __i915_request_irq_complete(struct drm_i915_gem_request *req)
> >  	 * but it is easier and safer to do it every time the waiter
> >  	 * is woken.
> >  	 */
> >-	if (engine->irq_seqno_barrier) {
> >+	if (engine->irq_seqno_barrier && READ_ONCE(engine->irq_posted)) {
> >+		/* The ordering of irq_posted versus applying the barrier
> >+		 * is crucial. The clearing of the current irq_posted must
> >+		 * be visible before we perform the barrier operation,
> >+		 * such that if a subsequent interrupt arrives, irq_posted
> >+		 * is reasserted and our task rewoken (which causes us to
> >+		 * do another __i915_request_irq_complete() immediately
> >+		 * and reapply the barrier). Conversely, if the clear
> >+		 * occurs after the barrier, then an interrupt that arrived
> >+		 * whilst we waited on the barrier would not trigger a
> >+		 * barrier on the next pass, and the read may not see the
> >+		 * seqno update.
> >+		 */
> >+		WRITE_ONCE(engine->irq_posted, false);
> 
> Why is this not smp_store_mb ?

We only require the ordering wrt to irq_seqno_barrier().

How about:

if (engine->irq_seqno_barrier &&
    cmpxchg_relaxed(&engine->irq_post, 1, 0)) {

Less shouty?

> >diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> >index 44346de39794..0f5fe114c204 100644
> >--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> >+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> >@@ -43,12 +43,18 @@ static void intel_breadcrumbs_fake_irq(unsigned long data)
> >
> >  static void irq_enable(struct intel_engine_cs *engine)
> >  {
> >+	/* Enabling the IRQ may miss the generation of the interrupt, but
> >+	 * we still need to force the barrier before reading the seqno,
> >+	 * just in case.
> >+	 */
> >+	engine->irq_posted = true;
> 
> Should it be smp_store_mb here as well?

No, this is written/read on the same callchain.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre