[Intel-gfx] [PATCH v5 4/4] drm/i915: Delay disabling the user interrupt for breadcrumbs

Chris Wilson chris at chris-wilson.co.uk
Mon Feb 27 14:13:37 UTC 2017


On Mon, Feb 27, 2017 at 02:06:58PM +0000, Chris Wilson wrote:
> On Mon, Feb 27, 2017 at 01:57:35PM +0000, Tvrtko Ursulin wrote:
> > 
> > On 27/02/2017 13:24, Chris Wilson wrote:
> > > 	if (b->hangcheck_interrupts != atomic_read(&engine->irq_count)) {
> > >@@ -67,7 +76,7 @@ static void intel_breadcrumbs_hangcheck(unsigned long data)
> > > 	 * to process the pending interrupt (e.g, low priority task on a loaded
> > > 	 * system) and wait until it sleeps before declaring a missed interrupt.
> > > 	 */
> > >-	if (intel_engine_wakeup(engine) & ENGINE_WAKEUP_ACTIVE) {
> > >+	if (!(intel_engine_wakeup(engine) & ENGINE_WAKEUP_ASLEEP)) {
> > 
> > I did not get the explanation from the previous round on why you had
> > to reverse the active to asleep. Here it even looks wrong now,
> > because I thought you don't want to re-queue the hangcheck when
> > there are no waiters?
> 
> No waiters: result = 0
> Running waiter: result = WAKEUP_WAITER
> Sleeping waiter: result = WAKEUP_WAITER | WAKEUP_ASLEEP
> 
> We only want to declare a mised irq if we wake up a sleeping waiter, and
> keep the hangcheck timer running until the device is idle (when the irq
> is disarmed).
> 
> How about:
> 
> if (intel_engine_wakeup(engine) & ENGINE_WAKEUP_ASLEEP) {
> 	set_bit(missed_irq);
> 	mod_timer(&b->fake_irq, jiffies + 1);
> } else {
> 	mod_timer(b->hangcheck, wait_timeout);
> }

diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index ebb7bc0be9fb..004eb4c0c531 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -72,18 +72,25 @@ static void intel_breadcrumbs_hangcheck(unsigned long data)
                return;
        }
 
-       /* If the waiter was currently running, assume it hasn't had a chance
+       /* We keep the hangcheck time alive until we disarm the irq, even
+        * if there are no waiters at present.
+        *
+        * If the waiter was currently running, assume it hasn't had a chance
         * to process the pending interrupt (e.g, low priority task on a loaded
         * system) and wait until it sleeps before declaring a missed interrupt.
+        *
+        * If the waiter was asleep (and not even pending a wakeup), then we
+        * must have missed an interrupt as the GPU has stopped advancing
+        * but we still have a waiter. Assuming all batches complete within
+        * DRM_I915_HANGCHECK_JIFFIES [1.5s]!
         */
-       if (!(intel_engine_wakeup(engine) & ENGINE_WAKEUP_ASLEEP)) {
+       if (intel_engine_wakeup(engine) & ENGINE_WAKEUP_ASLEEP) {
+               DRM_DEBUG("Hangcheck timer elapsed... %s idle\n", engine->name);
+               set_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings);
+               mod_timer(&engine->breadcrumbs.fake_irq, jiffies + 1);
+       } else {
                mod_timer(&b->hangcheck, wait_timeout());
-               return;
        }
-
-       DRM_DEBUG("Hangcheck timer elapsed... %s idle\n", engine->name);
-       set_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings);
-       mod_timer(&engine->breadcrumbs.fake_irq, jiffies + 1);
 }
 
 static void intel_breadcrumbs_fake_irq(unsigned long data)
@@ -167,6 +174,10 @@ void intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine)
 
        spin_lock_irqsave(&b->lock, flags);
 
+       /* We only disarm the irq when we are idle (all requests completed),
+        * so if there remains a sleeping waiter, it missed the request
+        * completion.
+        */
        if (__intel_engine_wakeup(engine) & ENGINE_WAKEUP_ASLEEP)
                set_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings);



-- 
Chris Wilson, Intel Open Source Technology Centre


More information about the Intel-gfx mailing list