[Intel-gfx] [PATCH v5 4/4] drm/i915: Delay disabling the user interrupt for breadcrumbs
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Mon Feb 27 19:05:59 UTC 2017
On 27/02/2017 14:13, Chris Wilson wrote:
> On Mon, Feb 27, 2017 at 02:06:58PM +0000, Chris Wilson wrote:
>> On Mon, Feb 27, 2017 at 01:57:35PM +0000, Tvrtko Ursulin wrote:
>>>
>>> On 27/02/2017 13:24, Chris Wilson wrote:
>>>> if (b->hangcheck_interrupts != atomic_read(&engine->irq_count)) {
>>>> @@ -67,7 +76,7 @@ static void intel_breadcrumbs_hangcheck(unsigned long data)
>>>> * to process the pending interrupt (e.g, low priority task on a loaded
>>>> * system) and wait until it sleeps before declaring a missed interrupt.
>>>> */
>>>> - if (intel_engine_wakeup(engine) & ENGINE_WAKEUP_ACTIVE) {
>>>> + if (!(intel_engine_wakeup(engine) & ENGINE_WAKEUP_ASLEEP)) {
>>>
>>> I did not get the explanation from the previous round on why you had
>>> to reverse the active to asleep. Here it even looks wrong now,
>>> because I thought you don't want to re-queue the hangcheck when
>>> there are no waiters?
>>
>> No waiters: result = 0
>> Running waiter: result = WAKEUP_WAITER
>> Sleeping waiter: result = WAKEUP_WAITER | WAKEUP_ASLEEP
>>
>> We only want to declare a mised irq if we wake up a sleeping waiter, and
>> keep the hangcheck timer running until the device is idle (when the irq
>> is disarmed).
>>
>> How about:
>>
>> if (intel_engine_wakeup(engine) & ENGINE_WAKEUP_ASLEEP) {
>> set_bit(missed_irq);
>> mod_timer(&b->fake_irq, jiffies + 1);
>> } else {
>> mod_timer(b->hangcheck, wait_timeout);
>> }
>
> diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> index ebb7bc0be9fb..004eb4c0c531 100644
> --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> @@ -72,18 +72,25 @@ static void intel_breadcrumbs_hangcheck(unsigned long data)
> return;
> }
>
> - /* If the waiter was currently running, assume it hasn't had a chance
> + /* We keep the hangcheck time alive until we disarm the irq, even
> + * if there are no waiters at present.
> + *
> + * If the waiter was currently running, assume it hasn't had a chance
> * to process the pending interrupt (e.g, low priority task on a loaded
> * system) and wait until it sleeps before declaring a missed interrupt.
> + *
> + * If the waiter was asleep (and not even pending a wakeup), then we
> + * must have missed an interrupt as the GPU has stopped advancing
> + * but we still have a waiter. Assuming all batches complete within
> + * DRM_I915_HANGCHECK_JIFFIES [1.5s]!
> */
> - if (!(intel_engine_wakeup(engine) & ENGINE_WAKEUP_ASLEEP)) {
> + if (intel_engine_wakeup(engine) & ENGINE_WAKEUP_ASLEEP) {
> + DRM_DEBUG("Hangcheck timer elapsed... %s idle\n", engine->name);
> + set_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings);
> + mod_timer(&engine->breadcrumbs.fake_irq, jiffies + 1);
> + } else {
> mod_timer(&b->hangcheck, wait_timeout());
> - return;
> }
> -
> - DRM_DEBUG("Hangcheck timer elapsed... %s idle\n", engine->name);
> - set_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings);
> - mod_timer(&engine->breadcrumbs.fake_irq, jiffies + 1);
> }
>
> static void intel_breadcrumbs_fake_irq(unsigned long data)
> @@ -167,6 +174,10 @@ void intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine)
>
> spin_lock_irqsave(&b->lock, flags);
>
> + /* We only disarm the irq when we are idle (all requests completed),
> + * so if there remains a sleeping waiter, it missed the request
> + * completion.
> + */
> if (__intel_engine_wakeup(engine) & ENGINE_WAKEUP_ASLEEP)
> set_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings);
>
>
>
Looks okay after some re-re-re-reading. I guess shuffling things back
and forth in this series, not long after the area was recently changed
as well, was a bit challenging.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
Regards,
Tvrtko
More information about the Intel-gfx
mailing list