[Intel-gfx] [PATCH v2 1/3] drm/i915: Drop racy markup of missed-irqs from idle-worker
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Fri Jul 22 10:10:28 UTC 2016
On 21/07/16 12:04, Chris Wilson wrote:
> On Thu, Jul 21, 2016 at 11:28:02AM +0100, Tvrtko Ursulin wrote:
>> On 21/07/16 11:10, Chris Wilson wrote:
>>> On Thu, Jul 21, 2016 at 10:58:05AM +0100, Tvrtko Ursulin wrote:
>>>>
>>>> On 21/07/16 07:57, Chris Wilson wrote:
>>>>> During the idle-worker we disable the hangcheck and so kick any waiters
>>>>> that should have been completed (since the GPU is now idle). Unlike the
>>>>> hangcheck, we do not take any care to avoid the race between the irq
>>>>> handler and ourselves, and so it is possible for us to declare a missed
>>>>> interrupt even as the bottom-half is being scheduled to run. Let's
>>>>> ignore this race to stop a potential false-positive error.
>>>>
>>>> If the bottom half is scheduled to run then then..
>>>>
>>>>> References: https://bugs.freedesktop.org/show_bug.cgi?id=96974
>>>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>>>> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
>>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
>>>>> ---
>>>>> drivers/gpu/drm/i915/i915_gem.c | 7 +++----
>>>>> 1 file changed, 3 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>>>>> index 40047eb48826..9e826585edb2 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>>>> @@ -2706,10 +2706,9 @@ i915_gem_idle_work_handler(struct work_struct *work)
>>>>> rearm_hangcheck = false;
>>>>>
>>>>> stuck_engines = intel_kick_waiters(dev_priv);
>>>>
>>>> ... this will not return a stucked engine since the there is a bh
>>>> task assigned all until the bh exits.
>>>
>>> It reports if it wakes up a waiter on any engine. If the bh is already
>>> running, we cannot know if it has missed the seqno update. If it isn't
>>> running yet, we cannot know if it is about to be run.
>>
>> Oh I read the logic as completely opposite than what it is.
>>
>> Since the idle worker runs 100ms after last retirement, that would
>> mean a really slow waiter or what?
>
> It is dubious. But the idle worker runs 100ms after the first time we
> detect all engines are idle and may be running as we detect all engines
> are idle again. The only thing for sure is that in some cases that bdw-u
> is reaching the idle-worker with an unwoken engine (and that there is
> a race here in declaring it as a missed interrupt). I wasn't that
> concerned about the race because of that 100ms delay where eveything
> should have been idle, but on reflection that 100ms is not guarranteed.
Would canceling the idle worker be to expensive?
Either way, looks OK to me.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
Regards,
Tvrtko
More information about the Intel-gfx
mailing list