[Intel-gfx] [PATCH 03/19] drm/i915/gt: Close race between engine_park and intel_gt_retire_requests

Wed Nov 20 11:51:35 UTC 2019

Quoting Tvrtko Ursulin (2019-11-20 11:39:00)
> 
> On 19/11/2019 14:41, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-11-19 14:15:49)
> >>
> >> On 18/11/2019 23:02, Chris Wilson wrote:
> >>> The general concept was that intel_timeline.active_count was locked by
> >>> the intel_timeline.mutex. The exception was for power management, where
> >>> the engine->kernel_context->timeline could be manipulated under the
> >>> global wakeref.mutex.
> >>>
> >>> This was quite solid, as we always manipulated the timeline only while
> >>> we held an engine wakeref.
> >>>
> >>> And then we started retiring requests outside of struct_mutex, only
> >>> using the timelines.active_list and the timeline->mutex. There we
> >>> started manipulating intel_timeline.active_count outside of an engine
> >>> wakeref, and so introduced a race between __engine_park() and
> >>> intel_gt_retire_requests(), a race that could result in the
> >>> engine->kernel_context not being added to the active timelines and so
> >>> losing requests, which caused us to keep the system permanently powered
> >>> up [and unloadable].
> >>>
> >>> The race would be easy to close if we could take the engine wakeref for
> >>> the timeline before we retire -- except timelines are not bound to any
> >>> engine and so we would need to keep all active engines awake. The
> >>> alternative is to guard intel_timeline_enter/intel_timeline_exit for use
> >>> outside of the timeline->mutex.
> >>>
> >>> Fixes: e5dadff4b093 ("drm/i915: Protect request retirement with timeline->mutex")
> >>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >>> Cc: Matthew Auld <matthew.auld at intel.com>
> >>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >>> ---
> >>>    drivers/gpu/drm/i915/gt/intel_gt_requests.c   |  8 ++---
> >>>    drivers/gpu/drm/i915/gt/intel_timeline.c      | 34 +++++++++++++++----
> >>>    .../gpu/drm/i915/gt/intel_timeline_types.h    |  2 +-
> >>>    3 files changed, 32 insertions(+), 12 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> >>> index a79e6efb31a2..7559d6373f49 100644
> >>> --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> >>> +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> >>> @@ -49,8 +49,8 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
> >>>                        continue;
> >>>    
> >>>                intel_timeline_get(tl);
> >>> -             GEM_BUG_ON(!tl->active_count);
> >>> -             tl->active_count++; /* pin the list element */
> >>> +             GEM_BUG_ON(!atomic_read(&tl->active_count));
> >>> +             atomic_inc(&tl->active_count); /* pin the list element */
> >>>                spin_unlock_irqrestore(&timelines->lock, flags);
> >>>    
> >>>                if (timeout > 0) {
> >>> @@ -71,14 +71,14 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
> >>>    
> >>>                /* Resume iteration after dropping lock */
> >>>                list_safe_reset_next(tl, tn, link);
> >>> -             if (!--tl->active_count)
> >>> +             if (atomic_dec_and_test(&tl->active_count))
> >>>                        list_del(&tl->link);
> >>>    
> >>>                mutex_unlock(&tl->mutex);
> >>>    
> >>>                /* Defer the final release to after the spinlock */
> >>>                if (refcount_dec_and_test(&tl->kref.refcount)) {
> >>> -                     GEM_BUG_ON(tl->active_count);
> >>> +                     GEM_BUG_ON(atomic_read(&tl->active_count));
> >>>                        list_add(&tl->link, &free);
> >>>                }
> >>>        }
> >>> diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
> >>> index 16a9e88d93de..4f914f0d5eab 100644
> >>> --- a/drivers/gpu/drm/i915/gt/intel_timeline.c
> >>> +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
> >>> @@ -334,15 +334,33 @@ void intel_timeline_enter(struct intel_timeline *tl)
> >>>        struct intel_gt_timelines *timelines = &tl->gt->timelines;
> >>>        unsigned long flags;
> >>>    
> >>> +     /*
> >>> +      * Pretend we are serialised by the timeline->mutex.
> >>> +      *
> >>> +      * While generally true, there are a few exceptions to the rule
> >>> +      * for the engine->kernel_context being used to manage power
> >>> +      * transitions. As the engine_park may be called from under any
> >>> +      * timeline, it uses the power mutex as a global serialisation
> >>> +      * lock to prevent any other request entering its timeline.
> >>> +      *
> >>> +      * The rule is generally tl->mutex, otherwise engine->wakeref.mutex.
> >>> +      *
> >>> +      * However, intel_gt_retire_request() does not know which engine
> >>> +      * it is retiring along and so cannot partake in the engine-pm
> >>> +      * barrier, and there we use the tl->active_count as a means to
> >>> +      * pin the timeline in the active_list while the locks are dropped.
> >>> +      * Ergo, as that is outside of the engine-pm barrier, we need to
> >>> +      * use atomic to manipulate tl->active_count.
> >>> +      */
> >>>        lockdep_assert_held(&tl->mutex);
> >>> -
> >>>        GEM_BUG_ON(!atomic_read(&tl->pin_count));
> >>> -     if (tl->active_count++)
> >>> +
> >>> +     if (atomic_add_unless(&tl->active_count, 1, 0))
> >>>                return;
> >>> -     GEM_BUG_ON(!tl->active_count); /* overflow? */
> >>>    
> >>>        spin_lock_irqsave(&timelines->lock, flags);
> >>> -     list_add(&tl->link, &timelines->active_list);
> >>> +     if (!atomic_fetch_inc(&tl->active_count))
> >>> +             list_add(&tl->link, &timelines->active_list);
> >>
> >> So retirement raced with this and has elevated the active_count? But
> >> retirement does not add the timeline to the list, so we exit here
> >> without it on the active_list.
> > 
> > Retirement only sees an element on the active_list. What we observed in
> > practice was the inc/dec on tl->active_count racing, causing
> > indeterminate results, with the result that we removed the element from
> > the active_list while it had a raised tl->active_count (due to the
> > inflight posting from the other CPU).
> > 
> > Thus we kept requests inflight and the engine awake with no way to clear
> > them. This most obvious triggered GEM_BUG_ON(gt->awake) during suspend,
> > and is also responsible for the timeouts on gem_quiescent_gpu() or
> > igt_drop_caches_set(DROP_IDLE).
> 
> I understand (I think) the race where retirement races with parking. But 
> this side of things (intel_timeline_enter) does not seem to be involved 
> in that. intel_timeline_enter is the only place which puts the timeline 
> onto the active_list. How can two of them race?

There should not be concurrent intel_timeline_enter(), or concurrent
intel_timeline_exit()/intel_timeline_enter(), as they are serialised by a
mixture of engine-pm or timeline->mutex. The complexity is from
intel_gt_retire_requests which does not have the engine wakeref and so
we have a potential data race between parking and retiring.

> And also, what remains to be purpose of timelines->lock?

It protects the list, which is how we find the timelines to retire
(before we can take the timeline->mutex).
-Chris