[Intel-gfx] [PATCH 04/22] drm/i915: Guard unpark/park with the gt.active_mutex

Mon Apr 1 16:38:41 UTC 2019

Quoting Tvrtko Ursulin (2019-04-01 16:54:53)
> 
> On 01/04/2019 16:37, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-04-01 16:22:55)
> >>
> >> On 25/03/2019 09:03, Chris Wilson wrote:
> >>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >>> index 11803d485275..7c7afe99986c 100644
> >>> --- a/drivers/gpu/drm/i915/i915_drv.h
> >>> +++ b/drivers/gpu/drm/i915/i915_drv.h
> >>> @@ -2008,7 +2008,8 @@ struct drm_i915_private {
> >>>                intel_engine_mask_t active_engines;
> >>>                struct list_head active_rings;
> >>>                struct list_head closed_vma;
> >>> -             u32 active_requests;
> >>> +             atomic_t active_requests;
> >>> +             struct mutex active_mutex;
> >>
> >> My initial reaction is why not gem_pm_mutex to match where code was
> >> moved and what the commit message says?
> > 
> > Because we are inheriting the name from active_requests. If we renane
> > that to active_count, and maybe pm_wake_count?
> > 
> > pm_active_count?
> 
> pm_wake_count sounds good from the PM angle. But it's then wrong for all 
> places which query active requests. Don't know.
> 
> > 
> >>> +     mutex_lock(&i915->gt.active_mutex);
> >>>        if (!work_pending(&i915->gt.idle_work.work) &&
> >>> -         !i915->gt.active_requests) {
> >>> -             ++i915->gt.active_requests; /* don't requeue idle */
> >>> +         !atomic_read(&i915->gt.active_requests)) {
> >>> +             atomic_inc(&i915->gt.active_requests); /* don't requeue idle */
> >>
> >> atomic_inc_not_zero?
> > 
> > atomicity of the read & inc are not strictly required, once
> > active_requests is zero it cannot be raised without holding
> > active_mutex.
> 
> Yeah my bad. atomic_inc_not_zero would be wrong, it is the opposite. You 
> would need atomic_inc_if_zero here.
> 
> > 
> >>> @@ -191,11 +183,29 @@ void i915_gem_unpark(struct drm_i915_private *i915)
> >>>                           round_jiffies_up_relative(HZ));
> >>>    }
> >>>    
> >>> +void i915_gem_unpark(struct drm_i915_private *i915)
> >>> +{
> >>> +     if (atomic_add_unless(&i915->gt.active_requests, 1, 0))
> >>> +             return;
> >>
> >> This looks wrong - how can it be okay to maybe not increment
> >> active_requests on unpark? What am I missing?
> > 
> > If the add succeeds, active_requests was non-zero, and we can skip waking
> > up the device. If the add fails, active_requests might be zero, so we
> > take the mutex and check.
> 
> True true.. so this one is atomic_inc_not_zero! :)

I had forgotten about that one!

> >> I would expect here you would need
> >> "atomic_inc_and_return_true_if_OLD_value_was_zero" but I don't think
> >> there is such API.
> >>
> >>> +
> >>> +     mutex_lock(&i915->gt.active_mutex);
> >>> +     if (!atomic_read(&i915->gt.active_requests)) {
> >>> +             GEM_TRACE("\n");
> >>> +             __i915_gem_unpark(i915);
> >>> +             smp_mb__before_atomic();
> >>
> >> Why is this needed? I have no idea.. but I think we want comments with
> >> all memory barriers.
> > 
> > Because atomic_inc() is not regarded as a strong barrier, so we turn it
> > into one. (In documentation at least, on x86 it's just a compiler
> > barrier().)
> 
> Document why we need it please.

Imagine if there was an atomic_inc_unlock() or atomic_inc_release(). I
suppose atomic_add_return_release(), with the caveat that we only need
the release if we actually did the deed.

> >>> +     }
> >>> +     atomic_inc(&i915->gt.active_requests);
> >>> +     mutex_unlock(&i915->gt.active_mutex);
> >>> +}
> >>> +
> >>>    bool i915_gem_load_power_context(struct drm_i915_private *i915)
> >>>    {
> >>> +     mutex_lock(&i915->gt.active_mutex);
> >>> +     atomic_inc(&i915->gt.active_requests);
> >>
> >> Why this function has to manually manage active_requests? Can it be
> >> written in a simpler way?
> > 
> > It's so that this function has control over parking.
> > 
> > This is the first request, this is the place where we will make the
> > explicit calls to set up the powersaving contexts and state -- currently,
> > it's implicit on the first request.
> 
> Why does it need control over parking? Does it need a comment perhaps? :)

There is a comment! See "Immediately park the GPU..."

> >>>        /* Force loading the kernel context on all engines */
> >>>        if (!switch_to_kernel_context_sync(i915, ALL_ENGINES))
> >>> -             return false;
> >>> +             goto err_active;
> >>>    
> >>>        /*
> >>>         * Immediately park the GPU so that we enable powersaving and
-Chris