[Intel-gfx] [PATCH 3/3] drm/i915: Defer final intel_wakeref_put to process context
Mika Kuoppala
mika.kuoppala at linux.intel.com
Thu Aug 8 13:49:40 UTC 2019
Chris Wilson <chris at chris-wilson.co.uk> writes:
> Quoting Mika Kuoppala (2019-08-07 16:04:56)
>> Chris Wilson <chris at chris-wilson.co.uk> writes:
>> > -static int intel_gt_park(struct intel_wakeref *wf)
>> > +static int __gt_park(struct intel_wakeref *wf)
>> > {
>> > struct drm_i915_private *i915 =
>> > container_of(wf, typeof(*i915), gt.wakeref);
>> > @@ -74,22 +67,25 @@ static int intel_gt_park(struct intel_wakeref *wf)
>> > if (INTEL_GEN(i915) >= 6)
>> > gen6_rps_idle(i915);
>> >
>> > + /* Everything switched off, flush any residual interrupt just in case */
>> > + intel_synchronize_irq(i915);
>>
>> Do we detect it gracefully if we get one extra afer this flush?
>
> If we make a mistake, we get unclaimed mmio from the interrupt handler.
>
> Given the structure of engine_pm/gt_pm, we should not be generating
> either user interrupts nor CS interrupts by this point. The potential is
> for the guc/pm interrupts, but I felt it was more prudent to document
> this is the final point for GT interrupts of any type.
>
>> > + GEM_BUG_ON(intel_gt_pm_is_awake(gt));
>> > + for_each_engine(engine, gt->i915, id) {
>> > + const typeof(*igt_atomic_phases) *p;
>> > +
>> > + for (p = igt_atomic_phases; p->name; p++) {
>> > + /*
>> > + * Acquisition is always synchronous, except if we
>> > + * know that the engine is already awale, in which
>>
>> s/awale/awake
>
> a whale
>
>> in which case?
>
> Avast ye blows!
> Moby Dick!
>
>> > static long
>> > wait_for_timelines(struct drm_i915_private *i915,
>> > unsigned int flags, long timeout)
>> > @@ -954,27 +941,20 @@ int i915_gem_wait_for_idle(struct drm_i915_private *i915,
>> > unsigned int flags, long timeout)
>> > {
>> > /* If the device is asleep, we have no requests outstanding */
>> > - if (!READ_ONCE(i915->gt.awake))
>> > + if (!intel_gt_pm_is_awake(&i915->gt))
>> > return 0;
>> >
>> > - GEM_TRACE("flags=%x (%s), timeout=%ld%s, awake?=%s\n",
>> > + GEM_TRACE("flags=%x (%s), timeout=%ld%s\n",
>> > flags, flags & I915_WAIT_LOCKED ? "locked" : "unlocked",
>> > - timeout, timeout == MAX_SCHEDULE_TIMEOUT ? " (forever)" : "",
>> > - yesno(i915->gt.awake));
>> > + timeout, timeout == MAX_SCHEDULE_TIMEOUT ? " (forever)" : "");
>> >
>> > timeout = wait_for_timelines(i915, flags, timeout);
>> > if (timeout < 0)
>> > return timeout;
>> >
>> > if (flags & I915_WAIT_LOCKED) {
>> > - int err;
>> > -
>> > lockdep_assert_held(&i915->drm.struct_mutex);
>> >
>> > - err = wait_for_engines(&i915->gt);
>> > - if (err)
>> > - return err;
>> > -
>>
>> Hmm where does the implicit wait for idle come from now?
>
> Instead of having the wait here, we moved it into the engine/gt pm
> tracking and added an intel_gt_pm_wait_for_idle().
I see that we do wait on switching to kernel context. However
still failing to grasp the way we end up waiting on (implicit?)
releasing of the gt pm (and where)
-Mika
>
>> > -int __intel_wakeref_put_last(struct intel_runtime_pm *rpm,
>> > - struct intel_wakeref *wf,
>> > - int (*fn)(struct intel_wakeref *wf))
>> > +static void ____intel_wakeref_put_last(struct intel_wakeref *wf)
>> > {
>> > - int err;
>> > + if (!atomic_dec_and_test(&wf->count))
>> > + goto unlock;
>> > +
>> > + if (likely(!wf->ops->put(wf))) {
>> > + rpm_put(wf);
>> > + wake_up_var(&wf->wakeref);
>> > + } else {
>> > + /* ops->put() must schedule its own release on deferral */
>> > + atomic_set_release(&wf->count, 1);
>>
>> I lose track here. On async call to gt_park we end up leaving
>> with a count 1.
>
> So we know wf->count is 0 and that we hold the lock preventing anyone
> else raising wf->count back to 1. If we fail in our cleanup task,
> knowing that we have exclusive control over the counter, we simply mark
> it as 1 again (using _release to unlock anyone concurrently trying to
> acquire the wakeref for themselves).
>
>> > + /* Assume we are not in process context and so cannot sleep. */
>> > + if (wf->ops->flags & INTEL_WAKEREF_PUT_ASYNC ||
>> > + !mutex_trylock(&wf->mutex)) {
>>
>> Didn't found anything in trylock that would prevent you
>> from shooting you on your own leg.
>
> Yup. It's immorally equivalent to spin_trylock.
>
>> So it is up to caller (and eventually lockdep spam) if
>> the async usage fails to follow the rules?
>
> Not the caller, the fault lies in the wakeref. We either mark it that is
> has to use the worker (process context) or fix it such that it is quick
> and can run in atomic context. engine_pm was simple enough to make
> atomic, gt_pm is promising but I needed to make rps waitfree (done) and
> make the display pm truly async (which I baulked at!).
>
>>
>> > + schedule_work(&wf->work);
>> > + return;
>> > + }
>> > +
>> > + ____intel_wakeref_put_last(wf);
>> > }
>> >
>> > -void __intel_wakeref_init(struct intel_wakeref *wf, struct lock_class_key *key)
>> > +static void __intel_wakeref_put_work(struct work_struct *wrk)
>>
>> No need for underscore before the name here?
>
> I was just trying to keep it consistent with the 3 functions working
> together.
>
> __intel_wakeref_put_last
> __intel_wakeref_put_work
> ____intel_wakeref_put_last
>
> Now, could use
>
> __intel_wakeref_put_last
> __wakeref_put_work
> ____wakeref_put_last
>
> keeping the underscores to underline the games being played with locking
> are not for the faint of heart.
> -Chris
More information about the Intel-gfx
mailing list