[Intel-gfx] [PATCH] drm/i915/execlists: Move the assertion we have the rpm wakeref down
Chris Wilson
chris at chris-wilson.co.uk
Thu Jul 19 12:23:10 UTC 2018
Quoting Tvrtko Ursulin (2018-07-19 13:14:38)
>
> On 19/07/2018 12:59, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-07-19 12:49:13)
> >>
> >> On 19/07/2018 08:50, Chris Wilson wrote:
> >>> There's a race between idling the engine and finishing off the last
> >>> tasklet (as we may kick the tasklets after declaring an individual
> >>> engine idle). However, since we do not need to access the device until
> >>> we try to submit to the ELSP register (processing the CSB just requires
> >>> normal CPU access to the HWSP, and when idle we should not need to
> >>> submit!) we can defer the assertion unto that point. The assertion is
> >>> still useful as it does verify that we do hold the longterm GT wakeref
> >>> taken from request allocation until request completion.
> >>>
> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107274
> >>> Fixes: 9512f985c32d ("drm/i915/execlists: Direct submission of new requests (avoid tasklet/ksoftirqd)")
> >>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >>> ---
> >>> drivers/gpu/drm/i915/intel_lrc.c | 25 +++++++++++--------------
> >>> 1 file changed, 11 insertions(+), 14 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> >>> index db5351e6a3a5..9d693e61536c 100644
> >>> --- a/drivers/gpu/drm/i915/intel_lrc.c
> >>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> >>> @@ -452,6 +452,16 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
> >>> struct execlist_port *port = execlists->port;
> >>> unsigned int n;
> >>>
> >>> + /*
> >>> + * We can skip acquiring intel_runtime_pm_get() here as it was taken
> >>> + * on our behalf by the request (see i915_gem_mark_busy()) and it will
> >>> + * not be relinquished until the device is idle (see
> >>> + * i915_gem_idle_work_handler()). As a precaution, we make sure
> >>> + * that all ELSP are drained i.e. we have processed the CSB,
> >>> + * before allowing ourselves to idle and calling intel_runtime_pm_put().
> >>> + */
> >>> + GEM_BUG_ON(!engine->i915->gt.awake);
> >>> +
> >>> /*
> >>> * ELSQ note: the submit queue is not cleared after being submitted
> >>> * to the HW so we need to make sure we always clean it up. This is
> >>> @@ -1043,16 +1053,6 @@ static void __execlists_submission_tasklet(struct intel_engine_cs *const engine)
> >>> {
> >>> lockdep_assert_held(&engine->timeline.lock);
> >>>
> >>> - /*
> >>> - * We can skip acquiring intel_runtime_pm_get() here as it was taken
> >>> - * on our behalf by the request (see i915_gem_mark_busy()) and it will
> >>> - * not be relinquished until the device is idle (see
> >>> - * i915_gem_idle_work_handler()). As a precaution, we make sure
> >>> - * that all ELSP are drained i.e. we have processed the CSB,
> >>> - * before allowing ourselves to idle and calling intel_runtime_pm_put().
> >>> - */
> >>> - GEM_BUG_ON(!engine->i915->gt.awake);
> >>> -
> >>> process_csb(engine);
> >>> if (!execlists_is_active(&engine->execlists, EXECLISTS_ACTIVE_PREEMPT))
> >>> execlists_dequeue(engine);
> >>> @@ -1073,10 +1073,7 @@ static void execlists_submission_tasklet(unsigned long data)
> >>> engine->execlists.active);
> >>>
> >>> spin_lock_irqsave(&engine->timeline.lock, flags);
> >>> -
> >>> - if (engine->i915->gt.awake) /* we may be delayed until after we idle! */
> >>> - __execlists_submission_tasklet(engine);
> >>> -
> >>> + __execlists_submission_tasklet(engine);
> >>> spin_unlock_irqrestore(&engine->timeline.lock, flags);
> >>> }
> >>>
> >>>
> >>
> >> Why we won't hit the assert on the elsp submit side now? AFAIR the
> >> discussion about this particular line concluded that direct tasklet call
> >> can race with busy->idle transition. So even is process_csb doens't need
> >> the assert, that part I get, the part about the race now again bothers
> >> me. Perhaps I just forgot what I thought I understood back then.. :(
> >
> > Same race, I just didn't think it through that it could change within
> > the space of a couple of lines. :|
> >
> >> Should this call process_csb only if !gt.awake? But sounds terribly
> >> dodgy.. Why would execlists.active be set if we are not awake..
> >
> > Have to remember it's i915->gt.awake no execlists->active (that's what I
> > briefly hoped for...) I looked at ways we might decouple the tasklet (we
> > can't just use tasklet_disable) but that looks overkill, and I can't see
> > any way we can guarantee that we won't randomly kick it later.
> >
> > I did consider reordering the final wait_for(engine_is_idle()) and
> > tasklet_flush, but I couldn't convince myself that was enough to
> > guarantee we wouldn't get an even later interrupt to kick the tasklet.
> >
> > So left it at just using the CSB to filter out the spurious call, and
> > relying on that we are idle to avoid the submit. The assertion still
> > plays the same role as it always has done, making sure the actual
> > register access is covered by our wakeref.
>
> So the thing I am thinking of, and you tell me if it is not the one:
>
> 1. Idle work handler runs
> 2. Goes to idle the engines
> 3. New request comes in from a separate thread
> 4. intel_engine_is_idle sees execlist.active is set
> 5. Calls the tasklet
>
> But gt.awake is true at this point, so assert does not fire.
>
> If the status of execlists.active changes from true to false between the
> check in 4 and tasklet kick in 5. Okay, but in this case how did we pass
> the the gt.awake and gt.active_requests checks at the top of the idle
> work handler?
active_requests is protected by struct_mutex, so stable for idle_worker
deciding to set gt.awake=false.
The assert we have is all about gt.awake and the theory was that after
parking the engine we wouldn't get another tasklet. That's the theory
full of holes.
> The whole state change has to happen in the middle of idle work handler
> I guess. Transition from !awake to awake before the
> intel_engines_are_idle_loop, and actually process the csb in the middle
> of the intel_engine_is_idle checks.
>
> Okay, think I answered my own question. My mindset was to serialized in
> this instance. :)
Same answer, so yup. The challenge being getting that serialisation with
the interrupt handler solid. Or just be prepared that we get the
occasional late tasklet_schedule and let it nop away.
-Chris
More information about the Intel-gfx
mailing list