[Intel-gfx] [PATCH] drm/i915/execlists: Move the assertion we have the rpm wakeref down

Chris Wilson chris at chris-wilson.co.uk
Thu Jul 19 12:23:10 UTC 2018


Quoting Tvrtko Ursulin (2018-07-19 13:14:38)
> 
> On 19/07/2018 12:59, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-07-19 12:49:13)
> >>
> >> On 19/07/2018 08:50, Chris Wilson wrote:
> >>> There's a race between idling the engine and finishing off the last
> >>> tasklet (as we may kick the tasklets after declaring an individual
> >>> engine idle). However, since we do not need to access the device until
> >>> we try to submit to the ELSP register (processing the CSB just requires
> >>> normal CPU access to the HWSP, and when idle we should not need to
> >>> submit!) we can defer the assertion unto that point. The assertion is
> >>> still useful as it does verify that we do hold the longterm GT wakeref
> >>> taken from request allocation until request completion.
> >>>
> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107274
> >>> Fixes: 9512f985c32d ("drm/i915/execlists: Direct submission of new requests (avoid tasklet/ksoftirqd)")
> >>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >>> ---
> >>>    drivers/gpu/drm/i915/intel_lrc.c | 25 +++++++++++--------------
> >>>    1 file changed, 11 insertions(+), 14 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> >>> index db5351e6a3a5..9d693e61536c 100644
> >>> --- a/drivers/gpu/drm/i915/intel_lrc.c
> >>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> >>> @@ -452,6 +452,16 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
> >>>        struct execlist_port *port = execlists->port;
> >>>        unsigned int n;
> >>>    
> >>> +     /*
> >>> +      * We can skip acquiring intel_runtime_pm_get() here as it was taken
> >>> +      * on our behalf by the request (see i915_gem_mark_busy()) and it will
> >>> +      * not be relinquished until the device is idle (see
> >>> +      * i915_gem_idle_work_handler()). As a precaution, we make sure
> >>> +      * that all ELSP are drained i.e. we have processed the CSB,
> >>> +      * before allowing ourselves to idle and calling intel_runtime_pm_put().
> >>> +      */
> >>> +     GEM_BUG_ON(!engine->i915->gt.awake);
> >>> +
> >>>        /*
> >>>         * ELSQ note: the submit queue is not cleared after being submitted
> >>>         * to the HW so we need to make sure we always clean it up. This is
> >>> @@ -1043,16 +1053,6 @@ static void __execlists_submission_tasklet(struct intel_engine_cs *const engine)
> >>>    {
> >>>        lockdep_assert_held(&engine->timeline.lock);
> >>>    
> >>> -     /*
> >>> -      * We can skip acquiring intel_runtime_pm_get() here as it was taken
> >>> -      * on our behalf by the request (see i915_gem_mark_busy()) and it will
> >>> -      * not be relinquished until the device is idle (see
> >>> -      * i915_gem_idle_work_handler()). As a precaution, we make sure
> >>> -      * that all ELSP are drained i.e. we have processed the CSB,
> >>> -      * before allowing ourselves to idle and calling intel_runtime_pm_put().
> >>> -      */
> >>> -     GEM_BUG_ON(!engine->i915->gt.awake);
> >>> -
> >>>        process_csb(engine);
> >>>        if (!execlists_is_active(&engine->execlists, EXECLISTS_ACTIVE_PREEMPT))
> >>>                execlists_dequeue(engine);
> >>> @@ -1073,10 +1073,7 @@ static void execlists_submission_tasklet(unsigned long data)
> >>>                  engine->execlists.active);
> >>>    
> >>>        spin_lock_irqsave(&engine->timeline.lock, flags);
> >>> -
> >>> -     if (engine->i915->gt.awake) /* we may be delayed until after we idle! */
> >>> -             __execlists_submission_tasklet(engine);
> >>> -
> >>> +     __execlists_submission_tasklet(engine);
> >>>        spin_unlock_irqrestore(&engine->timeline.lock, flags);
> >>>    }
> >>>    
> >>>
> >>
> >> Why we won't hit the assert on the elsp submit side now? AFAIR the
> >> discussion about this particular line concluded that direct tasklet call
> >> can race with busy->idle transition. So even is process_csb doens't need
> >> the assert, that part I get, the part about the race now again bothers
> >> me. Perhaps I just forgot what I thought I understood back then.. :(
> > 
> > Same race, I just didn't think it through that it could change within
> > the space of a couple of lines. :|
> > 
> >> Should this call process_csb only if !gt.awake? But sounds terribly
> >> dodgy.. Why would execlists.active be set if we are not awake..
> > 
> > Have to remember it's i915->gt.awake no execlists->active (that's what I
> > briefly hoped for...) I looked at ways we might decouple the tasklet (we
> > can't just use tasklet_disable) but that looks overkill, and I can't see
> > any way we can guarantee that we won't randomly kick it later.
> > 
> > I did consider reordering the final wait_for(engine_is_idle()) and
> > tasklet_flush, but I couldn't convince myself that was enough to
> > guarantee we wouldn't get an even later interrupt to kick the tasklet.
> > 
> > So left it at just using the CSB to filter out the spurious call, and
> > relying on that we are idle to avoid the submit. The assertion still
> > plays the same role as it always has done, making sure the actual
> > register access is covered by our wakeref.
> 
> So the thing I am thinking of, and you tell me if it is not the one:
> 
> 1. Idle work handler runs
> 2. Goes to idle the engines
> 3.     New request comes in from a separate thread
> 4. intel_engine_is_idle sees execlist.active is set
> 5. Calls the tasklet
> 
> But gt.awake is true at this point, so assert does not fire.
> 
> If the status of execlists.active changes from true to false between the 
> check in 4 and tasklet kick in 5. Okay, but in this case how did we pass 
> the the gt.awake and gt.active_requests checks at the top of the idle 
> work handler?

active_requests is protected by struct_mutex, so stable for idle_worker
deciding to set gt.awake=false.

The assert we have is all about gt.awake and the theory was that after
parking the engine we wouldn't get another tasklet. That's the theory
full of holes.

> The whole state change has to happen in the middle of idle work handler 
> I guess. Transition from !awake to awake before the 
> intel_engines_are_idle_loop, and actually process the csb in the middle 
> of the intel_engine_is_idle checks.
> 
> Okay, think I answered my own question. My mindset was to serialized in 
> this instance. :)

Same answer, so yup. The challenge being getting that serialisation with
the interrupt handler solid. Or just be prepared that we get the
occasional late tasklet_schedule and let it nop away.
-Chris


More information about the Intel-gfx mailing list