[Intel-gfx] [PATCH 02/10] drm/i915: Adjust PM QoS response frequency based on GPU load.

Chris Wilson chris at chris-wilson.co.uk
Wed Mar 11 10:21:27 UTC 2020


Quoting Tvrtko Ursulin (2020-03-11 10:00:41)
> 
> On 10/03/2020 22:26, Chris Wilson wrote:
> > Quoting Francisco Jerez (2020-03-10 21:41:55)
> >>   static inline void
> >> @@ -2386,6 +2397,9 @@ static void process_csb(struct intel_engine_cs *engine)
> >>                          /* port0 completed, advanced to port1 */
> >>                          trace_ports(execlists, "completed", execlists->active);
> >>   
> >> +                       if (atomic_xchg(&execlists->overload, 0))
> >> +                               intel_gt_pm_active_end(&engine->i915->gt);
> > 
> > So this loses track if we preempt a dual-ELSP submission with a
> > single-ELSP submission (and never go back to dual).
> > 
> > If you move this to the end of the loop and check
> > 
> > if (!execlists->active[1] && atomic_xchg(&execlists->overload, 0))
> >       intel_gt_pm_active_end(engine->gt);
> > 
> > so that it covers both preemption/promotion and completion.
> > 
> > However, that will fluctuate quite rapidly. (And runs the risk of
> > exceeding the sentinel.)
> > 
> > An alternative approach would be to couple along
> > schedule_in/schedule_out
> > 
> > atomic_set(overload, -1);
> > 
> > __execlists_schedule_in:
> >       if (!atomic_fetch_inc(overload))
> >               intel_gt_pm_active_begin(engine->gt);
> > __execlists_schedule_out:
> >       if (!atomic_dec_return(overload))
> >               intel_gt_pm_active_end(engine->gt);
> > 
> > which would mean we are overloaded as soon as we try to submit an
> > overlapping ELSP.
> 
> Putting it this low-level into submission code also would not work well 
> with GuC.

We can cross that bridge when it is built. [The GuC is also likely to
not want to play with us anyway, and just use SLPC.]

Now, I suspect we may want to use an engine utilisation (busy-stats or
equivalent) metric, but honestly if we can finally land this work it
brings huge benefit for GPU-bound, TDP-constrained workloads. (p-state
loves to starve the GPU even when it provides no extra benefit for the
CPU.) We can raise the bar, establish expected behaviour and then work
to maintain and keep on improving.
-Chris
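
For readers following the quoted schedule_in/schedule_out suggestion, here is a minimal userspace sketch of that counting scheme. It is not the i915 code. The names engine_model, gt_pm_model, model_schedule_in and model_schedule_out are hypothetical stand-ins, C11 <stdatomic.h> replaces the kernel atomic_t helpers, and the begin/end functions only count calls, since the real intel_gt_pm_active_begin()/intel_gt_pm_active_end() are defined elsewhere in the series. The counter starts at -1, so the first context scheduled in takes it to 0 (not overloaded) and only a second, overlapping context triggers "begin".

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the GT PM state: it only records calls. */
struct gt_pm_model {
	int active;	/* currently inside an "overloaded" period? */
	int begins;	/* number of begin calls seen */
	int ends;	/* number of end calls seen */
};

/* Hypothetical stand-in for the per-engine state. */
struct engine_model {
	atomic_int overload;	/* -1: idle, 0: one context, >= 1: overlapping */
	struct gt_pm_model *gt;
};

static void gt_pm_active_begin(struct gt_pm_model *gt)
{
	gt->active++;
	gt->begins++;
}

static void gt_pm_active_end(struct gt_pm_model *gt)
{
	gt->active--;
	gt->ends++;
}

static void engine_model_init(struct engine_model *e, struct gt_pm_model *gt)
{
	atomic_init(&e->overload, -1);	/* mirrors atomic_set(overload, -1) */
	e->gt = gt;
}

static void model_schedule_in(struct engine_model *e)
{
	/* Like atomic_fetch_inc(): returns the value before the increment.
	 * An old value of 0 means one context was already scheduled in. */
	if (atomic_fetch_add(&e->overload, 1) == 0)
		gt_pm_active_begin(e->gt);
}

static void model_schedule_out(struct engine_model *e)
{
	/* Like atomic_dec_return(): compute the value after the decrement. */
	int now = atomic_fetch_sub(&e->overload, 1) - 1;

	assert(now >= -1);	/* unbalanced schedule_out would underflow */
	if (now == 0)		/* back down to a single context */
		gt_pm_active_end(e->gt);
}
```

With this model, a single in/out pair never reports overload, while two overlapping schedule_in calls produce exactly one begin, matched by one end when the count drops back to a single context. The sketch is single-threaded on the gt_pm_model side. Only the counter itself is atomic.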

