[Intel-gfx] [PATCH 06/19] drm/i915/gt: Schedule request retirement when submission idles

Tue Nov 19 16:20:54 UTC 2019

Quoting Tvrtko Ursulin (2019-11-19 15:04:46)
> 
> On 18/11/2019 23:02, Chris Wilson wrote:
> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > index 33ce258d484f..f7c8fec436a9 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > @@ -142,6 +142,7 @@
> >   #include "intel_engine_pm.h"
> >   #include "intel_gt.h"
> >   #include "intel_gt_pm.h"
> > +#include "intel_gt_requests.h"
> >   #include "intel_lrc_reg.h"
> >   #include "intel_mocs.h"
> >   #include "intel_reset.h"
> > @@ -2278,6 +2279,18 @@ static void execlists_submission_tasklet(unsigned long data)
> >               if (timeout && preempt_timeout(engine))
> >                       preempt_reset(engine);
> >       }
> > +
> > +     /*
> > +      * If the GPU is currently idle, retire the outstanding completed
> > +      * requests. This will allow us to enter soft-rc6 as soon as possible,
> > +      * albeit at the cost of running the retire worker much more frequently
> > +      * (over the entire GT not just this engine) and emitting more idle
> > +      * barriers (i.e. kernel context switches unpinning all that went
> > +      * before) which may add some extra latency.
> > +      */
> > +     if (intel_engine_pm_is_awake(engine) &&
> > +         !execlists_active(&engine->execlists))
> > +             intel_gt_schedule_retire_requests(engine->gt);
> 
> I am still not a fan of doing this for all platforms.

I understand. I think it makes a fair amount of sense to do early
retires, and wish to pursue that if I can show there is no harm.

> Its not just the cost of retirement but there is 
> intel_engine_flush_submission on all engines in there as well which we 
> cannot avoid triggering from this path.
> 
> Would it be worth experimenting with additional per-engine retire 
> workers? Most of the code could be shared, just a little bit of 
> specialization to filter on engine.

I haven't sketched out anything more than peeking at the last request on
the timeline and doing a rq->engine == engine filter. Walking the global
timeline.active_list in that case is also a nuisance.

There's definitely scope here for us using some more information from
process_csb() about which context completed and limit work to that
timeline. Hmm, something along those lines maybe...
-Chris