[PATCH 75/76] RFC drm/i915: Load balancing across a virtual engine

Chris Wilson chris at chris-wilson.co.uk
Wed Jun 6 10:48:03 UTC 2018


Quoting Tvrtko Ursulin (2018-06-06 11:27:02)
> 
> On 06/06/2018 10:28, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-06-06 10:16:00)
> >>
> >> On 02/06/2018 10:38, Chris Wilson wrote:
> >>> Having allowed the user to define a set of engines that they will want
> >>> to only use, we go one step further and allow them to bind those engines
> >>> into a single virtual instance. Submitting a batch to the virtual engine
> >>> will then forward it to any one of the set in a manner that best
> >>> distributes load.  The virtual engine has a single timeline across all
> >>> engines (it operates as a single queue), so it is not able to concurrently
> >>> run batches across multiple engines by itself; that is left up to the user
> >>> to submit multiple concurrent batches to multiple queues. Multiple users
> >>> will be load balanced across the system.
> >>>
> >>> The mechanism used for load balancing in this patch is a late greedy
> >>> balancer. When a request is ready for execution, it is added to each
> >>> engine's queue, and when an engine is ready for its next request it
> >>> claims it from the virtual engine. The first engine to do so wins, i.e.
> >>> the request is executed at the earliest opportunity (idle moment) in the
> >>> system.
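
For reference, the shape of that late-greedy mechanism is roughly the
following (a sketch only; ve_submit_notify, ve->request and ve->count are
illustrative names, not necessarily what the patch uses):

	static void ve_submit_notify(struct virtual_engine *ve,
				     struct i915_request *rq)
	{
		unsigned int n;

		/* Single timeline: at most one request is pending. */
		ve->request = rq;

		/*
		 * Kick every sibling's submission tasklet; whichever
		 * engine dequeues first claims the request, i.e. it
		 * runs at the earliest idle moment in the system.
		 */
		for (n = 0; n < ve->count; n++)
			tasklet_hi_schedule(&ve->siblings[n]->execlists.tasklet);
	}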
> >>>
> >>> As not all HW is created equal, the user is still able to skip the
> >>> virtual engine and execute the batch on a specific engine, all within the
> >>> same queue. It will then be executed in order on the correct engine,
> >>> with execution on other virtual engines being moved away due to the load
> >>> detection.
> >>>
> >>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >>>
> >>> Opens:
> >>>    - virtual takes priority
> >>>    - rescheduling after being gazumped
> >>>    - eliminating the irq
> >>> ---
> >>>    drivers/gpu/drm/i915/i915_gem.h            |   5 +
> >>>    drivers/gpu/drm/i915/i915_gem_context.c    |  81 ++++-
> >>>    drivers/gpu/drm/i915/i915_request.c        |   2 +-
> >>>    drivers/gpu/drm/i915/intel_engine_cs.c     |   3 +-
> >>>    drivers/gpu/drm/i915/intel_lrc.c           | 393 ++++++++++++++++++++-
> >>>    drivers/gpu/drm/i915/intel_lrc.h           |   6 +
> >>>    drivers/gpu/drm/i915/intel_ringbuffer.h    |   9 +
> >>>    drivers/gpu/drm/i915/selftests/intel_lrc.c | 177 ++++++++++
> >>>    include/uapi/drm/i915_drm.h                |  27 ++
> >>>    9 files changed, 697 insertions(+), 6 deletions(-)
> >>>
> >>
> >> [snip]
> >>
> >>> +struct intel_engine_cs *
> >>> +intel_execlists_create_virtual(struct i915_gem_context *ctx,
> >>> +                            struct intel_engine_cs **siblings,
> >>> +                            unsigned int count)
> >>> +{
> >>> +     struct virtual_engine *ve;
> >>> +     unsigned int n;
> >>> +     int err;
> >>> +
> >>> +     if (!count)
> >>> +             return ERR_PTR(-EINVAL);
> >>> +
> >>> +     ve = kzalloc(sizeof(*ve) + count * sizeof(*ve->siblings), GFP_KERNEL);
> >>> +     if (!ve)
> >>> +             return ERR_PTR(-ENOMEM);
> >>> +
> >>> +     kref_init(&ve->kref);
> >>> +     ve->base.i915 = ctx->i915;
> >>> +     ve->base.id = -1;
> >>
> >> 1)
> >>
> >> I had the idea to add a new engine virtual class, and set instances to
> >> real classes:
> >>
> >>          ve->(uabi_)class = <CLASS_VIRTUAL>;
> >>          ve->instance = parent->class;
> >>
> >> That would work fine in tracepoints (just need to remap class to uabi
> >> class for virtual engines).
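
A minimal sketch of that mapping (CLASS_VIRTUAL and UABI_CLASS_VIRTUAL are
hypothetical placeholders):

	/*
	 * The virtual engine advertises a reserved class; its instance
	 * records the real class of the siblings it balances over.
	 */
	ve->base.class = CLASS_VIRTUAL;
	ve->base.instance = sibling->class; /* e.g. VIDEO_DECODE_CLASS */

	/* tracepoints then remap to a uabi class for userspace */
	u8 class = ve->base.class == CLASS_VIRTUAL ?
		   UABI_CLASS_VIRTUAL : ve->base.uabi_class;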
> > 
> > Though conceptually it may be bonkers, are we ever going to be able to
> > mix classes? e.g. veng over bcs+vcs for very simple testcases like wsim.
> 
> I have settled for no mixing of classes. We started with that, apart from
> a period where you wanted to mix, but then you also changed your mind, no?

Only for ease of implementation; pretty much we can allow mixing xcs,
just not rcs with anything else (for some value of "easily"). It all
depends on which instructions and context images are interchangeable. My
dream was to basically let the user do whatever they want :(
 
> > For simplicity, set ve->uabi_class = VIRTUAL, allowing us to use ve->class
> > to make our lives easier. Also we would need to reserve the id?
> 
> engine->id can remain -1 I think.
> 
> Not sure what you were aiming at with uabi_class and class. If we
> reserve a virtual class in both namespaces, do you see a problem?

Once upon a time ve->class was being used to determine registers to load
into the context image.
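
i.e. anything keyed off engine->class when building the image would pick
the wrong state for a virtual engine, along these lines (sketch only;
init_rcs_reg_state/init_xcs_reg_state are made-up names):

	if (engine->class == RENDER_CLASS)
		init_rcs_reg_state(regs, engine);
	else
		init_xcs_reg_state(regs, engine);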
 
> > Just trying to strike the right balance for the restrictions.
> >   
> >> 2)
> >>
> >> And I think it would also work for the queued PMU. I was thinking of
> >> exporting virtual classes as vcs-* nodes, in comparison to the current
> >> vcs0-busy.
> >>
> >> vcs-queued/runnable/running would then contain aggregated counts for all
> >> virtual engines, while the vcsN-queued/.../... would contain only
> >> non-virtual engine counts.
> > 
> > It's just finding them :)
> > 
> > i915->gt.class[].virtual_list.
> > 
> > Or just i915->gt.class[].engine_list and skip non-virtual.
> 
> Not sure if I would have to find them. I was hoping the existing "hooks" 
> would do the accounting for me by virtue of something like:
> 
>         if (engine->class == VIRTUAL)
>                 atomic_inc(&i915->virtual_stats->runnable);
>         else
>                 atomic_inc(&engine->request_stats->runnable);
> 
> Etc..

No :-p

I'd rather push the work into the accumulator, i.e. when the user tries
to read the pmu event. I guess that it will be read far, far less often
than written to.
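
Something like this (a sketch, all names hypothetical; it assumes the
per-class virtual_list mentioned above):

	static u64 virtual_class_runnable(struct drm_i915_private *i915,
					  unsigned int class)
	{
		struct virtual_engine *ve;
		u64 sum = 0;

		/*
		 * Walk the virtual engines of this class only when the
		 * pmu event is actually read, rather than bumping a
		 * counter on every request transition.
		 */
		list_for_each_entry(ve,
				    &i915->gt.class[class].virtual_list,
				    link)
			sum += atomic_read(&ve->stats.runnable);

		return sum;
	}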
-Chris

