[PATCH 75/76] RFC drm/i915: Load balancing across a virtual engine
Chris Wilson
chris at chris-wilson.co.uk
Wed Jun 6 09:28:59 UTC 2018
Quoting Tvrtko Ursulin (2018-06-06 10:16:00)
>
> On 02/06/2018 10:38, Chris Wilson wrote:
> > Having allowed the user to define a set of engines that they will want
> > to only use, we go one step further and allow them to bind those engines
> > into a single virtual instance. Submitting a batch to the virtual engine
> > will then forward it to any one of the set in a manner as best to
> > distribute load. The virtual engine has a single timeline across all
> > engines (it operates as a single queue), so it is not able to concurrently
> > run batches across multiple engines by itself; that is left up to the user
> > to submit multiple concurrent batches to multiple queues. Multiple users
> > will be load balanced across the system.
> >
> > The mechanism used for load balancing in this patch is a late greedy
> > balancer. When a request is ready for execution, it is added to each
> > engine's queue, and when an engine is ready for its next request it
> > claims it from the virtual engine. The first engine to do so, wins, i.e.
> > the request is executed at the earliest opportunity (idle moment) in the
> > system.
> >
> > As not all HW is created equal, the user is still able to skip the
> > virtual engine and execute the batch on a specific engine, all within the
> > same queue. It will then be executed in order on the correct engine,
> > with execution on other virtual engines being moved away due to the load
> > detection.
> >
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >
> > Opens:
> > - virtual takes priority
> > - rescheduling after being gazumped
> > - eliminating the irq
> > ---
> > drivers/gpu/drm/i915/i915_gem.h | 5 +
> > drivers/gpu/drm/i915/i915_gem_context.c | 81 ++++-
> > drivers/gpu/drm/i915/i915_request.c | 2 +-
> > drivers/gpu/drm/i915/intel_engine_cs.c | 3 +-
> > drivers/gpu/drm/i915/intel_lrc.c | 393 ++++++++++++++++++++-
> > drivers/gpu/drm/i915/intel_lrc.h | 6 +
> > drivers/gpu/drm/i915/intel_ringbuffer.h | 9 +
> > drivers/gpu/drm/i915/selftests/intel_lrc.c | 177 ++++++++++
> > include/uapi/drm/i915_drm.h | 27 ++
> > 9 files changed, 697 insertions(+), 6 deletions(-)
> >
>
> [snip]
>
> > +struct intel_engine_cs *
> > +intel_execlists_create_virtual(struct i915_gem_context *ctx,
> > +			       struct intel_engine_cs **siblings,
> > +			       unsigned int count)
> > +{
> > +	struct virtual_engine *ve;
> > +	unsigned int n;
> > +	int err;
> > +
> > +	if (!count)
> > +		return ERR_PTR(-EINVAL);
> > +
> > +	ve = kzalloc(sizeof(*ve) + count * sizeof(*ve->siblings), GFP_KERNEL);
> > +	if (!ve)
> > +		return ERR_PTR(-ENOMEM);
> > +
> > +	kref_init(&ve->kref);
> > +	ve->base.i915 = ctx->i915;
> > +	ve->base.id = -1;
>
> 1)
>
> I had the idea to add a new engine virtual class, and set instances to
> real classes:
>
> ve->(uabi_)class = <CLASS_VIRTUAL>;
> ve->instance = parent->class;
>
> That would work fine in tracepoints (just need to remap class to uabi
> class for virtual engines).
Though conceptually it may be bonkers, are we ever going to be able to
mix classes? e.g. veng over bcs+vcs for very simple testcases like wsim.
For simplicity, set ve->uabi_class = VIRTUAL, allowing us to use ve->class
to make our lives easier. Also, would we need to reserve the id?
Just trying to strike the right balance for the restrictions.
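
To make the single-class restriction concrete, here is a minimal userspace
sketch, not the patch itself. Every name in it (sketch_class, sketch_engine,
sketch_init_virtual, SKETCH_VIRTUAL and its value) is hypothetical and only
stands in for whatever the driver would end up using. It assumes a virtual
engine keeps the real class of its siblings in ->class, reports a reserved
VIRTUAL value in ->uabi_class, and refuses a set that mixes classes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical class values; SKETCH_VIRTUAL is an assumed reserved id. */
enum sketch_class {
	SKETCH_RENDER = 0,
	SKETCH_COPY,
	SKETCH_VIDEO,
	SKETCH_VIDEO_ENHANCE,
	SKETCH_VIRTUAL = 0xff,
};

struct sketch_engine {
	enum sketch_class class;      /* real HW class, shared by all siblings */
	enum sketch_class uabi_class; /* what userspace and tracepoints would see */
	int instance;
};

/*
 * Fill in a virtual engine over @count siblings.
 * Returns false if the set is empty or mixes classes.
 */
static bool sketch_init_virtual(struct sketch_engine *ve,
				const struct sketch_engine *const *siblings,
				size_t count)
{
	size_t n;

	if (!count)
		return false;

	/* Restriction under discussion: all siblings share one class. */
	for (n = 1; n < count; n++)
		if (siblings[n]->class != siblings[0]->class)
			return false;

	ve->class = siblings[0]->class;         /* keeps internal code simple */
	ve->uabi_class = SKETCH_VIRTUAL;        /* reserved virtual class */
	ve->instance = (int)siblings[0]->class; /* instance = parent class */
	return true;
}
```

With this layout a tracepoint only has to look at ->uabi_class to spot a
virtual engine, and ->instance tells it which real class sits underneath.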
> 2)
>
> And I think it would also work for queued pmu. I was thinking to export
> virtual classes as vcs-* nodes, in comparison to current vcs0-busy.
>
> vcs-queued/runnable/running would then contain aggregated counts for all
> virtual engines, while the vcsN-queued/.../... would contain only non
> virtual engine counts.
It's just finding them :)
i915->gt.class[].virtual_list.
Or just i915->gt.class[].engine_list and skip non-virtual.
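
The second option (walk the per-class engine list and skip non-virtual
entries) could look roughly like the userspace sketch below. The types and
names (sketch_counts, sketch_node, sketch_sum_virtual) are made up for
illustration, and a plain singly linked list stands in for the kernel's
list_head; the real counters and list layout are not defined in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-engine counters matching the proposed pmu nodes. */
struct sketch_counts {
	unsigned int queued;
	unsigned int runnable;
	unsigned int running;
};

struct sketch_node {
	bool is_virtual;
	struct sketch_counts counts;
	struct sketch_node *next; /* stand-in for a kernel list_head */
};

/* Sum the counters of every virtual engine on one class's engine list. */
static struct sketch_counts
sketch_sum_virtual(const struct sketch_node *engine_list)
{
	struct sketch_counts sum = { 0, 0, 0 };
	const struct sketch_node *e;

	for (e = engine_list; e; e = e->next) {
		if (!e->is_virtual)
			continue; /* physical engines report via vcsN-* instead */

		sum.queued += e->counts.queued;
		sum.runnable += e->counts.runnable;
		sum.running += e->counts.running;
	}

	return sum;
}
```

A dedicated virtual_list would drop the is_virtual test at the cost of
maintaining a second list per class.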
> It is a tiny bit hackish but we still get to export GPU load so sounds
> okay to me.
Seems a reasonable argument. Restricting veng to one class, then being
able to summarise all vengs as one super-virtual instance, sounds like a
reasonable trade-off and selling point.
-Chris