[Intel-gfx] [PATCH 15/18] drm/i915: Load balancing across a virtual engine
Chris Wilson
chris at chris-wilson.co.uk
Thu Mar 21 15:28:55 UTC 2019
Quoting Tvrtko Ursulin (2019-03-21 15:13:59)
>
> On 21/03/2019 15:00, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-03-20 15:59:14)
> >>
> >> On 19/03/2019 11:57, Chris Wilson wrote:
> >>> +static void virtual_submission_tasklet(unsigned long data)
> >>> +{
> >>> + struct virtual_engine * const ve = (struct virtual_engine *)data;
> >>> + unsigned int n;
> >>> + int prio;
> >>> +
> >>> + prio = READ_ONCE(ve->base.execlists.queue_priority_hint);
> >>> + if (prio == INT_MIN)
> >>> + return;
> >>
> >> When does it hit this return?
> >
> > If the tasklet runs when I don't expect it to. Should be never indeed.
>
> GEM_BUG_ON then, or a GEM_WARN_ON to start with?
>
> > At least with bonding it becomes something a bit more tangible.
>
> Hmm how so?
Instead of checking prio == INT_MIN, we compute the execution mask, which
can end up being 0 due to user carelessness.
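Roughly, what I have in mind for the bonding version is a helper along
these lines (rq->execution_mask and the error fallback are speculative,
not code from this series):

static unsigned int virtual_submission_mask(struct virtual_engine *ve)
{
	struct i915_request *rq;
	unsigned int mask;

	/* The tasklet may run after the request has already been taken. */
	rq = READ_ONCE(ve->request);
	if (!rq)
		return 0;

	/* Speculative field: the set of siblings the user allows. */
	mask = rq->execution_mask;
	if (unlikely(!mask)) {
		/*
		 * User carelessness (e.g. an empty intersection of bonds)
		 * leaves nowhere to run; fall back to any sibling so the
		 * request can still be flushed through with an error.
		 */
		mask = ve->siblings[0]->mask;
	}

	return mask;
}

and then the tasklet checks the returned mask instead of the priority
hint.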
> >>> + local_irq_disable();
> >>> + for (n = 0; READ_ONCE(ve->request) && n < ve->count; n++) {
> >>> + struct intel_engine_cs *sibling = ve->siblings[n];
> >>> + struct ve_node * const node = &ve->nodes[sibling->id];
> >>> + struct rb_node **parent, *rb;
> >>> + bool first;
> >>> +
> >>> + spin_lock(&sibling->timeline.lock);
> >>> +
> >>> + if (!RB_EMPTY_NODE(&node->rb)) {
> >>> + /*
> >>> + * Cheat and avoid rebalancing the tree if we can
> >>> + * reuse this node in situ.
> >>> + */
> >>> + first = rb_first_cached(&sibling->execlists.virtual) ==
> >>> + &node->rb;
> >>> + if (prio == node->prio || (prio > node->prio && first))
> >>> + goto submit_engine;
> >>> +
> >>> + rb_erase_cached(&node->rb, &sibling->execlists.virtual);
> >>
> >> How does the cheat work, exactly? Does it avoid inserting into the tree
> >> in some cases? And how does the real tasklet find this node then?
> >
> > It's already in the sibling->execlists.virtual, so we don't need to
> > remove the node and reinsert it. So when we kick the sibling's tasklet
> > it is right there.
>
> So the cheat bit is just the prio and first check and the erase on
> non-empty node is normal operation?
Yes. Since we have to update the priority, we have to rebalance the tree,
and given this is an rbtree, that means remove & insert. (If the node
isn't already in the tree, we skip straight to the insert.)
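For completeness, the fall-through after the erase (or when the node was
not queued at all) is just the usual cached-rbtree walk, reconstructed
here rather than quoted from the patch:

	rb = NULL;
	first = true;
	parent = &sibling->execlists.virtual.rb_root.rb_node;
	while (*parent) {
		struct ve_node *other;

		rb = *parent;
		other = rb_entry(rb, typeof(*other), rb);
		if (prio > other->prio) {
			parent = &rb->rb_left;
		} else {
			parent = &rb->rb_right;
			first = false; /* no longer the leftmost node */
		}
	}

	rb_link_node(&node->rb, rb, parent);
	rb_insert_color_cached(&node->rb, &sibling->execlists.virtual, first);

The cheat is purely about skipping this walk when the node can stay where
it is.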
> >>> + intel_context_init(&ve->context, ctx, &ve->base);
> >>> +
> >>> + for (n = 0; n < count; n++) {
> >>> + struct intel_engine_cs *sibling = siblings[n];
> >>> +
> >>> + GEM_BUG_ON(!is_power_of_2(sibling->mask));
> >>> + if (sibling->mask & ve->base.mask)
> >>> + continue;
> >>
> >> Continuing from the previous round - should we -EINVAL if two of the
> >> same are found in the map? Since we are going to silently drop it,
> >> perhaps better to disallow.
> >
> > Could do. I have no strong use case that expects to be able to handle
> > the user passing in (vcs1, vcs2, vcs2).
> >
> > The really important question is: do we always create a virtual engine
> > even when wrapping a single physical engine?
> >
> > The more I ask myself, the more I think the answer is yes. (Primarily so
> > that we can create multiple instances of the same engine with different
> > logical contexts and sseu configurations. Oh flip, i915_perf needs
> > updating to find virtual contexts.) It's just that submitting a stream of
> > nops to a virtual engine is 3x as expensive as submitting to a real
> > engine.
>
> Hmmm I think we do need to. Or mandate a single timeline before a veng
> can be created. Otherwise I think we break the multi-timeline contract of
> each timeline having its own GPU context.
Yeah, following that to its logical conclusion, each entry in
ctx->engines[] should be an independent instance. That makes sense,
and is what I would expect (if I put 2 rcs0 into engines[], then I want
2 rcs contexts!).
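To put that in userspace terms, a rough sketch (the uapi names here are
from the in-flight engines[] series and purely illustrative; fd and
ctx_id are assumed to be an open i915 fd and an existing context):

	struct i915_engine_class_instance rcs0 = {
		.engine_class = I915_ENGINE_CLASS_RENDER,
		.engine_instance = 0,
	};
	struct {
		__u64 extensions;
		struct i915_engine_class_instance engines[2];
	} set_engines = {
		.engines = { rcs0, rcs0 },	/* same engine listed twice */
	};
	struct drm_i915_gem_context_param p = {
		.ctx_id = ctx_id,
		.param = I915_CONTEXT_PARAM_ENGINES,
		.size = sizeof(set_engines),
		.value = (uintptr_t)&set_engines,
	};

	/* Each engines[] slot gets its own logical context, state and sseu. */
	ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &p);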
> > It's just that you mentioned that userspace ended up wrapping everything
> > inside a virtual engine willy-nilly; that spells trouble.
>
> We can always say -EINVAL to that since it hardly makes sense. If they
> want contexts they can create real ones with ppgtt sharing.
I have a plan for lightweight logical engines.
-Chris