[Intel-gfx] [PATCH 9/9] drm/i915/gt: Schedule request retirement when timeline idles
Chris Wilson
chris at chris-wilson.co.uk
Wed Nov 20 14:25:28 UTC 2019
Quoting Tvrtko Ursulin (2019-11-20 14:16:17)
>
> On 20/11/2019 13:39, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-11-20 13:16:51)
> >>
> >> On 20/11/2019 09:33, Chris Wilson wrote:
> >>> +
> >>> +		/*
> >>> +		 * Our goal here is to retire _idle_ timelines as soon as
> >>> +		 * possible (as they are idle, we do not expect userspace
> >>> +		 * to be cleaning up anytime soon).
> >>> +		 *
> >>> +		 * If the tl->active_count is already zero, someone else
> >>> +		 * should have retired the timeline. Equally if the timeline
> >>> +		 * is currently locked, either it is being retired elsewhere
> >>> +		 * or about to be!
> >>> +		 */
> >>> +		if (atomic_read(&tl->active_count) &&
> >>> +		    mutex_trylock(&tl->mutex)) {
> >>> +			retire_requests(tl);
> >>> +			mutex_unlock(&tl->mutex);
> >>> +		}
> >>> +		intel_timeline_put(tl);
> >>> +
> >>> +		GEM_BUG_ON(!next);
> >>> +		tl = ptr_mask_bits(next, 1);
> >>
> >> You sometimes expect engine->retire to contain 0x1?
> >
> > Yes, imagine that we are submitting so fast that we schedule_out
> > the same context before the worker has run; we would then try to
> > add_retire() the same timeline again. So I was using BIT(0) to tag an
> > active element in the retirement list.
>
> If we have schedule_out on the same context before the retire ran, then
> add_retire is a no-op due to tl->retire being set.
The first element in the list has timeline->retire = NULL, so we could not
tell that it was already queued; we only know it is on the list because we
make it timeline->retire = 1 (NULL | BIT(0)) instead.
> It could be a race between engine_retire and add_retire where
> engine->retire is set to NULL so the latter would set tl->retire to
> NULL | BIT(0).
>
> But BIT(0) is only filtered out and not acted upon anywhere so I am
> still lost.
>
> So maybe from the opposite angle, what goes wrong if you drop all BIT(0)
> business?
You insert the element into the list twice, causing a loop. Now that loop
is short-lived, as we clear timeline->retire on processing, but we may
still end up dropping a reference twice.
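
To make that concrete, here is a minimal, self-contained userspace sketch of
the tag-bit trick (C11 stdatomic standing in for try_cmpxchg/ptr_pack_bits;
the names mimic the i915 code but none of this is the kernel implementation).
Without the low bit, the first node pushed onto an empty list has next == 0
and looks exactly like an unqueued node, so a later schedule_out could push
it, and later drop its reference, a second time:

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct node {
	_Atomic uintptr_t next;	/* next-pointer value, BIT(0) set while queued */
};

static _Atomic uintptr_t list_head;

static bool add_node(struct node *n)
{
	uintptr_t first;

	/* As in the patch, this check relies on the caller owning the node */
	if (atomic_load(&n->next))	/* already queued: the tag keeps this non-zero */
		return false;

	first = atomic_load(&list_head);
	do {	/* pack the tag into bit 0, like ptr_pack_bits(first, 1, 1) */
		atomic_store(&n->next, first | 1);
	} while (!atomic_compare_exchange_weak(&list_head, &first, (uintptr_t)n));

	return !first;	/* true if we transitioned the list from empty */
}

static void consume(void)
{
	struct node *n = (struct node *)atomic_exchange(&list_head, 0);

	while (n) {
		/* claim the node so that it may be queued again later */
		uintptr_t next = atomic_exchange(&n->next, 0);

		/* ... retire the timeline and drop its reference here ... */

		n = (struct node *)(next & ~(uintptr_t)1);	/* ptr_mask_bits(next, 1) */
	}
}

int main(void)
{
	struct node a = { 0 };

	if (!add_node(&a) || add_node(&a))	/* duplicate push must be rejected */
		return 1;
	consume();
	if (!add_node(&a))	/* may be queued again once consumed */
		return 1;
	printf("tag-bit llist sketch ok\n");
	return 0;
}

Building that with gcc -std=c11 and running it exercises exactly the
duplicate-push case above: the tagged tail keeps next non-zero, so the
second add is refused and only one reference is ever dropped.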
>
> >
> >>> +	} while (tl);
> >>> +}
> >>> +
> >>> +static bool add_retire(struct intel_engine_cs *engine,
> >>> +		       struct intel_timeline *tl)
> >>> +{
> >>> +	struct intel_timeline *first = READ_ONCE(engine->retire);
> >>> +
> >>> +	/*
> >>> +	 * We open-code a llist here to include the additional tag [BIT(0)]
> >>> +	 * so that we know when the timeline is already on a
> >>> +	 * retirement queue: either this engine or another.
> >>> +	 *
> >>> +	 * However, we rely on the fact that a timeline can only be active
> >>> +	 * on a single engine at any one time, and that add_retire() is
> >>> +	 * called before the engine releases the timeline and it is
> >>> +	 * transferred to another engine to retire.
> >>> +	 */
> >>> +
> >>> +	if (READ_ONCE(tl->retire)) /* already queued */
> >>
> >> Can't this go first in the function?
> >
> > Conceptually it is. And I made it so because I also decided against
> > having the READ_ONCE() at the top.
>
> But READ_ONCE is at the top, just a different one which is discarded if
> this one is NULL.
static bool add_retire(struct intel_engine_cs *engine,
		       struct intel_timeline *tl)
{
	struct intel_timeline *first;

	/*
	 * We open-code a llist here to include the additional tag [BIT(0)]
	 * so that we know when the timeline is already on a
	 * retirement queue: either this engine or another.
	 *
	 * However, we rely on the fact that a timeline can only be active
	 * on a single engine at any one time, and that add_retire() is
	 * called before the engine releases the timeline and it is
	 * transferred to another engine to retire.
	 */

	if (READ_ONCE(tl->retire)) /* already queued */
		return false;

	if (!atomic_read(&tl->active_count)) /* already retired */
		return false;

	intel_timeline_get(tl);
	first = READ_ONCE(engine->retire);
	do
		tl->retire = ptr_pack_bits(first, 1, 1);
	while (!try_cmpxchg(&engine->retire, &first, tl));

	return !first;
}
is the current incarnation.
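
For completeness, the consumer that pairs with it is the hunk quoted at the
top of this mail; the reconstruction below is only a sketch, and the
container_of()/worker hookup, the retire_work name and the two xchg()es are
my guess at the surrounding code rather than anything quoted here:

static void engine_retire(struct work_struct *work)
{
	struct intel_engine_cs *engine =
		container_of(work, typeof(*engine), retire_work);
	struct intel_timeline *tl = xchg(&engine->retire, NULL);

	do {
		/* Claim the element so that it may be queued again */
		struct intel_timeline *next = xchg(&tl->retire, NULL);

		/* Retire only if still active and uncontended */
		if (atomic_read(&tl->active_count) &&
		    mutex_trylock(&tl->mutex)) {
			retire_requests(tl);
			mutex_unlock(&tl->mutex);
		}
		intel_timeline_put(tl);

		/* BIT(0) guarantees next is never NULL for a queued element */
		GEM_BUG_ON(!next);
		tl = ptr_mask_bits(next, 1);
	} while (tl);
}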
> >>> +
> >>> +	intel_timeline_get(tl);
> >>> +	do
> >>> +		tl->retire = ptr_pack_bits(first, 1, 1);
> >>
> >> Here you rely on assignment being atomic right?
> >
> > Ish. Here we rely on the timeline being owned by the engine so it cannot
> > be submitted by another (and so schedule_out called) until this engine
> > has released it.
> >
> > It is a weak point for generality, but the ordering is strong in
> > execlists.
> >
> >>> +	while (!try_cmpxchg(&engine->retire, &first, tl));
> >>
> >> So the loop is effectively creating a chain of timelines to retire on
> >> this engine.
> >>
> >> What happens with virtual engines (well, or any single-timeline
> >> context) when a timeline goes to a different engine before the retire
> >> worker runs? Ah okay, it gets re-assigned to the most recent engine.
> >
> > Right. The engine_retire() doesn't care which engine the timelines were
> > run on; it's just a list of suspected idle timelines.
> >
> >> I am not sure about the BIT(0) business. It's always set on write so I
> >> am not getting why it is useful.
> >
> > It's also set to 0 on consumption :)
>
> Another thing - asserting that engine->retire is NULL after flush_work in
> intel_engine_fini_retire could be useful.
Yup, wilco.
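
Something along these lines, presumably (a sketch only; the
intel_engine_fini_retire and retire_work names follow the discussion above,
not the final patch):

void intel_engine_fini_retire(struct intel_engine_cs *engine)
{
	flush_work(&engine->retire_work);

	/* Everything queued must have been consumed by the final flush */
	GEM_BUG_ON(engine->retire);
}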
-Chris