[Intel-gfx] [PATCH] drm/i915: Neuter virtual rq->engine on retire
Chris Wilson
chris at chris-wilson.co.uk
Tue May 19 17:00:04 UTC 2020
Quoting Chris Wilson (2020-05-19 15:51:31)
> We do not hold a reference to rq->engine, and so if it is a virtual
> engine it may have already been freed by the time we free the request.
> The last reference we hold on the virtual engine is via rq->context,
> and that is released on request retirement. So if we find ourselves
> retiring a virtual request, redirect it to a real sibling.
>
> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1906
> Fixes: 43acd6516ca9 ("drm/i915: Keep a per-engine request pool")
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> ---
> drivers/gpu/drm/i915/i915_request.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 31ef683d27b4..a816218cc693 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -242,9 +242,26 @@ static void remove_from_engine(struct i915_request *rq)
> spin_lock(&engine->active.lock);
> locked = engine;
> }
> +
> list_del_init(&rq->sched.link);
> clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
> clear_bit(I915_FENCE_FLAG_HOLD, &rq->fence.flags);
> +
> + /*
> + * During i915_fence_release we stash one request on the
> + * rq->engine for use as an emergency reserve. However, we
> + * neither want to keep a request on a virtual engine, nor do
> + * we hold a reference to a virtual engine at that point. So
> + * if rq->engine is virtual, replace it with a real one. Which
> + * one is immaterial at this point as the request has been
> + * retired, and if it was a virtual engine will not have any
> + * signaling or other related paraphernalia.
> + *
> + * However, it would be nice if we didn't have to...
> + */
> + if (intel_engine_is_virtual(rq->engine))
Hmm. execlists_dequeue will assert that rq->engine == veng before
finding out that the request was completed. Annoyingly we would need
some veng magic to cmpxchg(&ve->request, rq, NULL)
> + rq->engine = intel_virtual_engine_get_sibling(rq->engine, 0);
Back to the drawing board for a bit. Although removing the assert might
be the easiest course of action.
-Chris
More information about the Intel-gfx
mailing list