[Intel-gfx] [PATCH 06/20] drm/i915/gt: Remove virtual breadcrumb before transfer

Matthew Brost matthew.brost at intel.com
Thu Dec 10 17:28:00 UTC 2020


On Mon, Dec 07, 2020 at 07:38:10PM +0000, Chris Wilson wrote:
> The issue with stale virtual breadcrumbs remain. Now we have the problem
> that if the irq-signaler is still referencing the stale breadcrumb as we
> transfer it to a new sibling, the list becomes spaghetti. This is a very
> small window, but that doesn't stop it being hit infrequently. To
> prevent the lists being tangled (the iterator starting on one engine's
> b->signalers but walking onto another list), always decouple the virtual
> breadcrumb on schedule-out and make sure that the walker has stepped out
> of the lists.
> 
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>

Makes sense to me.
Reviewed-by: Matthew Brost <matthew.brost at intel.com>

> ---
>  drivers/gpu/drm/i915/gt/intel_breadcrumbs.c |  5 +++--
>  drivers/gpu/drm/i915/gt/intel_lrc.c         | 15 +++++++++++++++
>  2 files changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> index 00918300f53f..63900edbde88 100644
> --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> @@ -454,15 +454,16 @@ void i915_request_cancel_breadcrumb(struct i915_request *rq)
>  {
>  	struct intel_breadcrumbs *b = READ_ONCE(rq->engine)->breadcrumbs;
>  	struct intel_context *ce = rq->context;
> +	unsigned long flags;
>  	bool release;
>  
>  	if (!test_and_clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags))
>  		return;
>  
> -	spin_lock(&ce->signal_lock);
> +	spin_lock_irqsave(&ce->signal_lock, flags);
>  	list_del_rcu(&rq->signal_link);
>  	release = remove_signaling_context(b, ce);
> -	spin_unlock(&ce->signal_lock);
> +	spin_unlock_irqrestore(&ce->signal_lock, flags);
>  	if (release)
>  		intel_context_put(ce);
>  
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 0b7e86240f3b..b3db16b2a5a4 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -1388,6 +1388,21 @@ static inline void execlists_schedule_in(struct i915_request *rq, int idx)
>  static void kick_siblings(struct i915_request *rq, struct intel_context *ce)
>  {
>  	struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
> +	struct intel_engine_cs *engine = rq->engine;
> +
> +	/* Flush concurrent rcu iterators in signal_irq_work */
> +	if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &rq->fence.flags)) {
> +		/*
> +		 * After this point, the rq may be transferred to a new
> +		 * sibling, so before we clear ce->inflight make sure that
> +		 * the context has been removed from the b->signalers and
> +		 * furthermore we need to make sure that the concurrent
> +		 * iterator in signal_irq_work is no longer following
> +		 * ce->signal_link.
> +		 */
> +		i915_request_cancel_breadcrumb(rq);
> +		irq_work_sync(&engine->breadcrumbs->irq_work);
> +	}
>  
>  	if (READ_ONCE(ve->request))
>  		tasklet_hi_schedule(&ve->base.execlists.tasklet);
> -- 
> 2.20.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx


More information about the Intel-gfx mailing list