[Intel-gfx] [PATCH 01/37] drm/i915/gem: Reduce context termination list iteration guard to RCU

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Wed Aug 5 15:02:02 UTC 2020


On 05/08/2020 13:21, Chris Wilson wrote:
> As we now protect the timeline list using RCU, we can drop the
> timeline->mutex for guarding the list iteration during context close, as
> we are searching for an inflight request. Any new request will see the
> context is banned and not be submitted. In doing so, pull the checks for
> a concurrent submission of the request (notably the
> i915_request_completed()) under the engine spinlock, to fully serialise
> with __i915_request_submit()). That is in the case of preempt-to-busy
> where the request may be completed during the __i915_request_submit(),
> we need to be careful that we sample the request status after
> serialising so that we don't miss the request the engine is actually
> submitting.
> 
> Fixes: 4a3174152147 ("drm/i915/gem: Refine occupancy test in kill_context()")
> References: d22d2d073ef8 ("drm/i915: Protect i915_request_await_start from early waits") # rcu protection of timeline->requests
> References: https://gitlab.freedesktop.org/drm/intel/-/issues/1622
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/gem/i915_gem_context.c | 32 ++++++++++++---------
>   1 file changed, 19 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index d8cccbab7a51..db893f6c516b 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -439,29 +439,36 @@ static bool __cancel_engine(struct intel_engine_cs *engine)
>   	return __reset_engine(engine);
>   }
>   
> -static struct intel_engine_cs *__active_engine(struct i915_request *rq)
> +static bool
> +__active_engine(struct i915_request *rq, struct intel_engine_cs **active)
>   {
>   	struct intel_engine_cs *engine, *locked;
> +	bool ret = false;
>   
>   	/*
>   	 * Serialise with __i915_request_submit() so that it sees
>   	 * is-banned?, or we know the request is already inflight.
> +	 *
> +	 * Note that rq->engine is unstable, and so we double
> +	 * check that we have acquired the lock on the final engine.
>   	 */
>   	locked = READ_ONCE(rq->engine);
>   	spin_lock_irq(&locked->active.lock);
>   	while (unlikely(locked != (engine = READ_ONCE(rq->engine)))) {
>   		spin_unlock(&locked->active.lock);
> -		spin_lock(&engine->active.lock);
>   		locked = engine;
> +		spin_lock(&locked->active.lock);
>   	}
>   
> -	engine = NULL;
> -	if (i915_request_is_active(rq) && rq->fence.error != -EIO)
> -		engine = rq->engine;
> +	if (!i915_request_completed(rq)) {
> +		if (i915_request_is_active(rq) && rq->fence.error != -EIO)
> +			*active = locked;
> +		ret = true;

So not completed but also not submitted will return true and no engine..

> +	}
>   
>   	spin_unlock_irq(&locked->active.lock);
>   
> -	return engine;
> +	return ret;
>   }
>   
>   static struct intel_engine_cs *active_engine(struct intel_context *ce)
> @@ -472,17 +479,16 @@ static struct intel_engine_cs *active_engine(struct intel_context *ce)
>   	if (!ce->timeline)
>   		return NULL;
>   
> -	mutex_lock(&ce->timeline->mutex);
> -	list_for_each_entry_reverse(rq, &ce->timeline->requests, link) {
> -		if (i915_request_completed(rq))
> -			break;
> +	rcu_read_lock();
> +	list_for_each_entry_rcu(rq, &ce->timeline->requests, link) {
> +		if (i915_request_is_active(rq) && i915_request_completed(rq))
> +			continue;
>   
>   		/* Check with the backend if the request is inflight */
> -		engine = __active_engine(rq);
> -		if (engine)
> +		if (__active_engine(rq, &engine))
>   			break;

... hence the caller of this will say no action. Because not active 
means not submitted so that's okay and matches old behaviour. Need for 
bool return and output engine looks a consequence of iterating the list 
in different direction.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>

Regards,

Tvrtko

>   	}
> -	mutex_unlock(&ce->timeline->mutex);
> +	rcu_read_unlock();
>   
>   	return engine;
>   }
> 


More information about the Intel-gfx mailing list