[Intel-gfx] [PATCH 4/5] drm/i915/gt: Adopt engine_park synchronisation rules for engine_retire
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Thu Nov 21 14:42:56 UTC 2019
On 21/11/2019 13:51, Chris Wilson wrote:
> In the next patch, we will introduce a new asynchronous retirement
> worker, fed by execlists CS events. Here we may queue a retirement as
> soon as a request is submitted to HW (and completes instantly), and we
> also want to process that retirement as early as possible and cannot
> afford to postpone it (as there may not be another opportunity to retire
> it for a few seconds). To allow the new async retirer to run in parallel
> with our submission, pull the __i915_request_queue (that passes the
> request to HW) inside the timelines spinlock so that the retirement
> cannot release the timeline before we have completed the submission.
>
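For context, the retirement path that this has to serialise against looks
roughly like the below (a condensed paraphrase of intel_gt_retire_requests()
as it stands around this series; helper names are assumed, and reference
pinning and error handling are elided):

	spin_lock(&timelines->lock);
	list_for_each_entry_safe(tl, tn, &timelines->active_list, link) {
		if (!mutex_trylock(&tl->mutex))
			continue;

		/* Pin the list element, then drop the lock to retire */
		atomic_inc(&tl->active_count);
		spin_unlock(&timelines->lock);

		retire_requests(tl); /* may drop tl->active_count */

		spin_lock(&timelines->lock);
		list_safe_reset_next(tl, tn, link);
		if (atomic_dec_and_test(&tl->active_count))
			list_del(&tl->link);
		mutex_unlock(&tl->mutex);
	}
	spin_unlock(&timelines->lock);

Both this walk and the queueing in the patch below now contend on
timelines->lock, which is what lets the submission run in parallel with
retirement without either side underflowing the counters.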
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> ---
> drivers/gpu/drm/i915/gt/intel_engine_pm.c | 29 ++++++++++++++++-------
> 1 file changed, 21 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> index 373a4b9f159c..bd0af02bea16 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> @@ -74,18 +74,33 @@ static inline void __timeline_mark_unlock(struct intel_context *ce,
> #endif /* !IS_ENABLED(CONFIG_LOCKDEP) */
>
> static void
> -__intel_timeline_enter_and_release_pm(struct intel_timeline *tl,
> - struct intel_engine_cs *engine)
> +__queue_and_release_pm(struct i915_request *rq,
> + struct intel_timeline *tl,
> + struct intel_engine_cs *engine)
> {
> struct intel_gt_timelines *timelines = &engine->gt->timelines;
>
> + /*
> + * We have to serialise all potential retirement paths with our
> + * submission, as we don't want to underflow either the
> + * engine->wakeref.counter or our timeline->active_count.
> + *
> + * Equally, we cannot allow a new submission to start until
> + * after we finish queueing, nor can we allow that submitter
> + * to retire us before we are ready!
> + */
> spin_lock(&timelines->lock);
>
> - if (!atomic_fetch_inc(&tl->active_count))
> - list_add_tail(&tl->link, &timelines->active_list);
> + /* Hand the request over to HW, and thus to engine_retire() */
> + __i915_request_queue(rq, NULL);
>
> + /* Let new submissions commence (and maybe retire this timeline) */
> __intel_wakeref_defer_park(&engine->wakeref);
>
> + /* Let intel_gt_retire_requests() retire us */
> + if (!atomic_fetch_inc(&tl->active_count))
> + list_add_tail(&tl->link, &timelines->active_list);
> +
> spin_unlock(&timelines->lock);
Now that everything is under the lock, does the order of operations still
matter, or is it no longer important?
Regards,
Tvrtko
> }
>
> @@ -148,10 +163,8 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
> rq->sched.attr.priority = I915_PRIORITY_BARRIER;
> __i915_request_commit(rq);
>
> - __i915_request_queue(rq, NULL);
> -
> - /* Expose ourselves to intel_gt_retire_requests() and new submission */
> - __intel_timeline_enter_and_release_pm(ce->timeline, engine);
> + /* Expose ourselves to the world */
> + __queue_and_release_pm(rq, ce->timeline, engine);
>
> result = false;
> out_unlock:
>
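To make the ordering question concrete, the critical section now reads,
step by step (the per-step rationale below is my reading of the patch's
own comments, not a claim about intent):

	spin_lock(&timelines->lock);

	/*
	 * 1. Queue first: the barrier request must reach HW while the
	 *    engine wakeref is still held, so the engine cannot be
	 *    parked (nor the request retired) before it is submitted.
	 */
	__i915_request_queue(rq, NULL);

	/*
	 * 2. Only then release the deferred park: new submitters may
	 *    now proceed, but any retirement they trigger still
	 *    contends on timelines->lock and so waits for the unlock.
	 */
	__intel_wakeref_defer_park(&engine->wakeref);

	/*
	 * 3. Expose the timeline on the active_list; other walkers
	 *    cannot observe it until the unlock either way, so inside
	 *    the critical section steps 2 and 3 look interchangeable;
	 *    hence the question above.
	 */
	if (!atomic_fetch_inc(&tl->active_count))
		list_add_tail(&tl->link, &timelines->active_list);

	spin_unlock(&timelines->lock);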