[Intel-gfx] [PATCH 2/2] drm/i915: Track the previous pinned context inside the request

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Wed Apr 20 14:08:19 UTC 2016


On 19/04/16 13:59, Chris Wilson wrote:
> As the contexts are accessed by the hardware until the switch is completed
> to a new context, the hardware may still be writing to the context object
> after the breadcrumb is visible. We must not unpin/unbind/prune that
> object whilst still active and so we keep the previous context pinned until
> the following request. If we move this tracking onto the request, we can
> simplify the code and treat execlists/GuC dispatch identically.
>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_drv.h  | 11 +++++++++++
>   drivers/gpu/drm/i915/i915_gem.c  |  8 ++++----
>   drivers/gpu/drm/i915/intel_lrc.c | 17 ++++++++---------
>   3 files changed, 23 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index c59b2670cc36..be98e9643072 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2302,6 +2302,17 @@ struct drm_i915_gem_request {
>   	struct intel_context *ctx;
>   	struct intel_ringbuffer *ringbuf;
>
> +	/**
> +	 * Context related to the previous request.
> +	 * As the contexts are accessed by the hardware until the switch is
> +	 * completed to a new context, the hardware may still be writing
> +	 * to the context object after the breadcrumb is visible. We must
> +	 * not unpin/unbind/prune that object whilst still active and so
> +	 * we keep the previous context pinned until the following (this)
> +	 * request is retired.
> +	 */
> +	struct intel_context *previous_context;
> +
>   	/** Batch buffer related to this request if any (used for
>   	    error state dump only) */
>   	struct drm_i915_gem_object *batch_obj;
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 9b4854a17264..537aacfda3eb 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1413,13 +1413,13 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
>   	list_del_init(&request->list);
>   	i915_gem_request_remove_from_client(request);
>
> -	if (request->ctx) {
> +	if (request->previous_context) {
>   		if (i915.enable_execlists)
> -			intel_lr_context_unpin(request->ctx, request->engine);
> -
> -		i915_gem_context_unreference(request->ctx);
> +			intel_lr_context_unpin(request->previous_context,
> +					       request->engine);
>   	}
>
> +	i915_gem_context_unreference(request->ctx);
>   	i915_gem_request_unreference(request);
>   }
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index ee4e9bb80042..06e013293ec6 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -590,7 +590,6 @@ static void execlists_context_queue(struct drm_i915_gem_request *request)
>   	struct drm_i915_gem_request *cursor;
>   	int num_elements = 0;
>
> -	intel_lr_context_pin(request->ctx, request->engine);

I really really think this must go in a separate, subsequent patch.

Both from the conceptual side, so that this patch only extends the 
pinning and does not also limit it; and because there is a bug unless a 
patch like the one I pasted in yesterday is inserted between the two 
("drm/i915: Store LRC hardware id in the context"; note that summary is 
wrong, it stores the id in requests, not contexts, so I will have to 
rename it).

Otherwise the access to head_req->ctx in execlists_check_remove_request 
is a use-after-free, and I can demonstrate that easily via 
gem-close-race. Put a WARN_ON(atomic_read(&head_req->ctx->ref.refcount) == 0); 
in there and see. :)
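
Roughly where I mean; I am paraphrasing the surrounding code from 
memory so it may not match the tree exactly, but the check itself is 
verbatim:

	static bool execlists_check_remove_request(struct intel_engine_cs *engine,
						   u32 request_id)
	{
		struct drm_i915_gem_request *head_req;

		assert_spin_locked(&engine->execlist_lock);

		head_req = list_first_entry_or_null(&engine->execlist_queue,
						    struct drm_i915_gem_request,
						    execlist_link);
		if (head_req == NULL)
			return false;

		/* Fires if the context backing head_req has already been
		 * unpinned and freed while a CSB event still refers to it. */
		WARN_ON(atomic_read(&head_req->ctx->ref.refcount) == 0);

		/* ... rest of the function unchanged ... */
	}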

What I think happens is that, with two submission ports, two context 
completions can be aggregated into a single interrupt which arrives 
only after GEM has consumed the seqnos for both requests and unpinned 
their LRCs.
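
As I read it (my reconstruction of the sequence, not something I have 
traced event by event):

 1. Requests A and B, in different contexts, occupy both ELSP ports.
 2. Both complete and their seqnos become visible, but the CSB 
    interrupt carrying the two context-switch events is coalesced and 
    not yet serviced.
 3. GEM retires both requests off the breadcrumb, which with this 
    patch drops the last reference on their contexts even though the 
    requests still sit on the execlist queue.
 4. The coalesced interrupt is finally serviced and 
    execlists_check_remove_request dereferences head_req->ctx, which 
    has already been freed.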

But with your persistent ctx hw id patches on top, I think it is fine 
to go this route, including the complete elimination of the execlist 
retired queue.
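
The idea in my interim patch, roughly (the field name below is 
illustrative, not verbatim from it): stash the hardware id in the 
request at submission time, while the context is guaranteed alive, so 
the CSB handler never has to chase head_req->ctx at all:

	/* At submission, request->ctx is pinned and cannot go away: */
	request->ctx_hw_id = intel_execlists_ctx_id(request->ctx, engine);

	/* In execlists_check_remove_request, compare the stashed id
	 * instead of dereferencing a possibly-freed context: */
	if (head_req != NULL && head_req->ctx_hw_id == request_id)
		/* ... remove head_req from the queue ... */;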

You can just drop the two chunks from this patch (the 
execlists_context_queue and intel_execlists_retire_requests hunks) and 
I will follow up with two patches to finish it all off.

>   	i915_gem_request_reference(request);
>
>   	spin_lock_bh(&engine->execlist_lock);
> @@ -788,12 +787,14 @@ intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
>   	if (intel_engine_stopped(engine))
>   		return 0;
>
> -	if (engine->last_context != request->ctx) {
> -		if (engine->last_context)
> -			intel_lr_context_unpin(engine->last_context, engine);
> -		intel_lr_context_pin(request->ctx, engine);
> -		engine->last_context = request->ctx;
> -	}
> +	/* We keep the previous context alive until we retire the following
> +	 * request. This ensures that the context object is still pinned
> +	 * for any residual writes the HW makes into it on the context switch
> +	 * into the next object following the breadcrumb. Otherwise, we may
> +	 * retire the context too early.
> +	 */
> +	request->previous_context = engine->last_context;
> +	engine->last_context = request->ctx;
>
>   	if (dev_priv->guc.execbuf_client)
>   		i915_guc_submit(dev_priv->guc.execbuf_client, request);
> @@ -1015,8 +1016,6 @@ void intel_execlists_retire_requests(struct intel_engine_cs *engine)
>   	spin_unlock_bh(&engine->execlist_lock);
>
>   	list_for_each_entry_safe(req, tmp, &retired_list, execlist_link) {
> -		intel_lr_context_unpin(req->ctx, engine);
> -
>   		list_del(&req->execlist_link);
>   		i915_gem_request_unreference(req);
>   	}
>

Regards,

Tvrtko


