[Intel-gfx] [PATCH 08/16] drm/i915: Unify active context tracking between legacy/execlists/guc

Fri Dec 9 12:17:35 UTC 2016

On Fri, Dec 09, 2016 at 11:48:15AM +0000, Tvrtko Ursulin wrote:
> 
> On 07/12/2016 13:58, Chris Wilson wrote:
> >The requests conversion introduced a nasty bug where we could generate a
> >new request in the middle of constructing a request. The new request
> >would be executed (and waited upon) before the current one, creating a
> >minor havoc in the seqno accounting. (Prior to deferred seqno
> >assignment, this would mean that the seqno would be out of order, and
> >the current request would be deemed complete even before it was
> >submitted.)
> 
> How exactly it can generate and submit a request while creating a
> request? It would be good to describe it here.

i915_switch_context can create a request in the middle of servicing the
request. It used to cause the seqno to advance upsetting the tracking
(causing a chance that this the actual request is retired early),
just fortunate that there are no earlier instructions to mess up the
ring->tail.

> >We also employed two different mechanisms to track the active context
> >until it was switched out. The legacy method allowed for waiting upon an
> >active context (it could forcibly evict any vma, including context's),
> >but the execlists method took a step backwards by pinning the vma for
> >the entire active lifespan of the context (the only way to evict was to
> >idle the entire GPU, not individual contexts). However, to circumvent
> 
> Because of the pin in request creation?

Because of the unpin in retirement.

Legacy, pin in create, unpin on the next request.
Execlists, pin in create, unpin on retirment.

To unbind a context in legacy, just requires selecting it for eviction.
To unbind a context in execlists, requires idling the GPU.

Now both require idling the GPU, which is more of an issue for the
shared GTT in legacy - but hopefully the contexts were the last victims
anyway.

Hmm. Probably wise to restore the retire at the start of eviction.
We removed it because we had faith in our inactive tracking, however we
know have stray pins to clear.

> >+	/* The backing object for the context is done after switching to the
> >+	 * *next* context. Therefore we cannot retire the previous context until
> >+	 * the next context has already started running. However, since we
> >+	 * cannot take the required locks at i915_gem_request_submit() we
> >+	 * defer the unpinning of the active context to now, retirement of
> >+	 * the subsequent request.
> >+	 */
> >+	if (engine->last_context)
> >+		engine->context_unpin(engine, engine->last_context);
> >+	engine->last_context = request->ctx;
> 
> Would it be worth renaming this member to last_retired_context for clarity?

Definitely makes it clearer.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre