[Intel-gfx] [PATCH 10/13] drm/i915: Nonblocking request submission

Fri Aug 26 16:14:23 UTC 2016

On Fri, Aug 26, 2016 at 02:39:01PM +0100, John Harrison wrote:
> On 25/08/2016 10:08, Chris Wilson wrote:
> >Now that we have fences in place to drive request submission, we can
> >employ those to queue requests after their dependencies as opposed to
> >stalling in the middle of an execbuf ioctl. (However, we still choose to
> >spin before enabling the IRQ as that is faster - though contentious.)
> >
> >Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >---
> >  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 10 +++++++---
> >  drivers/gpu/drm/i915/i915_gem_request.c    | 14 +++++++++++++-
> >  2 files changed, 20 insertions(+), 4 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> >index b7cc158733ad..104c713ec681 100644
> >--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> >+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> >@@ -1136,9 +1136,13 @@ __eb_sync(struct drm_i915_gem_request *to,
> >  	trace_i915_gem_ring_sync_to(to, from);
> >  	if (!i915.semaphores) {
> >-		ret = i915_wait_request(from, true, NULL, NO_WAITBOOST);
> >-		if (ret)
> >-			return ret;
> >+		if (!i915_spin_request(from, TASK_INTERRUPTIBLE, 2)) {
> >+			ret = i915_sw_fence_await_dma_fence(&to->submit,
> >+							    &from->fence,
> Why not use 'i915_sw_fence_await_sw_fence(from->submit)' as below?
> Or conversely, why not use '_await_dma_fence(prev->fence)' below?

On the same engine the requests are in execution order and so once the
first is ready to submit so are its dependents. This extends naturally
to timelines. Between engines, we have to wait until the request is
complete before we can submit the second (note this applies to the
!semaphore branch).

> >+							    GFP_KERNEL);
> >+			if (ret < 0)
> >+				return ret;
> >+		}
> >  	} else {
> >  		ret = to->engine->semaphore.sync_to(to, from);
> >  		if (ret)
> >diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> >index db45482ea194..d3477dabb534 100644
> >--- a/drivers/gpu/drm/i915/i915_gem_request.c
> >+++ b/drivers/gpu/drm/i915/i915_gem_request.c
> >@@ -345,7 +345,7 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
> >  {
> >  	struct drm_i915_private *dev_priv = engine->i915;
> >  	unsigned int reset_counter = i915_reset_counter(&dev_priv->gpu_error);
> >-	struct drm_i915_gem_request *req;
> >+	struct drm_i915_gem_request *req, *prev;
> >  	u32 seqno;
> >  	int ret;
> >@@ -441,6 +441,18 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
> >  	 */
> >  	req->head = req->ring->tail;
> >+	prev = i915_gem_active_peek(&engine->last_request,
> >+				    &req->i915->drm.struct_mutex);
> >+	if (prev) {
> >+		ret = i915_sw_fence_await_sw_fence(&req->submit,
> >+						   &prev->submit,
> >+						   GFP_KERNEL);
> Is this only required to guarantee seqno ordering?

Correct. We only have a single global timeline at the moment. So
requests are ordered within that timeline.

> If the previous request is not sharing any objects, context, etc.
> then there is no fundamental reason why this request should be
> dependent upon the last one.

Indeed. This becomes timeline->last_request and then at the time of
execution (when all dependencies are resolved) do we know its global
ordering - which remains important for being able to keep fast wakeups,
retirements etc. With preemptions being an exception and generally a wart
on the side of everything.
-chris

-- 
Chris Wilson, Intel Open Source Technology Centre