[Intel-gfx] [PATCH v2 03/11] drm/i915: Defer transfer onto execution timeline to actual hw submission
Chris Wilson
chris at chris-wilson.co.uk
Thu Nov 10 14:43:49 UTC 2016
On Thu, Nov 10, 2016 at 11:51:27AM +0000, Tvrtko Ursulin wrote:
>
> On 10/11/2016 11:11, Chris Wilson wrote:
> >On Thu, Nov 10, 2016 at 10:43:29AM +0000, Tvrtko Ursulin wrote:
> >>
> >>On 07/11/2016 13:59, Chris Wilson wrote:
> >>>Defer the transfer from the client's timeline onto the execution
> >>>timeline from the point of readiness to the point of actual submission.
> >>>For example, in execlists, a request is finally submitted to hardware
> >>>when the hardware is ready, and only put onto the hardware queue when
> >>>the request is ready. By deferring the transfer, we ensure that the
> >>>timeline is maintained in retirement order if we decide to queue the
> >>>requests onto the hardware in a different order than fifo.
> >>>
> >>>v2: Rebased onto distinct global/user timeline lock classes.
> >>>
> >>>Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >>>---
> >>>drivers/gpu/drm/i915/i915_gem_request.c | 31 +++++++++++++++++-------------
> >>>drivers/gpu/drm/i915/i915_gem_request.h | 2 ++
> >>>drivers/gpu/drm/i915/i915_guc_submission.c | 14 +++++++++++++-
> >>>drivers/gpu/drm/i915/intel_lrc.c | 23 +++++++++++++---------
> >>>drivers/gpu/drm/i915/intel_ringbuffer.c | 2 ++
> >>>5 files changed, 49 insertions(+), 23 deletions(-)
> >>>
> >>>diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> >>>index e41d51a68ed8..19c29fafb07a 100644
> >>>--- a/drivers/gpu/drm/i915/i915_gem_request.c
> >>>+++ b/drivers/gpu/drm/i915/i915_gem_request.c
> >>>@@ -307,25 +307,16 @@ static u32 timeline_get_seqno(struct i915_gem_timeline *tl)
> >>> return atomic_inc_return(&tl->next_seqno);
> >>>}
> >>>
> >>>-static int __i915_sw_fence_call
> >>>-submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
> >>>+void __i915_gem_request_submit(struct drm_i915_gem_request *request)
> >>>{
> >>>- struct drm_i915_gem_request *request =
> >>>- container_of(fence, typeof(*request), submit);
> >>> struct intel_engine_cs *engine = request->engine;
> >>> struct intel_timeline *timeline;
> >>>- unsigned long flags;
> >>> u32 seqno;
> >>>
> >>>- if (state != FENCE_COMPLETE)
> >>>- return NOTIFY_DONE;
> >>>-
> >>> /* Transfer from per-context onto the global per-engine timeline */
> >>> timeline = engine->timeline;
> >>> GEM_BUG_ON(timeline == request->timeline);
> >>>-
> >>>- /* Will be called from irq-context when using foreign DMA fences */
> >>>- spin_lock_irqsave(&timeline->lock, flags);
> >>>+ assert_spin_locked(&timeline->lock);
> >>>
> >>> seqno = timeline_get_seqno(timeline->common);
> >>> GEM_BUG_ON(!seqno);
> >>>@@ -345,15 +336,29 @@ submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
> >>> GEM_BUG_ON(!request->global_seqno);
> >>> engine->emit_breadcrumb(request,
> >>> request->ring->vaddr + request->postfix);
> >>>- engine->submit_request(request);
> >>>
> >>> spin_lock(&request->timeline->lock);
> >>> list_move_tail(&request->link, &timeline->requests);
> >>> spin_unlock(&request->timeline->lock);
> >>>
> >>> i915_sw_fence_commit(&request->execute);
> >>>+}
> >>>
> >>>- spin_unlock_irqrestore(&timeline->lock, flags);
> >>>+static int __i915_sw_fence_call
> >>>+submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
> >>>+{
> >>>+ if (state == FENCE_COMPLETE) {
> >>>+ struct drm_i915_gem_request *request =
> >>>+ container_of(fence, typeof(*request), submit);
> >>>+ struct intel_engine_cs *engine = request->engine;
> >>>+ unsigned long flags;
> >>>+
> >>>+ /* Will be called from irq-context when using foreign fences. */
> >>>+ spin_lock_irqsave_nested(&engine->timeline->lock, flags,
> >>>+ SINGLE_DEPTH_NESTING);
> >>>+ engine->submit_request(request);
> >>>+ spin_unlock_irqrestore(&engine->timeline->lock, flags);
> >
> >>Would it be cleaner to move the lock taking to engine->submit_request?
> >
> >Perhaps. Certainly pushes the ugliness down a layer!
> >
> >>And is _nested still required? I thought you said it is not. I can't
> >>find signalling under the timeline lock either.
> >
> >It is still required to sort out the ordering between external
> >lockclasses (vgem/nouveau/etc)
>
> Hm, how? I don't see it. :(
vgem was triggering a different path but it was the combination of
swfences causing the lockdep splat.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
More information about the Intel-gfx
mailing list