[Intel-gfx] [PATCH 02/12] drm/i915: Defer transfer onto execution timeline to actual hw submission
Chris Wilson
chris at chris-wilson.co.uk
Thu Nov 3 10:51:14 UTC 2016
On Thu, Nov 03, 2016 at 10:34:26AM +0000, Tvrtko Ursulin wrote:
>
> On 02/11/2016 17:50, Chris Wilson wrote:
> >Defer the transfer from the client's timeline onto the execution
> >timeline from the point of readiness to the point of actual submission.
> >For example, in execlists, a request is finally submitted to hardware
> >when the hardware is ready, and only put onto the hardware queue when
> >the request is ready. By deferring the transfer, we ensure that the
> >timeline is maintained in retirement order if we decide to queue the
> >requests onto the hardware in a different order than fifo.
> >
> >Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >---
> > drivers/gpu/drm/i915/i915_gem_request.c | 36 ++++++++++++++++++------------
> > drivers/gpu/drm/i915/i915_gem_request.h | 2 ++
> > drivers/gpu/drm/i915/i915_guc_submission.c | 2 ++
> > drivers/gpu/drm/i915/intel_lrc.c | 5 +++++
> > drivers/gpu/drm/i915/intel_ringbuffer.c | 2 ++
> > 5 files changed, 33 insertions(+), 14 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> >index 1ae5a2f8953f..05544dec5de9 100644
> >--- a/drivers/gpu/drm/i915/i915_gem_request.c
> >+++ b/drivers/gpu/drm/i915/i915_gem_request.c
> >@@ -307,25 +307,16 @@ static u32 timeline_get_seqno(struct i915_gem_timeline *tl)
> > return atomic_inc_return(&tl->next_seqno);
> > }
> >
> >-static int __i915_sw_fence_call
> >-submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
> >+void __i915_gem_request_submit(struct drm_i915_gem_request *request)
> > {
> >- struct drm_i915_gem_request *request =
> >- container_of(fence, typeof(*request), submit);
> > struct intel_engine_cs *engine = request->engine;
> > struct intel_timeline *timeline;
> >- unsigned long flags;
> > u32 seqno;
> >
> >- if (state != FENCE_COMPLETE)
> >- return NOTIFY_DONE;
> >-
> > /* Transfer from per-context onto the global per-engine timeline */
> > timeline = engine->timeline;
> > GEM_BUG_ON(timeline == request->timeline);
> >-
> >- /* Will be called from irq-context when using foreign DMA fences */
> >- spin_lock_irqsave(&timeline->lock, flags);
> >+ assert_spin_locked(&timeline->lock);
> >
> > seqno = timeline_get_seqno(timeline->common);
> > GEM_BUG_ON(!seqno);
> >@@ -345,15 +336,32 @@ submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
> > GEM_BUG_ON(!request->global_seqno);
> > engine->emit_breadcrumb(request,
> > request->ring->vaddr + request->postfix);
> >- engine->submit_request(request);
> >
> >- spin_lock_nested(&request->timeline->lock, SINGLE_DEPTH_NESTING);
> >+ /* nested from submit_notify */
> >+ spin_lock_nested(&request->timeline->lock, 2);
>
> Is this because lockdep does not differentiate between
> request->timeline->lock and engine->timeline->lock?
Yes. Worse, the 2 comes from there being different paths to this point,
each with its own nesting pattern.
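To spell out why a bare subclass number is needed there (an illustration, not code from the patch): lockdep identifies locks by class, not by instance, so request->timeline->lock and engine->timeline->lock look identical to it, and taking the second while holding the first reads as recursive locking unless the inner acquisition is annotated with a distinct subclass:

```c
/* Illustration only: both locks share one lock class, so lockdep
 * would flag taking the second as a self-deadlock. spin_lock_nested()
 * assigns the inner acquisition a different subclass; the subclass
 * must be unique per nesting depth on every path, which is where the
 * magic 2 comes from (another path already uses SINGLE_DEPTH_NESTING).
 */
spin_lock(&engine->timeline->lock);		/* subclass 0 */
spin_lock_nested(&request->timeline->lock, 2);	/* distinct subclass */
/* ... transfer the request between timelines ... */
spin_unlock(&request->timeline->lock);
spin_unlock(&engine->timeline->lock);
```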
	struct timeline {
		struct lock_class_key lockdep;
		struct mutex mutex;
	};

	__mutex_init(&timeline->mutex, "timeline", &timeline->lockdep);

? (Hopefully that works w/o lockdep enabled)
Better, then, would be

	i915_gem_timeline_create()
	i915_gem_timeline_create_global()

and just use one lock class for all user timelines, and a lock class per
engine on the global timeline.
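That split might look roughly like this (a hedged sketch: the two-constructor idea is from the proposal above, but the helper names, the use of lockdep_set_class(), and the per-engine key field are assumptions, not the eventual patch):

```c
/* Sketch only: one shared class for every user (per-context) timeline,
 * and a distinct class per engine for the global timeline, so lockdep
 * can tell the two sides of the transfer apart without magic subclass
 * numbers.
 */
static struct lock_class_key user_timeline_class;

static void init_user_timeline_lock(struct intel_timeline *tl)
{
	spin_lock_init(&tl->lock);
	lockdep_set_class(&tl->lock, &user_timeline_class);
}

static void init_global_timeline_lock(struct intel_timeline *tl,
				      struct intel_engine_cs *engine)
{
	spin_lock_init(&tl->lock);
	/* per-engine key, assumed embedded in the engine struct */
	lockdep_set_class(&tl->lock, &engine->timeline_lock_class);
}
```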
Will try.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre