[Intel-gfx] [PATCH v2 03/11] drm/i915: Defer transfer onto execution timeline to actual hw submission
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Thu Nov 10 11:51:27 UTC 2016
On 10/11/2016 11:11, Chris Wilson wrote:
> On Thu, Nov 10, 2016 at 10:43:29AM +0000, Tvrtko Ursulin wrote:
>>
>> On 07/11/2016 13:59, Chris Wilson wrote:
>>> Defer the transfer from the client's timeline onto the execution
>>> timeline from the point of readiness to the point of actual submission.
>>> For example, in execlists, a request is finally submitted to hardware
>>> when the hardware is ready, and only put onto the hardware queue when
>>> the request is ready. By deferring the transfer, we ensure that the
>>> timeline is maintained in retirement order if we decide to queue the
>>> requests onto the hardware in a different order than fifo.
>>>
>>> v2: Rebased onto distinct global/user timeline lock classes.
>>>
>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> ---
>>> drivers/gpu/drm/i915/i915_gem_request.c | 31 +++++++++++++++++-------------
>>> drivers/gpu/drm/i915/i915_gem_request.h | 2 ++
>>> drivers/gpu/drm/i915/i915_guc_submission.c | 14 +++++++++++++-
>>> drivers/gpu/drm/i915/intel_lrc.c | 23 +++++++++++++---------
>>> drivers/gpu/drm/i915/intel_ringbuffer.c | 2 ++
>>> 5 files changed, 49 insertions(+), 23 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
>>> index e41d51a68ed8..19c29fafb07a 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_request.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
>>> @@ -307,25 +307,16 @@ static u32 timeline_get_seqno(struct i915_gem_timeline *tl)
>>> return atomic_inc_return(&tl->next_seqno);
>>> }
>>>
>>> -static int __i915_sw_fence_call
>>> -submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
>>> +void __i915_gem_request_submit(struct drm_i915_gem_request *request)
>>> {
>>> - struct drm_i915_gem_request *request =
>>> - container_of(fence, typeof(*request), submit);
>>> struct intel_engine_cs *engine = request->engine;
>>> struct intel_timeline *timeline;
>>> - unsigned long flags;
>>> u32 seqno;
>>>
>>> - if (state != FENCE_COMPLETE)
>>> - return NOTIFY_DONE;
>>> -
>>> /* Transfer from per-context onto the global per-engine timeline */
>>> timeline = engine->timeline;
>>> GEM_BUG_ON(timeline == request->timeline);
>>> -
>>> - /* Will be called from irq-context when using foreign DMA fences */
>>> - spin_lock_irqsave(&timeline->lock, flags);
>>> + assert_spin_locked(&timeline->lock);
>>>
>>> seqno = timeline_get_seqno(timeline->common);
>>> GEM_BUG_ON(!seqno);
>>> @@ -345,15 +336,29 @@ submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
>>> GEM_BUG_ON(!request->global_seqno);
>>> engine->emit_breadcrumb(request,
>>> request->ring->vaddr + request->postfix);
>>> - engine->submit_request(request);
>>>
>>> spin_lock(&request->timeline->lock);
>>> list_move_tail(&request->link, &timeline->requests);
>>> spin_unlock(&request->timeline->lock);
>>>
>>> i915_sw_fence_commit(&request->execute);
>>> +}
>>>
>>> - spin_unlock_irqrestore(&timeline->lock, flags);
>>> +static int __i915_sw_fence_call
>>> +submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
>>> +{
>>> + if (state == FENCE_COMPLETE) {
>>> + struct drm_i915_gem_request *request =
>>> + container_of(fence, typeof(*request), submit);
>>> + struct intel_engine_cs *engine = request->engine;
>>> + unsigned long flags;
>>> +
>>> + /* Will be called from irq-context when using foreign fences. */
>>> + spin_lock_irqsave_nested(&engine->timeline->lock, flags,
>>> + SINGLE_DEPTH_NESTING);
>>> + engine->submit_request(request);
>>> + spin_unlock_irqrestore(&engine->timeline->lock, flags);
>
>> Would it be cleaner to move the lock taking to engine->submit_request?
>
> Perhaps. Certainly pushes the ugliness down a layer!
>
>> And is _nested still required? I thought you said it is not. I can't
>> find signalling under the timeline lock either.
>
> It is still required to sort out the ordering between external
> lockclasses (vgem/nouveau/etc)
Hm, how? I don't see it. :(
Regards,
Tvrtko
More information about the Intel-gfx
mailing list