[Intel-gfx] [PATCH v3] drm/i915: Lock signaler timeline while navigating
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Wed Sep 18 13:58:24 UTC 2019
On 18/09/2019 14:44, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-09-18 14:38:06)
>>
>> On 17/09/2019 16:39, Chris Wilson wrote:
>>> As we need to take a walk back along the signaler timeline to find the
>>> fence before upon which we want to wait, we need to lock that timeline
>>> to prevent it being modified as we walk. Similarly, we also need to
>>> acquire a reference to the earlier fence while it still exists!
>>>
>>> Though we lack the correct locking today, we are saved by the
>>> overarching struct_mutex -- but that protection is being removed.
>>>
>>> v2: Tvrtko made me realise I was being lax and using annotations to
>>> ignore the AB-BA deadlock from the timeline overlap. As it would be
>>> possible to construct a second request that was using a semaphore from the
>>> same timeline as ourselves, we could quite easily end up in a situation
>>> where we deadlocked in our mutex waits. Avoid that by using a trylock
>>> and falling back to a normal dma-fence await if contended.
>>
>> I did not figure out the exact AB-BA, but even on a more basic level
>> without the deadlock, using trylock would mean false positives ie.
>> falling back to software signaling with random mutex contention on the
>> same timeline. From a performance perspective this sounds not end of the
>> world, just unfortunate, but from the design perspective it has me
>> running away scared.
>>
>> I guess the AB-BA would be interdependent requests from two timelines
>> where the direction of dependency switches over across two pairs of
>> submissions.
>
> Exactly.
>
> Oh, you haven't seen the worst of it yet. This is a wonderful mess that
> just keeps on getting worse as you dig in.
>
>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
>>> ---
>>> drivers/gpu/drm/i915/i915_request.c | 56 +++++++++++++++++++----------
>>> 1 file changed, 37 insertions(+), 19 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>>> index f12358150097..4e861673fe5c 100644
>>> --- a/drivers/gpu/drm/i915/i915_request.c
>>> +++ b/drivers/gpu/drm/i915/i915_request.c
>>> @@ -767,16 +767,35 @@ i915_request_create(struct intel_context *ce)
>>> static int
>>> i915_request_await_start(struct i915_request *rq, struct i915_request *signal)
>>> {
>>> - if (list_is_first(&signal->link, &signal->timeline->requests))
>>> + struct intel_timeline *tl = signal->timeline;
>>> + struct dma_fence *fence;
>>> + int err;
>>> +
>>> + lockdep_assert_held(&rq->timeline->mutex);
>>> + GEM_BUG_ON(rq->timeline == signal->timeline);
>>> +
>>> + if (list_is_first(&signal->link, &tl->requests))
>>> return 0;
>>>
>>> - signal = list_prev_entry(signal, link);
>>> - if (intel_timeline_sync_is_later(rq->timeline, &signal->fence))
>>> + if (!mutex_trylock(&tl->mutex))
>>> + return -EBUSY;
>>> +
>>> + fence = NULL;
>>> + if (!list_is_first(&signal->link, &tl->requests))
>>> + fence = dma_fence_get(&list_prev_entry(signal, link)->fence);
>>> +
>>> + mutex_unlock(&tl->mutex);
>>> + if (!fence)
>>> return 0;
>>>
>>> - return i915_sw_fence_await_dma_fence(&rq->submit,
>>> - &signal->fence, 0,
>>> - I915_FENCE_GFP);
>>> + err = 0;
>>> + if (!intel_timeline_sync_is_later(rq->timeline, fence))
>>> + err = i915_sw_fence_await_dma_fence(&rq->submit,
>>> + fence, 0,
>>> + I915_FENCE_GFP);
>>> + dma_fence_put(fence);
>>> +
>>> + return err;
>>> }
>>>
>>> static intel_engine_mask_t
>>> @@ -804,30 +823,24 @@ emit_semaphore_wait(struct i915_request *to,
>>> {
>>> u32 hwsp_offset;
>>> u32 *cs;
>>> - int err;
>>>
>>> GEM_BUG_ON(!from->timeline->has_initial_breadcrumb);
>>> GEM_BUG_ON(INTEL_GEN(to->i915) < 8);
>>>
>>> /* Just emit the first semaphore we see as request space is limited. */
>>> if (already_busywaiting(to) & from->engine->mask)
>>> - return i915_sw_fence_await_dma_fence(&to->submit,
>>> - &from->fence, 0,
>>> - I915_FENCE_GFP);
>>> + goto await_fence;
>>>
>>> - err = i915_request_await_start(to, from);
>>> - if (err < 0)
>>> - return err;
>>> + if (i915_request_await_start(to, from) < 0)
>>> + goto await_fence;
>>
>> Does this need to be explicitly only on -EBUSY? Otherwise if
>> i915_sw_fence_await_dma_fence fails in i915_request_await_start code
>> jump to do the same i915_sw_fence_await_dma_fence.
>
> The only one that concerned me is ignoring any potential EINTR. All the
> other errors are transient and so trying again with the basic await is a
> valid response (imo). Not bailing out due to a pending signal though is a
> trade-off between our latency and their latency. To be honest, I like
> the simpler code where we just pretend we never noticed the signal
> unless we block again.
Okay, should be safe anyway.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
Regards,
Tvrtko
More information about the Intel-gfx
mailing list