[Intel-gfx] [PATCH v3] drm/i915: Lock signaler timeline while navigating
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Wed Sep 18 13:38:06 UTC 2019
On 17/09/2019 16:39, Chris Wilson wrote:
> As we need to take a walk back along the signaler timeline to find the
> fence before upon which we want to wait, we need to lock that timeline
> to prevent it being modified as we walk. Similarly, we also need to
> acquire a reference to the earlier fence while it still exists!
>
> Though we lack the correct locking today, we are saved by the
> overarching struct_mutex -- but that protection is being removed.
>
> v2: Tvrtko made me realise I was being lax and using annotations to
> ignore the AB-BA deadlock from the timeline overlap. As it would be
> possible to construct a second request that was using a semaphore from the
> same timeline as ourselves, we could quite easily end up in a situation
> where we deadlocked in our mutex waits. Avoid that by using a trylock
> and falling back to a normal dma-fence await if contended.
I did not figure out the exact AB-BA, but even on a more basic level
without the deadlock, using trylock would mean false positives ie.
falling back to software signaling with random mutex contention on the
same timeline. From a performance perspective this sounds not end of the
world, just unfortunate, but from the design perspective it has me
running away scared.
I guess the AB-BA would be interdependent requests from two timelines
where the direction of dependency switches over across two pairs of
submissions.
>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
> ---
> drivers/gpu/drm/i915/i915_request.c | 56 +++++++++++++++++++----------
> 1 file changed, 37 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index f12358150097..4e861673fe5c 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -767,16 +767,35 @@ i915_request_create(struct intel_context *ce)
> static int
> i915_request_await_start(struct i915_request *rq, struct i915_request *signal)
> {
> - if (list_is_first(&signal->link, &signal->timeline->requests))
> + struct intel_timeline *tl = signal->timeline;
> + struct dma_fence *fence;
> + int err;
> +
> + lockdep_assert_held(&rq->timeline->mutex);
> + GEM_BUG_ON(rq->timeline == signal->timeline);
> +
> + if (list_is_first(&signal->link, &tl->requests))
> return 0;
>
> - signal = list_prev_entry(signal, link);
> - if (intel_timeline_sync_is_later(rq->timeline, &signal->fence))
> + if (!mutex_trylock(&tl->mutex))
> + return -EBUSY;
> +
> + fence = NULL;
> + if (!list_is_first(&signal->link, &tl->requests))
> + fence = dma_fence_get(&list_prev_entry(signal, link)->fence);
> +
> + mutex_unlock(&tl->mutex);
> + if (!fence)
> return 0;
>
> - return i915_sw_fence_await_dma_fence(&rq->submit,
> - &signal->fence, 0,
> - I915_FENCE_GFP);
> + err = 0;
> + if (!intel_timeline_sync_is_later(rq->timeline, fence))
> + err = i915_sw_fence_await_dma_fence(&rq->submit,
> + fence, 0,
> + I915_FENCE_GFP);
> + dma_fence_put(fence);
> +
> + return err;
> }
>
> static intel_engine_mask_t
> @@ -804,30 +823,24 @@ emit_semaphore_wait(struct i915_request *to,
> {
> u32 hwsp_offset;
> u32 *cs;
> - int err;
>
> GEM_BUG_ON(!from->timeline->has_initial_breadcrumb);
> GEM_BUG_ON(INTEL_GEN(to->i915) < 8);
>
> /* Just emit the first semaphore we see as request space is limited. */
> if (already_busywaiting(to) & from->engine->mask)
> - return i915_sw_fence_await_dma_fence(&to->submit,
> - &from->fence, 0,
> - I915_FENCE_GFP);
> + goto await_fence;
>
> - err = i915_request_await_start(to, from);
> - if (err < 0)
> - return err;
> + if (i915_request_await_start(to, from) < 0)
> + goto await_fence;
Does this need to be explicitly only on -EBUSY? Otherwise if
i915_sw_fence_await_dma_fence fails in i915_request_await_start code
jump to do the same i915_sw_fence_await_dma_fence.
>
> /* Only submit our spinner after the signaler is running! */
> - err = __i915_request_await_execution(to, from, NULL, gfp);
> - if (err)
> - return err;
> + if (__i915_request_await_execution(to, from, NULL, gfp))
> + goto await_fence;
>
> /* We need to pin the signaler's HWSP until we are finished reading. */
> - err = intel_timeline_read_hwsp(from, to, &hwsp_offset);
> - if (err)
> - return err;
> + if (intel_timeline_read_hwsp(from, to, &hwsp_offset))
> + goto await_fence;
>
> cs = intel_ring_begin(to, 4);
> if (IS_ERR(cs))
> @@ -853,6 +866,11 @@ emit_semaphore_wait(struct i915_request *to,
> to->sched.semaphores |= from->engine->mask;
> to->sched.flags |= I915_SCHED_HAS_SEMAPHORE_CHAIN;
> return 0;
> +
> +await_fence:
> + return i915_sw_fence_await_dma_fence(&to->submit,
> + &from->fence, 0,
> + I915_FENCE_GFP);
> }
>
> static int
>
Regards,
Tvrtko
More information about the Intel-gfx
mailing list