[Intel-gfx] [PATCH 3/9] drm/i915: Prevent using semaphores to chain up to external fences
Mika Kuoppala
mika.kuoppala at linux.intel.com
Fri May 8 15:37:15 UTC 2020
Chris Wilson <chris at chris-wilson.co.uk> writes:
> The downside of using semaphores is that we lose metadata passing
> along the signaling chain. This is particularly nasty when we
> need to pass along a fatal error such as EFAULT or EDEADLK. For
> fatal errors we want to scrub the request before it is executed,
> which means that we cannot preload the request onto HW and have
> it wait upon a semaphore.
So b is waiting on a, a fails, and we want to release b with the error?
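
A minimal userspace sketch of that scenario, assuming a fence-style wait is handled on the CPU before b is submitted while a semaphore-style wait only observes that a has signaled. The toy_* types and functions are hypothetical names made up for illustration; they are not the i915 or dma-fence structures.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical toy types; not the i915 or dma-fence structures. */
struct toy_fence {
	int error;	/* 0, or a negative errno recorded before signaling */
	bool signaled;
};

struct toy_request {
	struct toy_fence fence;
	bool skipped;	/* payload scrubbed before execution */
};

/* Fence-style wait: called when 'a' signals, before 'b' is submitted. */
void toy_fence_notify(struct toy_request *b, const struct toy_request *a)
{
	if (a->fence.error) {
		b->fence.error = a->fence.error;
		b->skipped = true;
	}
}

/* Semaphore-style wait: the waiter only learns that 'a' has signaled. */
bool toy_semaphore_ready(const struct toy_request *a)
{
	return a->fence.signaled;
}

void toy_error_demo(void)
{
	struct toy_request a = { .fence = { .error = -EFAULT, .signaled = true } };
	struct toy_request b_fence = { .skipped = false };
	struct toy_request b_sema = { .skipped = false };

	/* Via the fence, b inherits the error and is scrubbed. */
	toy_fence_notify(&b_fence, &a);
	assert(b_fence.fence.error == -EFAULT && b_fence.skipped);

	/* Via the semaphore, b is released with no record of the failure. */
	assert(toy_semaphore_ready(&a));
	assert(b_sema.fence.error == 0 && !b_sema.skipped);
}
```

In this toy model the semaphore path has nowhere to carry the error, which is the metadata loss the commit message describes.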
>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> ---
> drivers/gpu/drm/i915/i915_request.c | 26 +++++++++++++++++++++
> drivers/gpu/drm/i915/i915_scheduler_types.h | 1 +
> 2 files changed, 27 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 94189c7d43cd..f0f9393e2ade 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -1002,6 +1002,15 @@ emit_semaphore_wait(struct i915_request *to,
>  	if (!rcu_access_pointer(from->hwsp_cacheline))
>  		goto await_fence;
>
> +	/*
> +	 * If this or its dependents are waiting on an external fence
> +	 * that may fail catastrophically, then we want to avoid using
> +	 * sempahores as they bypass the fence signaling metadata, and we
Typo: s/sempahores/semaphores/
-Mika
> +	 * lose the fence->error propagation.
> +	 */
> +	if (from->sched.flags & I915_SCHED_HAS_EXTERNAL_CHAIN)
> +		goto await_fence;
> +
>  	/* Just emit the first semaphore we see as request space is limited. */
>  	if (already_busywaiting(to) & mask)
>  		goto await_fence;
> @@ -1064,12 +1073,29 @@ i915_request_await_request(struct i915_request *to, struct i915_request *from)
>  			return ret;
>  	}
>
> +	if (from->sched.flags & I915_SCHED_HAS_EXTERNAL_CHAIN)
> +		to->sched.flags |= I915_SCHED_HAS_EXTERNAL_CHAIN;
> +
>  	return 0;
>  }
>
> +static void mark_external(struct i915_request *rq)
> +{
> +	/*
> +	 * The downside of using semaphores is that we lose metadata passing
> +	 * along the signaling chain. This is particularly nasty when we
> +	 * need to pass along a fatal error such as EFAULT or EDEADLK. For
> +	 * fatal errors we want to scrub the request before it is executed,
> +	 * which means that we cannot preload the request onto HW and have
> +	 * it wait upon a semaphore.
> +	 */
> +	rq->sched.flags |= I915_SCHED_HAS_EXTERNAL_CHAIN;
> +}
> +
>  static int
>  i915_request_await_external(struct i915_request *rq, struct dma_fence *fence)
>  {
> +	mark_external(rq);
>  	return i915_sw_fence_await_dma_fence(&rq->submit, fence,
>  					     fence->context ? I915_FENCE_TIMEOUT : 0,
>  					     I915_FENCE_GFP);
> diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
> index 7186875088a0..6ab2c5289bed 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler_types.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
> @@ -66,6 +66,7 @@ struct i915_sched_node {
>  	struct i915_sched_attr attr;
>  	unsigned int flags;
>  #define I915_SCHED_HAS_SEMAPHORE_CHAIN	BIT(0)
> +#define I915_SCHED_HAS_EXTERNAL_CHAIN	BIT(1)
>  	intel_engine_mask_t semaphores;
>  };
>
> --
> 2.20.1
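
For reference, a minimal userspace sketch of how the new flag appears to travel along a dependency chain in the patch above. It only mirrors the three hunks (mark, inherit, check); the toy_* names are hypothetical and nothing here is the i915 code itself.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_HAS_EXTERNAL_CHAIN (1u << 1)	/* mirrors BIT(1) in the patch */

/* Hypothetical stand-in for the scheduling flags of a request. */
struct toy_request {
	unsigned int flags;
};

/* Like mark_external(): rq waits directly on a foreign fence. */
void toy_await_external(struct toy_request *rq)
{
	rq->flags |= TOY_HAS_EXTERNAL_CHAIN;
}

/* Like the hunk in i915_request_await_request(): 'to' inherits the flag. */
void toy_await_request(struct toy_request *to, const struct toy_request *from)
{
	if (from->flags & TOY_HAS_EXTERNAL_CHAIN)
		to->flags |= TOY_HAS_EXTERNAL_CHAIN;
}

/* Like the check in emit_semaphore_wait(): may a waiter on 'from' busywait? */
bool toy_may_use_semaphore(const struct toy_request *from)
{
	return !(from->flags & TOY_HAS_EXTERNAL_CHAIN);
}

void toy_flag_demo(void)
{
	struct toy_request a = { 0 }, b = { 0 }, c = { 0 }, d = { 0 };

	toy_await_external(&a);		/* a waits on an external fence */
	toy_await_request(&b, &a);	/* b waits on a */
	toy_await_request(&c, &b);	/* c waits on b */

	/* Anything waiting on a, b or c must fall back to a fence wait. */
	assert(!toy_may_use_semaphore(&a));
	assert(!toy_may_use_semaphore(&b));
	assert(!toy_may_use_semaphore(&c));

	/* d has no external fence in its history, so a semaphore is still allowed. */
	assert(toy_may_use_semaphore(&d));
}
```

The flag is sticky in this model: once a request has an external fence anywhere upstream, every later request that awaits it inherits the restriction.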