[Intel-gfx] [PATCH] drm/i915: Handle Intel igfx + Intel dgfx hybrid graphics setup

Tue Oct 5 13:05:25 UTC 2021

Hi, Tvrtko,

On 10/5/21 13:31, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>
> In short this makes i915 work for hybrid setups (DRI_PRIME=1 with Mesa)
> when rendering is done on Intel dgfx and scanout/composition on Intel
> igfx.
>
> Before this patch the driver was not quite ready for that setup, mainly
> because it was able to emit a semaphore wait between the two GPUs, which
> results in deadlocks because semaphore target location in HWSP is neither
> shared between the two, nor mapped in both GGTT spaces.
>
> To fix it the patch adds an additional check to a couple of relevant code
> paths in order to prevent using semaphores for inter-engine
> synchronisation when relevant objects are not in the same GGTT space.
>
> v2:
>   * Avoid adding rq->i915. (Chris)
>
> v3:
>   * Use GGTT which describes the limit more precisely.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
> Cc: Matthew Auld <matthew.auld at intel.com>
> Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>

An IMO pretty important bugfix. I read up a bit on the previous 
discussion on this, and from what I understand the other two options were

1) Ripping out the semaphore code,
2) Consider dma-fences from other instances of the same driver as foreign.

For imported dma-bufs we do 2), but particularly with lmem and p2p 
that's a more straightforward decision.

I don't think 1) is a reasonable approach to fix this bug, (but perhaps 
as a general cleanup?), and for 2) yes I guess we might end up doing 
that, unless we find some real benefits in treating 
same-driver-separate-device dma-fences as local, but for this particular 
bug, IMO this is a reasonable fix.

So,

Reviewed-by: Thomas Hellström <thomas.hellstrom at linux.intel.com>

> ---
>   drivers/gpu/drm/i915/i915_request.c | 12 +++++++++++-
>   1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 79da5eca60af..4f189982f67e 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -1145,6 +1145,12 @@ __emit_semaphore_wait(struct i915_request *to,
>   	return 0;
>   }
>   
> +static bool
> +can_use_semaphore_wait(struct i915_request *to, struct i915_request *from)
> +{
> +	return to->engine->gt->ggtt == from->engine->gt->ggtt;
> +}
> +
>   static int
>   emit_semaphore_wait(struct i915_request *to,
>   		    struct i915_request *from,
> @@ -1153,6 +1159,9 @@ emit_semaphore_wait(struct i915_request *to,
>   	const intel_engine_mask_t mask = READ_ONCE(from->engine)->mask;
>   	struct i915_sw_fence *wait = &to->submit;
>   
> +	if (!can_use_semaphore_wait(to, from))
> +		goto await_fence;
> +
>   	if (!intel_context_use_semaphores(to->context))
>   		goto await_fence;
>   
> @@ -1256,7 +1265,8 @@ __i915_request_await_execution(struct i915_request *to,
>   	 * immediate execution, and so we must wait until it reaches the
>   	 * active slot.
>   	 */
> -	if (intel_engine_has_semaphores(to->engine) &&
> +	if (can_use_semaphore_wait(to, from) &&
> +	    intel_engine_has_semaphores(to->engine) &&
>   	    !i915_request_has_initial_breadcrumb(to)) {
>   		err = __emit_semaphore_wait(to, from, from->fence.seqno - 1);
>   		if (err < 0)