[PATCH] drm/xe/preempt_fence: enlarge the fence critical section

Matthew Brost matthew.brost at intel.com
Thu Apr 18 19:55:27 UTC 2024


On Thu, Apr 18, 2024 at 03:46:31PM +0100, Matthew Auld wrote:
> It is really easy to introduce subtle deadlocks in
> preempt_fence_work_func() since we operate on single global ordered-wq
> for signalling our preempt fences behind the scenes, so even though we
> signal a particular fence, everything in the callback should be in the
> fence critical section, since blocking in the callback will prevent
> other published fences from signalling. If we enlarge the fence critical
> section to cover the entire callback, then lockdep should be able to
> understand this better, and complain if we grab a sensitive lock like
> vm->lock, which is also held when waiting on preempt fences.
> 
> Signed-off-by: Matthew Auld <matthew.auld at intel.com>
> Cc: Matthew Brost <matthew.brost at intel.com>

Thanks for the patch. Assume lockdep complains if [1] is applied?

With that:
Reviewed-by: Matthew Brost <matthew.brost at intel.com>

[1] https://patchwork.freedesktop.org/series/132571/

> ---
>  drivers/gpu/drm/xe/xe_preempt_fence.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_preempt_fence.c b/drivers/gpu/drm/xe/xe_preempt_fence.c
> index 7d50c6e89d8e..5b243b7feb59 100644
> --- a/drivers/gpu/drm/xe/xe_preempt_fence.c
> +++ b/drivers/gpu/drm/xe/xe_preempt_fence.c
> @@ -23,11 +23,19 @@ static void preempt_fence_work_func(struct work_struct *w)
>  		q->ops->suspend_wait(q);
>  
>  	dma_fence_signal(&pfence->base);
> -	dma_fence_end_signalling(cookie);
> -
> +	/*
> +	 * Opt for keep everything in the fence critical section. This looks really strange since we
> +	 * have just signalled the fence, however the preempt fences are all signalled via single
> +	 * global ordered-wq, therefore anything that happens in this callback can easily block
> +	 * progress on the entire wq, which itself may prevent other published preempt fences from
> +	 * ever signalling.  Therefore try to keep everything here in the callback in the fence
> +	 * critical section. For example if something below grabs a scary lock like vm->lock,
> +	 * lockdep should complain since we also hold that lock whilst waiting on preempt fences to
> +	 * complete.
> +	 */
>  	xe_vm_queue_rebind_worker(q->vm);
> -
>  	xe_exec_queue_put(q);
> +	dma_fence_end_signalling(cookie);
>  }
>  
>  static const char *
> -- 
> 2.44.0
> 


More information about the Intel-xe mailing list