[PATCH] drm/xe/preempt_fence: enlarge the fence critical section
Matthew Brost
matthew.brost at intel.com
Thu Apr 18 19:55:27 UTC 2024
On Thu, Apr 18, 2024 at 03:46:31PM +0100, Matthew Auld wrote:
> It is really easy to introduce subtle deadlocks in
> preempt_fence_work_func() since we operate on single global ordered-wq
> for signalling our preempt fences behind the scenes, so even though we
> signal a particular fence, everything in the callback should be in the
> fence critical section, since blocking in the callback will prevent
> other published fences from signalling. If we enlarge the fence critical
> section to cover the entire callback, then lockdep should be able to
> understand this better, and complain if we grab a sensitive lock like
> vm->lock, which is also held when waiting on preempt fences.
>
> Signed-off-by: Matthew Auld <matthew.auld at intel.com>
> Cc: Matthew Brost <matthew.brost at intel.com>
Thanks for the patch. Assume lockdep complains if [1] is applied?
With that:
Reviewed-by: Matthew Brost <matthew.brost at intel.com>
[1] https://patchwork.freedesktop.org/series/132571/
> ---
> drivers/gpu/drm/xe/xe_preempt_fence.c | 14 +++++++++++---
> 1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_preempt_fence.c b/drivers/gpu/drm/xe/xe_preempt_fence.c
> index 7d50c6e89d8e..5b243b7feb59 100644
> --- a/drivers/gpu/drm/xe/xe_preempt_fence.c
> +++ b/drivers/gpu/drm/xe/xe_preempt_fence.c
> @@ -23,11 +23,19 @@ static void preempt_fence_work_func(struct work_struct *w)
> q->ops->suspend_wait(q);
>
> dma_fence_signal(&pfence->base);
> - dma_fence_end_signalling(cookie);
> -
> + /*
> + * Opt for keep everything in the fence critical section. This looks really strange since we
> + * have just signalled the fence, however the preempt fences are all signalled via single
> + * global ordered-wq, therefore anything that happens in this callback can easily block
> + * progress on the entire wq, which itself may prevent other published preempt fences from
> + * ever signalling. Therefore try to keep everything here in the callback in the fence
> + * critical section. For example if something below grabs a scary lock like vm->lock,
> + * lockdep should complain since we also hold that lock whilst waiting on preempt fences to
> + * complete.
> + */
> xe_vm_queue_rebind_worker(q->vm);
> -
> xe_exec_queue_put(q);
> + dma_fence_end_signalling(cookie);
> }
>
> static const char *
> --
> 2.44.0
>
More information about the Intel-xe
mailing list