[PATCH 1/1] drm/xe/xe_sync: avoid race during ufence signaling
Matthew Brost
matthew.brost at intel.com
Wed Aug 20 00:35:07 UTC 2025
On Tue, Aug 19, 2025 at 08:44:04PM +0200, Zbigniew Kempczyński wrote:
> During vm-bind ioctl ops execute fence may be signaled during the call.
> If vm-bind syncs to user-fence it creates a race because signaling
> happens in the worker. This means control may return from vm-bind
> ioctl and consecutive vm-bind operation to same vma (unmap) may happen
> on still not signaled user-fence. This finally ends with -EBUSY error
> because even if vma operations completed fence still exists but
> userspace was unblocked with copy_to_user() call.
>
> Instead of releasing user-fences in workqueue for already signaled
> ops put them synchronously in the same vm-bind ioctl call.
>
I'm not really following this explanation. I think the actual problem
is that in user_fence_worker() the copy to user is done before
WRITE_ONCE(ufence->signalled, 1). If these were reordered, I think
that would fix this problem.
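
A rough userspace model of that reordering, only to illustrate the intended ordering. It is not the driver code: struct model_ufence, model_worker_reordered() and the event log are hypothetical stand-ins, the copy to user is modelled as a plain store to a polled location, and the GCC/Clang __atomic builtins are an assumption of this sketch rather than what xe_sync.c uses.

```c
#include <stdint.h>

/* Hypothetical userspace stand-in for struct xe_user_fence. */
struct model_ufence {
	uint64_t *addr;   /* location that userspace polls */
	uint64_t value;   /* value that unblocks userspace */
	int signalled;    /* driver-side "fence is done" flag */
};

/* Event log so the order of the two steps can be inspected. */
enum model_event { EV_SIGNALLED = 1, EV_COPIED = 2 };
static enum model_event model_log[2];
static int model_log_len;

/* Models the worker body with the two steps swapped. */
static void model_worker_reordered(struct model_ufence *uf)
{
	/* Mark the fence as signalled first ... */
	__atomic_store_n(&uf->signalled, 1, __ATOMIC_RELEASE);
	model_log[model_log_len++] = EV_SIGNALLED;

	/*
	 * ... and only then publish the value that unblocks userspace
	 * (this store stands in for the copy to user).
	 */
	__atomic_store_n(uf->addr, uf->value, __ATOMIC_RELEASE);
	model_log[model_log_len++] = EV_COPIED;
}
```

With this ordering, by the time userspace can observe the value at addr, the signalled flag is already set, so a follow-up operation on the same vma should not see a fence that still looks pending. Whether the real worker needs additional barriers is not covered by this sketch.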
Matt
> Fixes: 977e5b82e090 ("drm/xe: Expose user fence from xe_sync_entry")
> Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5536
> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski at intel.com>
> Cc: Matthew Brost <matthew.brost at intel.com>
> Cc: Matthew Auld <matthew.auld at intel.com>
> ---
> drivers/gpu/drm/xe/xe_sync.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_sync.c b/drivers/gpu/drm/xe/xe_sync.c
> index f87276df18f2..8becc3755649 100644
> --- a/drivers/gpu/drm/xe/xe_sync.c
> +++ b/drivers/gpu/drm/xe/xe_sync.c
> @@ -103,6 +103,15 @@ static void kick_ufence(struct xe_user_fence *ufence, struct dma_fence *fence)
> dma_fence_put(fence);
> }
>
> +static void kick_ufence_sync(struct xe_user_fence *ufence, struct dma_fence *fence)
> +{
> + if (copy_to_user(ufence->addr, &ufence->value, sizeof(ufence->value)))
> + XE_WARN_ON("Copy to user failed");
> + WRITE_ONCE(ufence->signalled, 1);
> + user_fence_put(ufence);
> + dma_fence_put(fence);
> +}
> +
> static void user_fence_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
> {
> struct xe_user_fence *ufence = container_of(cb, struct xe_user_fence, cb);
> @@ -244,7 +253,7 @@ void xe_sync_entry_signal(struct xe_sync_entry *sync, struct dma_fence *fence)
> err = dma_fence_add_callback(fence, &sync->ufence->cb,
> user_fence_cb);
> if (err == -ENOENT) {
> - kick_ufence(sync->ufence, fence);
> + kick_ufence_sync(sync->ufence, fence);
> } else if (err) {
> XE_WARN_ON("failed to add user fence");
> user_fence_put(sync->ufence);
> --
> 2.43.0
>