<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body>
    <p><br>
    </p>
    <div class="moz-cite-prefix">On 2024-01-02 15:08, Felix Kuehling
      wrote:<br>
    </div>
    <blockquote type="cite" cite="mid:20240102200804.68030-1-felix.kuehling@amd.com">
      <pre class="moz-quote-pre" wrap="">======================================================
WARNING: possible circular locking dependency detected
6.5.0-kfd-fkuehlin #276 Not tainted
------------------------------------------------------
kworker/8:2/2676 is trying to acquire lock:
ffff9435aae95c88 ((work_completion)(&svm_bo->eviction_work)){+.+.}-{0:0}, at: __flush_work+0x52/0x550

but task is already holding lock:
ffff9435cd8e1720 (&svms->lock){+.+.}-{3:3}, at: svm_range_deferred_list_work+0xe8/0x340 [amdgpu]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&svms->lock){+.+.}-{3:3}:
       __mutex_lock+0x97/0xd30
       kfd_ioctl_alloc_memory_of_gpu+0x6d/0x3c0 [amdgpu]
       kfd_ioctl+0x1b2/0x5d0 [amdgpu]
       __x64_sys_ioctl+0x86/0xc0
       do_syscall_64+0x39/0x80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd

-> #1 (&mm->mmap_lock){++++}-{3:3}:
       down_read+0x42/0x160
       svm_range_evict_svm_bo_worker+0x8b/0x340 [amdgpu]
       process_one_work+0x27a/0x540
       worker_thread+0x53/0x3e0
       kthread+0xeb/0x120
       ret_from_fork+0x31/0x50
       ret_from_fork_asm+0x11/0x20

-> #0 ((work_completion)(&svm_bo->eviction_work)){+.+.}-{0:0}:
       __lock_acquire+0x1426/0x2200
       lock_acquire+0xc1/0x2b0
       __flush_work+0x80/0x550
       __cancel_work_timer+0x109/0x190
       svm_range_bo_release+0xdc/0x1c0 [amdgpu]
       svm_range_free+0x175/0x180 [amdgpu]
       svm_range_deferred_list_work+0x15d/0x340 [amdgpu]
       process_one_work+0x27a/0x540
       worker_thread+0x53/0x3e0
       kthread+0xeb/0x120
       ret_from_fork+0x31/0x50
       ret_from_fork_asm+0x11/0x20

other info that might help us debug this:

Chain exists of:
  (work_completion)(&svm_bo->eviction_work) --> &mm->mmap_lock --> &svms->lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&svms->lock);
                               lock(&mm->mmap_lock);
                               lock(&svms->lock);
  lock((work_completion)(&svm_bo->eviction_work));

I believe this cannot really lead to a deadlock in practice, because
svm_range_evict_svm_bo_worker only takes the mmap_read_lock if the BO
refcount is non-0. That means it's impossible that svm_range_bo_release
is running concurrently. However, there is no good way to annotate this.

To avoid the problem, take a BO reference in
svm_range_schedule_evict_svm_bo instead of in the worker. That way it's
impossible for a BO to get freed while eviction work is pending and the
cancel_work_sync call in svm_range_bo_release can be eliminated.

v2: Use svm_bo_ref_unless_zero and explained why that's safe. Also
removed redundant checks that are already done in
amdkfd_fence_enable_signaling.

Signed-off-by: Felix Kuehling <a class="moz-txt-link-rfc2396E" href="mailto:felix.kuehling@amd.com"><felix.kuehling@amd.com></a></pre>
    </blockquote>
    Reviewed-by: Philip Yang <a class="moz-txt-link-rfc2396E" href="mailto:philip.yang@amd.com"><philip.yang@amd.com></a><br>
    <blockquote type="cite" cite="mid:20240102200804.68030-1-felix.kuehling@amd.com">
      <pre class="moz-quote-pre" wrap="">
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 26 ++++++++++----------------
 1 file changed, 10 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index afd98aace065..6b314e4d3ae0 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -404,14 +404,9 @@ static void svm_range_bo_release(struct kref *kref)
                spin_lock(&svm_bo->list_lock);
        }
        spin_unlock(&svm_bo->list_lock);
-       if (!dma_fence_is_signaled(&svm_bo->eviction_fence->base)) {
-               /* We're not in the eviction worker.
-                * Signal the fence and synchronize with any
-                * pending eviction work.
-                */
+       if (!dma_fence_is_signaled(&svm_bo->eviction_fence->base))
+               /* We're not in the eviction worker. Signal the fence. */
                dma_fence_signal(&svm_bo->eviction_fence->base);
-               cancel_work_sync(&svm_bo->eviction_work);
-       }
        dma_fence_put(&svm_bo->eviction_fence->base);
        amdgpu_bo_unref(&svm_bo->bo);
        kfree(svm_bo);
@@ -3437,13 +3432,14 @@ svm_range_trigger_migration(struct mm_struct *mm, struct svm_range *prange,
 
 int svm_range_schedule_evict_svm_bo(struct amdgpu_amdkfd_fence *fence)
 {
-       if (!fence)
-               return -EINVAL;
-
-       if (dma_fence_is_signaled(&fence->base))
-               return 0;
-
-       if (fence->svm_bo) {
+       /* Dereferencing fence->svm_bo is safe here because the fence hasn't
+        * signaled yet and we're under the protection of the fence->lock.
+        * After the fence is signaled in svm_range_bo_release, we cannot get
+        * here any more.
+        *
+        * Reference is dropped in svm_range_evict_svm_bo_worker.
+        */
+       if (svm_bo_ref_unless_zero(fence->svm_bo)) {
                WRITE_ONCE(fence->svm_bo->evicting, 1);
                schedule_work(&fence->svm_bo->eviction_work);
        }
@@ -3458,8 +3454,6 @@ static void svm_range_evict_svm_bo_worker(struct work_struct *work)
        int r = 0;
 
        svm_bo = container_of(work, struct svm_range_bo, eviction_work);
-       if (!svm_bo_ref_unless_zero(svm_bo))
-               return; /* svm_bo was freed while eviction was pending */
 
        if (mmget_not_zero(svm_bo->eviction_fence->mm)) {
                mm = svm_bo->eviction_fence->mm;
</pre>
    </blockquote>
  </body>
</html>