<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body>
    Am 22.04.25 um 15:09 schrieb Liang, Prike:<br>
    <blockquote type="cite" cite="mid:DS7PR12MB60054D6296012455E53B72E3FBBB2@DS7PR12MB6005.namprd12.prod.outlook.com"><span style="white-space: pre-wrap">
</span>
      <blockquote type="cite">
        <blockquote type="cite">
          <pre class="moz-quote-pre" wrap="">Stop, wait a second. That shouldn't happen at the first place. Why is the eviction
</pre>
        </blockquote>
        <pre class="moz-quote-pre" wrap="">fence considered a dependency for page table updates?
</pre>
        <blockquote type="cite">
          <pre class="moz-quote-pre" wrap="">
When it is added only as bookkeep then we should never consider that here.
</pre>
        </blockquote>
        <pre class="moz-quote-pre" wrap="">Looks like something in the sync obj is messed up.
It is like this. Here, amdgpu_sync_resv is using
DMA_RESV_USAGE_BOOKKEEP.

         int amdgpu_sync_resv() {

                 ..

                 /* TODO: Use DMA_RESV_USAGE_READ here */</pre>
      </blockquote>
    </blockquote>
    <br>
    That here is the core of the problem.<br>
    <br>
    I've added this TODO item 4 years ago to switch over to
    DMA_RESV_USAGE_READ here when moved all TTM use cases to using
    drm_sched_job_add_resv_dependencies().<br>
    <br>
    That was done, but this TODO here forgotten.<br>
    <br>
    <blockquote type="cite" cite="mid:DS7PR12MB60054D6296012455E53B72E3FBBB2@DS7PR12MB6005.namprd12.prod.outlook.com">
      <blockquote type="cite">
        <pre class="moz-quote-pre" wrap="">
                 dma_resv_for_each_fence(&cursor, resv,
DMA_RESV_USAGE_BOOKKEEP, f) {
                     dma_fence_chain_for_each(f, f) {

                 ..

}
during PT update amdgpu_vm_bo_update() is using sync to moving
fences(Eviction fence) before mapping anything. Because of this Eviction fence will
act as dependency.
</pre>
      </blockquote>
      <pre class="moz-quote-pre" wrap="">
Yes, since the amdgpu_sync_resv() uses the bookkeep usage, then all the BOs reservation fences along with the eviction fence will be returned and added to the sync.</pre>
    </blockquote>
    <br>
    Yeah, but that is incorrect.<br>
    <br>
    <blockquote type="cite" cite="mid:DS7PR12MB60054D6296012455E53B72E3FBBB2@DS7PR12MB6005.namprd12.prod.outlook.com">
      <pre class="moz-quote-pre" wrap="">And with the attached patch, the eviction fence can be released properly when the kq and uq are enabled.</pre>
    </blockquote>
    <br>
    We need to fix the underlying bug first before we can work on the
    next step.<br>
    <br>
    Regards,<br>
    Christian.<br>
    <br>
    <blockquote type="cite" cite="mid:DS7PR12MB60054D6296012455E53B72E3FBBB2@DS7PR12MB6005.namprd12.prod.outlook.com">
      <pre class="moz-quote-pre" wrap="">

Thanks,
Prike

</pre>
      <blockquote type="cite">
        <pre class="moz-quote-pre" wrap="">~arvind

</pre>
        <blockquote type="cite">
          <pre class="moz-quote-pre" wrap="">Regards,
Christian.

</pre>
          <blockquote type="cite">
            <pre class="moz-quote-pre" wrap="">, and then the eviction fence will be added as a dependent fence by
</pre>
          </blockquote>
        </blockquote>
        <pre class="moz-quote-pre" wrap="">propagating with amdgpu_sync_push_to_job(). With removing the eviction fence
from the VM sync at amdgpu_sync_resv(), then the eviction fence can be released
properly.
</pre>
        <blockquote type="cite">
          <blockquote type="cite">
            <pre class="moz-quote-pre" wrap="">
Thanks,
Prike
</pre>
            <blockquote type="cite">
              <pre class="moz-quote-pre" wrap="">Thanks,
Christian.

PS: Please stop calling the eviction fence evf.

</pre>
              <blockquote type="cite">
                <blockquote type="cite">
                  <blockquote type="cite">
                    <pre class="moz-quote-pre" wrap="">     return 0;

  free_err:
@@ -237,6 +234,30 @@ void amdgpu_eviction_fence_detach(struct
</pre>
                  </blockquote>
                  <pre class="moz-quote-pre" wrap="">amdgpu_eviction_fence_mgr *evf_mgr,
</pre>
                  <blockquote type="cite">
                    <pre class="moz-quote-pre" wrap="">     dma_fence_put(stub);
  }

+void amdgpu_userq_remove_all_eviction_fences(struct amdgpu_bo
+*bo)
</pre>
                  </blockquote>
                  <pre class="moz-quote-pre" wrap="">Please name that amdgpu_eviction_fence_remove_all().
</pre>
                </blockquote>
                <pre class="moz-quote-pre" wrap="">Noted.

</pre>
                <blockquote type="cite">
                  <pre class="moz-quote-pre" wrap="">Regards,
Christian.

</pre>
                  <blockquote type="cite">
                    <pre class="moz-quote-pre" wrap="">+{
+   struct dma_resv *resv = &bo->tbo.base._resv;
+   struct dma_fence *fence, *stub;
+   struct dma_resv_iter cursor;
+
+   dma_resv_assert_held(resv);
+
+   stub = dma_fence_get_stub();
+   dma_resv_for_each_fence(&cursor, resv,
</pre>
                  </blockquote>
                  <pre class="moz-quote-pre" wrap="">DMA_RESV_USAGE_BOOKKEEP, fence) {
</pre>
                  <blockquote type="cite">
                    <pre class="moz-quote-pre" wrap="">+           struct amdgpu_eviction_fence *ev_fence;
+
+           ev_fence = fence_to_evf(fence);
+           if (!ev_fence || !dma_fence_is_signaled(&ev_fence->base))
+                   continue;
+
+           dma_resv_replace_fences(resv, fence->context, stub,
+                           DMA_RESV_USAGE_BOOKKEEP);
+
+   }
+
+   dma_fence_put(stub);
+}
+
  int amdgpu_eviction_fence_init(struct amdgpu_eviction_fence_mgr
*evf_mgr)  {
     /* This needs to be done one time per open */ diff --git
a/drivers/gpu/drm/amd/amdgpu/amdgpu_eviction_fence.h
b/drivers/gpu/drm/amd/amdgpu/amdgpu_eviction_fence.h
index fcd867b7147d..da99ac322a2e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_eviction_fence.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eviction_fence.h
@@ -66,4 +66,5 @@ amdgpu_eviction_fence_signal(struct
amdgpu_eviction_fence_mgr *evf_mgr,  int
amdgpu_eviction_fence_replace_fence(struct
amdgpu_eviction_fence_mgr
</pre>
                  </blockquote>
                  <pre class="moz-quote-pre" wrap="">*evf_mgr,
</pre>
                  <blockquote type="cite">
                    <pre class="moz-quote-pre" wrap="">                                 struct drm_exec *exec);
+void amdgpu_userq_remove_all_eviction_fences(struct amdgpu_bo
+*bo);
  #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 1e73ce30d4d7..f001018a01eb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1392,6 +1392,7 @@ void amdgpu_bo_release_notify(struct
</pre>
                  </blockquote>
                  <pre class="moz-quote-pre" wrap="">ttm_buffer_object *bo)
</pre>
                  <blockquote type="cite">
                    <pre class="moz-quote-pre" wrap="">     amdgpu_vram_mgr_set_cleared(bo->resource);
     dma_resv_add_fence(&bo->base._resv, fence,
</pre>
                  </blockquote>
                  <pre class="moz-quote-pre" wrap="">DMA_RESV_USAGE_KERNEL);
</pre>
                  <blockquote type="cite">
                    <pre class="moz-quote-pre" wrap="">     dma_fence_put(fence);
+   amdgpu_userq_remove_all_eviction_fences(abo);

  out:
     dma_resv_unlock(&bo->base._resv);
</pre>
                  </blockquote>
                </blockquote>
              </blockquote>
            </blockquote>
          </blockquote>
        </blockquote>
      </blockquote>
    </blockquote>
    <br>
  </body>
</html>