<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<div class="moz-cite-prefix">On 2023-12-11 10:56, Felix Kuehling
wrote:<br>
</div>
<blockquote type="cite" cite="mid:7b67d4f1-cfcd-47a9-a80e-f4c1eee235a1@amd.com">
<br>
On 2023-12-08 05:11, Christian König wrote:
<br>
<blockquote type="cite">Am 07.12.23 um 20:14 schrieb Felix
Kuehling:
<br>
<blockquote type="cite">
<br>
On 2023-12-05 17:20, Felix Kuehling wrote:
<br>
<blockquote type="cite">Properly mark kfd_process->ef as
__rcu and consistently access it with
<br>
rcu_dereference_protected.
<br>
<br>
Reported-by: kernel test robot <a class="moz-txt-link-rfc2396E" href="mailto:lkp@intel.com"><lkp@intel.com></a>
<br>
Closes:
<a class="moz-txt-link-freetext" href="https://lore.kernel.org/oe-kbuild-all/202312052245.yFpBSgNH-lkp@intel.com/">https://lore.kernel.org/oe-kbuild-all/202312052245.yFpBSgNH-lkp@intel.com/</a><br>
Signed-off-by: Felix Kuehling <a class="moz-txt-link-rfc2396E" href="mailto:Felix.Kuehling@amd.com"><Felix.Kuehling@amd.com></a>
<br>
</blockquote>
<br>
ping.
<br>
<br>
Christian, would you review this patch, please?
<br>
</blockquote>
<br>
Looks a bit suspicious, especially the
rcu_dereference_protected() use.
<br>
<br>
What is the static checker complaining about in the first place?
<br>
</blockquote>
From
<a class="moz-txt-link-freetext" href="https://lore.kernel.org/oe-kbuild-all/202312052245.yFpBSgNH-lkp@intel.com/">https://lore.kernel.org/oe-kbuild-all/202312052245.yFpBSgNH-lkp@intel.com/</a>:<br>
<br>
<blockquote type="cite">
<blockquote type="cite">drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c:1671:9:
sparse: sparse: incompatible types in comparison expression
(different address spaces): >>
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c:1671:9:
sparse: </blockquote>
</blockquote>
struct dma_fence [noderef] __rcu * >>
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c:1671:9: sparse:
struct dma_fence * ... >>
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c:2765:36: sparse:
sparse: incompatible types in comparison expression (different
address spaces): >>
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c:2765:36: sparse:
struct dma_fence [noderef] __rcu * >>
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c:2765:36: sparse:
struct dma_fence * >>
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c:2765:36: sparse:
sparse: incompatible types in comparison expression (different
address spaces): >>
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c:2765:36: sparse:
struct dma_fence [noderef] __rcu * >>
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c:2765:36: sparse:
struct dma_fence *
<br>
<br>
As far as I can tell, the reason is, that I'm using
dma_fence_get_rcu_safe and rcu_replace_pointer to get and update
kfd_process->ef, without annotating the fence pointers with
__rcu. This patch fixes that by marking kfd_process->ef as an
__rcu pointer. The only place that was dereferencing it directly
was kfd_process_wq_release, where I added
rcu_dereference_protected. The condition I'm using here is, that
the process ref count is 0 at that point, which means nobody else
is referencing the process or this fence pointer at the time.
<br>
</blockquote>
<p>Hi Christian,</p>
<p>We discussed offline that you think rcu_dereference_protected is
not needed in the teardown function. After reading over
rcupdate.h, I think a simpler alternative would be to use
rcu_access_pointer, after a grace period to ensure there can be no
more readers. Based on this comment in rcupdate.h:</p>
<pre> * It is also permissible to use rcu_access_pointer() when read-side
* access to the pointer was removed at least one grace period ago, as is
* the case in the context of the RCU callback that is freeing up the data,
* or after a synchronize_rcu() returns. This can be useful when tearing
* down multi-linked structures after a grace period has elapsed. However,
* rcu_dereference_protected() is normally preferred for this use case.
</pre>
<p>The last sentence sounds like rcu_dereference_protected should
also be OK, though. Either way, it sounds like I need to add a
synchronize_rcu call in any case, before freeing the fence. Do you
agree with this proposal?</p>
<p>Regards,<br>
Felix<br>
</p>
<p><br>
</p>
<blockquote type="cite" cite="mid:7b67d4f1-cfcd-47a9-a80e-f4c1eee235a1@amd.com">
<br>
Regards,
<br>
Felix
<br>
<br>
<br>
<blockquote type="cite">
<br>
Regards,
<br>
Christian.
<br>
<br>
<blockquote type="cite">
<br>
Thanks,
<br>
Felix
<br>
<br>
<br>
<br>
<blockquote type="cite">---
<br>
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 2 +-
<br>
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 4 ++--
<br>
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 2 +-
<br>
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 6
++++--
<br>
4 files changed, 8 insertions(+), 6 deletions(-)
<br>
<br>
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
<br>
index f2e920734c98..20cb266dcedd 100644
<br>
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
<br>
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
<br>
@@ -314,7 +314,7 @@ void
amdgpu_amdkfd_gpuvm_unmap_gtt_bo_from_kernel(struct kgd_mem
*mem);
<br>
int amdgpu_amdkfd_map_gtt_bo_to_gart(struct amdgpu_device
*adev, struct amdgpu_bo *bo);
<br>
int amdgpu_amdkfd_gpuvm_restore_process_bos(void
*process_info,
<br>
- struct dma_fence **ef);
<br>
+ struct dma_fence __rcu **ef);
<br>
int amdgpu_amdkfd_gpuvm_get_vm_fault_info(struct
amdgpu_device *adev,
<br>
struct kfd_vm_fault_info *info);
<br>
int amdgpu_amdkfd_gpuvm_import_dmabuf_fd(struct
amdgpu_device *adev, int fd,
<br>
diff --git
a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
<br>
index 7d91f99acb59..8ba6f6c8363d 100644
<br>
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
<br>
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
<br>
@@ -2806,7 +2806,7 @@ static void
amdgpu_amdkfd_restore_userptr_worker(struct work_struct
*work)
<br>
put_task_struct(usertask);
<br>
}
<br>
-static void replace_eviction_fence(struct dma_fence **ef,
<br>
+static void replace_eviction_fence(struct dma_fence __rcu
**ef,
<br>
struct dma_fence *new_ef)
<br>
{
<br>
struct dma_fence *old_ef = rcu_replace_pointer(*ef,
new_ef, true
<br>
@@ -2841,7 +2841,7 @@ static void
replace_eviction_fence(struct dma_fence **ef,
<br>
* 7. Add fence to all PD and PT BOs.
<br>
* 8. Unreserve all BOs
<br>
*/
<br>
-int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info,
struct dma_fence **ef)
<br>
+int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info,
struct dma_fence __rcu **ef)
<br>
{
<br>
struct amdkfd_process_info *process_info = info;
<br>
struct amdgpu_vm *peer_vm;
<br>
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
<br>
index 45366b4ca976..5a24097a9f28 100644
<br>
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
<br>
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
<br>
@@ -917,7 +917,7 @@ struct kfd_process {
<br>
* fence will be triggered during eviction and new one
will be created
<br>
* during restore
<br>
*/
<br>
- struct dma_fence *ef;
<br>
+ struct dma_fence __rcu *ef;
<br>
/* Work items for evicting and restoring BOs */
<br>
struct delayed_work eviction_work;
<br>
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
<br>
index 71df51fcc1b0..14b11d61f8dd 100644
<br>
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
<br>
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
<br>
@@ -1110,6 +1110,8 @@ static void
kfd_process_wq_release(struct work_struct *work)
<br>
{
<br>
struct kfd_process *p = container_of(work, struct
kfd_process,
<br>
release_work);
<br>
+ struct dma_fence *ef =
rcu_dereference_protected(p->ef,
<br>
+ kref_read(&p->ref) == 0);
<br>
kfd_process_dequeue_from_all_devices(p);
<br>
pqm_uninit(&p->pqm);
<br>
@@ -1118,7 +1120,7 @@ static void
kfd_process_wq_release(struct work_struct *work)
<br>
* destroyed. This allows any BOs to be freed without
<br>
* triggering pointless evictions or waiting for
fences.
<br>
*/
<br>
- dma_fence_signal(p->ef);
<br>
+ dma_fence_signal(ef);
<br>
kfd_process_remove_sysfs(p);
<br>
@@ -1127,7 +1129,7 @@ static void
kfd_process_wq_release(struct work_struct *work)
<br>
svm_range_list_fini(p);
<br>
kfd_process_destroy_pdds(p);
<br>
- dma_fence_put(p->ef);
<br>
+ dma_fence_put(ef);
<br>
kfd_event_free_process(p);
<br>
</blockquote>
</blockquote>
<br>
</blockquote>
<br>
</blockquote>
</body>
</html>