<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<p><br>
</p>
<div class="moz-cite-prefix">On 12/11/2024 11:30 PM, Zhu Lingshan
wrote:<br>
</div>
<blockquote type="cite" cite="mid:40a93b94-dbb4-4d30-9ba8-0b0185e1fc1b@amd.com">
<pre wrap="" class="moz-quote-pre">On 12/12/2024 12:19 PM, Felix Kuehling wrote:
</pre>
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">
On 2024-12-11 22:06, Zhu Lingshan wrote:
</pre>
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">kfd_process_wq_release() signals eviction fence by
dma_fence_signal() which wanrs if dma_fence
is NULL.
</pre>
</blockquote>
<pre wrap="" class="moz-quote-pre">That's news to me. Looking at the dma_fence_signal implementation on amd-staging-drm-next, it just silently returns -EINVAL if the fence pointer is NULL. I see the same in Linux 6.12.4: <a class="moz-txt-link-freetext" href="https://elixir.bootlin.com/linux/v6.12.4/source/drivers/dma-buf/dma-fence.c#L467">https://elixir.bootlin.com/linux/v6.12.4/source/drivers/dma-buf/dma-fence.c#L467</a>
Which branch are you on?
</pre>
</blockquote>
<pre wrap="" class="moz-quote-pre">Linus tree, latest master branch, tag v6.13-rc2
<a class="moz-txt-link-freetext" href="https://github.com/torvalds/linux/blob/master/drivers/dma-buf/dma-fence.c#L467">https://github.com/torvalds/linux/blob/master/drivers/dma-buf/dma-fence.c#L467</a>
which is introduced by
<a class="moz-txt-link-freetext" href="https://github.com/torvalds/linux/commit/967d226eaae8e40636d257bf8ae55d2c5a912f58">https://github.com/torvalds/linux/commit/967d226eaae8e40636d257bf8ae55d2c5a912f58</a>
</pre>
</blockquote>
<p>It is new. I did not see it in the AMD kernel either.</p>
<p>Previously I wanted to put the following dma_fence_put(ef) together with
<span style="white-space: pre-wrap">dma_fence_signal(ef):</span></p>
<pre wrap="" class="moz-quote-pre">+ if (ef) {
+ dma_fence_signal(ef);
+ dma_fence_put(ef)
+ }
That seems neater.
Regards
Xiaogang
</pre>
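<p>For illustration only, a minimal userspace sketch of that combined guard.
The mock_fence type and the mock_fence_signal(), mock_fence_put() and
release_eviction_fence() helpers are hypothetical stand-ins, not the
kernel's dma_fence API:</p>
<pre wrap="">
```c
/* Hypothetical userspace stand-in for a fence; not the kernel's struct dma_fence. */
struct mock_fence {
	int signaled;	/* set to 1 once the fence has been signaled */
	int refcount;	/* number of references still held */
};

/* Mock of signaling: mark the fence as signaled. Caller passes a non-NULL fence. */
void mock_fence_signal(struct mock_fence *f)
{
	f->signaled = 1;
}

/* Mock of dropping one reference. Caller passes a non-NULL fence. */
void mock_fence_put(struct mock_fence *f)
{
	f->refcount--;
}

/*
 * Signal and release the fence only when one exists.
 * Returns 1 if a fence was handled, 0 if ef was NULL and nothing was done.
 */
int release_eviction_fence(struct mock_fence *ef)
{
	if (!ef)
		return 0;

	mock_fence_signal(ef);
	mock_fence_put(ef);
	return 1;
}
```
</pre>
<p>In this sketch, calling release_eviction_fence() with a NULL pointer is a
no-op, which corresponds to closing a process before any fence was set up.</p>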
<p></p>
<blockquote type="cite" cite="mid:40a93b94-dbb4-4d30-9ba8-0b0185e1fc1b@amd.com">
<pre wrap="" class="moz-quote-pre">
Thanks
Lingshan
</pre>
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">
Regards,
Felix
</pre>
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">kfd_process->ef is initialized by kfd_process_device_init_vm()
through ioctl. That means the fence is NULL for a new
created kfd_process, and close a kfd_process right
after open it will trigger the warning.
This commit conditionally signals the eviction fence
in kfd_process_wq_release() only when it is available.
[ 503.660882] WARNING: CPU: 0 PID: 9 at drivers/dma-buf/dma-fence.c:467 dma_fence_signal+0x74/0xa0
[ 503.782940] Workqueue: kfd_process_wq kfd_process_wq_release [amdgpu]
[ 503.789640] RIP: 0010:dma_fence_signal+0x74/0xa0
[ 503.877620] Call Trace:
[ 503.880066] <TASK>
[ 503.882168] ? __warn+0xcd/0x260
[ 503.885407] ? dma_fence_signal+0x74/0xa0
[ 503.889416] ? report_bug+0x288/0x2d0
[ 503.893089] ? handle_bug+0x53/0xa0
[ 503.896587] ? exc_invalid_op+0x14/0x50
[ 503.900424] ? asm_exc_invalid_op+0x16/0x20
[ 503.904616] ? dma_fence_signal+0x74/0xa0
[ 503.908626] kfd_process_wq_release+0x6b/0x370 [amdgpu]
[ 503.914081] process_one_work+0x654/0x10a0
[ 503.918186] worker_thread+0x6c3/0xe70
[ 503.921943] ? srso_alias_return_thunk+0x5/0xfbef5
[ 503.926735] ? srso_alias_return_thunk+0x5/0xfbef5
[ 503.931527] ? __kthread_parkme+0x82/0x140
[ 503.935631] ? __pfx_worker_thread+0x10/0x10
[ 503.939904] kthread+0x2a8/0x380
[ 503.943132] ? __pfx_kthread+0x10/0x10
[ 503.946882] ret_from_fork+0x2d/0x70
[ 503.950458] ? __pfx_kthread+0x10/0x10
[ 503.954210] ret_from_fork_asm+0x1a/0x30
[ 503.958142] </TASK>
[ 503.960328] ---[ end trace 0000000000000000 ]---
Signed-off-by: Zhu Lingshan <a class="moz-txt-link-rfc2396E" href="mailto:lingshan.zhu@amd.com"><lingshan.zhu@amd.com></a>
---
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 87cd52cf4ee9..47d36f43ee8c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1159,7 +1159,8 @@ static void kfd_process_wq_release(struct work_struct *work)
 	 */
 	synchronize_rcu();
 	ef = rcu_access_pointer(p->ef);
-	dma_fence_signal(ef);
+	if (ef)
+		dma_fence_signal(ef);
 	kfd_process_remove_sysfs(p);
</pre>
</blockquote>
</blockquote>
<pre wrap="" class="moz-quote-pre">
</pre>
</blockquote>
</body>
</html>