<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<div class="moz-cite-prefix">On 2021-08-04 5:04 a.m., Christian
König wrote:<br>
</div>
<blockquote type="cite" cite="mid:c24b471f-6b4b-8301-b7d6-bba69f6467ab@gmail.com">Sorry
I'm on vacation and can't reply immediately.
<br>
<br>
This is the wrong approach. The fdinfo should have grabbed a
reference to the fd it prints the info for.
<br>
<br>
So we should never race here. Can you double check how this
happens?
<br>
</blockquote>
<p>This backtrace occurred once. It comes from
/var/crash/..$date../vmcode-dmesg.log on the server machine, and I
cannot reproduce the issue. Grepping the app folder shows Python
scripts that access /proc/pid/node_id/fdinfo. The backtrace appeared
after the app crashed and was killed by a segmentation fault.<br>
</p>
<p>fdinfo grabs a reference to fpriv, but not to
fpriv->vm.root.bo. I think the latter is needed as well;
otherwise amdgpu_bo_reserve(fpriv->vm.root.bo) may dereference a
NULL pointer.</p>
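<p>To illustrate the pattern, here is a minimal userspace sketch, not
the driver code. struct bo, struct vm, bo_ref(), bo_unref() and
show_info() are hypothetical stand-ins for amdgpu_bo, amdgpu_vm,
amdgpu_bo_ref(), amdgpu_bo_unref() and amdgpu_show_fdinfo(). It only
shows the NULL check after taking the reference. It has no locking,
so on its own it does not rule out a race between reading the root
pointer and the free.</p>
<pre>
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted buffer object (not amdgpu_bo). */
struct bo {
    int refcount;
};

/* Hypothetical stand-in for the per-file VM state (not amdgpu_vm). */
struct vm {
    struct bo *root;
};

/* Take a reference; NULL in, NULL out, so callers can test the result. */
static struct bo *bo_ref(struct bo *bo)
{
    if (bo == NULL)
        return NULL;
    bo->refcount++;
    return bo;
}

/* Drop a reference, free on the last put, and clear the caller's pointer. */
static void bo_unref(struct bo **bo)
{
    if (*bo == NULL)
        return;
    assert((*bo)->refcount > 0);
    if (--(*bo)->refcount == 0)
        free(*bo);
    *bo = NULL;
}

/* Returns 0 if the stats could be read, -1 if the root was already gone. */
static int show_info(struct vm *vm)
{
    struct bo *root = bo_ref(vm->root);

    if (root == NULL)
        return -1;  /* teardown already dropped the root: bail out */

    /* ... reserve root, read the memory stats, unreserve root ... */

    bo_unref(&root);  /* drop only the temporary reference taken above */
    return 0;
}
```
</pre>
<p>With a live root whose refcount is 1, show_info() returns 0 and
leaves the refcount at 1. After bo_unref() on the vm's own root
pointer, show_info() returns -1 instead of touching freed memory.</p>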
<p>Regards,</p>
<p>Philip<br>
</p>
<blockquote type="cite" cite="mid:c24b471f-6b4b-8301-b7d6-bba69f6467ab@gmail.com">Thanks,
<br>
Christian.
<br>
<br>
Am 03.08.21 um 16:06 schrieb philip yang:
<br>
<blockquote type="cite">
<br>
ping?
<br>
<br>
On 2021-07-29 10:13 p.m., Philip Yang wrote:
<br>
<blockquote type="cite">Get process vm root BO ref in case
process is exiting and root BO is
<br>
freed, to avoid NULL pointer dereference backtrace:
<br>
<br>
BUG: unable to handle kernel NULL pointer dereference at
<br>
0000000000000000
<br>
Call Trace:
<br>
amdgpu_show_fdinfo+0xfe/0x2a0 [amdgpu]
<br>
seq_show+0x12c/0x180
<br>
seq_read+0x153/0x410
<br>
vfs_read+0x91/0x140
<br>
ksys_read+0x4f/0xb0
<br>
do_syscall_64+0x5b/0x1a0
<br>
entry_SYSCALL_64_after_hwframe+0x65/0xca
<br>
<br>
v2: rebase to staging
<br>
<br>
Signed-off-by: Philip Yang <a class="moz-txt-link-rfc2396E" href="mailto:Philip.Yang@amd.com"><Philip.Yang@amd.com></a>
<br>
---
<br>
drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 11 +++++++++--
<br>
1 file changed, 9 insertions(+), 2 deletions(-)
<br>
<br>
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
<br>
index d94c5419ec25..5a6857c44bb6 100644
<br>
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
<br>
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
<br>
@@ -59,6 +59,7 @@ void amdgpu_show_fdinfo(struct seq_file *m,
struct file *f)
<br>
uint64_t vram_mem = 0, gtt_mem = 0, cpu_mem = 0;
<br>
struct drm_file *file = f->private_data;
<br>
struct amdgpu_device *adev =
drm_to_adev(file->minor->dev);
<br>
+ struct amdgpu_bo *root;
<br>
int ret;
<br>
ret = amdgpu_file_to_fpriv(f, &fpriv);
<br>
@@ -69,13 +70,19 @@ void amdgpu_show_fdinfo(struct seq_file
*m, struct file *f)
<br>
dev = PCI_SLOT(adev->pdev->devfn);
<br>
fn = PCI_FUNC(adev->pdev->devfn);
<br>
- ret = amdgpu_bo_reserve(fpriv->vm.root.bo, false);
<br>
+ root = amdgpu_bo_ref(fpriv->vm.root.bo);
<br>
+ if (!root)
<br>
+ return;
<br>
+
<br>
+ ret = amdgpu_bo_reserve(root, false);
<br>
if (ret) {
<br>
DRM_ERROR("Fail to reserve bo\n");
<br>
return;
<br>
}
<br>
amdgpu_vm_get_memory(&fpriv->vm, &vram_mem,
&gtt_mem, &cpu_mem);
<br>
- amdgpu_bo_unreserve(fpriv->vm.root.bo);
<br>
+ amdgpu_bo_unreserve(root);
<br>
+ amdgpu_bo_unref(&root);
<br>
+
<br>
seq_printf(m, "pdev:\t%04x:%02x:%02x.%d\npasid:\t%u\n",
domain, bus,
<br>
dev, fn, fpriv->vm.pasid);
<br>
seq_printf(m, "vram mem:\t%llu kB\n", vram_mem/1024UL);
<br>
</blockquote>
</blockquote>
<br>
</blockquote>
</body>
</html>