<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body>
    <div class="moz-cite-prefix">On 2021-08-04 5:04 a.m., Christian
      König wrote:<br>
    </div>
    <blockquote type="cite" cite="mid:c24b471f-6b4b-8301-b7d6-bba69f6467ab@gmail.com">Sorry
      I'm on vacation and can't reply immediately.
      <br>
      <br>
      This is the wrong approach. The fdinfo should have grabbed a
      reference to the fd it prints the info for.
      <br>
      <br>
      So we should never race here. Can you double check how this
      happens?
      <br>
    </blockquote>
    <p>This backtrace happened once; it comes from
      /var/crash/..$date../vmcode-dmesg.log on the server machine. I
      cannot reproduce the issue. Grepping the app folder shows Python
      scripts that access /proc/pid/node_id/fdinfo. The backtrace
      appeared after the app crashed with a segmentation fault and was
      killed.<br>
    </p>
    <p>fdinfo grabs an fpriv reference, but not a reference to
      fpriv->vm.root.bo. I think that reference is needed; otherwise
      amdgpu_bo_reserve(fpriv->vm.root.bo) may dereference a NULL
      pointer.</p>
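    <p>To illustrate the pattern the patch relies on, here is a minimal
      userspace sketch. All toy_* names are hypothetical stand-ins, not
      amdgpu code. It assumes a "ref" helper that returns NULL when
      handed NULL, and it is single-threaded, so it only shows the NULL
      check and does not model the race itself.</p>
<pre>
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted buffer object; not the amdgpu type. */
struct toy_bo {
    int refcount;
};

/* NULL-tolerant "ref" helper (assumed convention): NULL in, NULL out. */
static struct toy_bo *toy_bo_ref(struct toy_bo *bo)
{
    if (bo == NULL)
        return NULL;
    bo->refcount++;
    return bo;
}

/* Drops one reference and clears the caller's pointer. */
static void toy_bo_unref(struct toy_bo **bo)
{
    if (*bo == NULL)
        return;
    (*bo)->refcount--;
    *bo = NULL;
}

/*
 * Hypothetical reader modelled on the patched flow: take a reference
 * first, bail out if the root is already gone, and only then use it.
 * Returns 0 on success, -1 if there is no root to report on.
 */
static int toy_show_info(struct toy_bo *root_slot)
{
    struct toy_bo *root = toy_bo_ref(root_slot);

    if (!root)
        return -1; /* the root pointer was already cleared */

    /* ... reserve, read memory stats, unreserve would go here ... */

    toy_bo_unref(&root);
    return 0;
}
```
</pre>
    <p>With a live object, toy_show_info() returns 0 and leaves the
      refcount unchanged; with a NULL root it returns -1 instead of
      dereferencing the pointer.</p>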
    <p>Regards,</p>
    <p>Philip<br>
    </p>
    <blockquote type="cite" cite="mid:c24b471f-6b4b-8301-b7d6-bba69f6467ab@gmail.com">Thanks,
      <br>
      Christian.
      <br>
      <br>
      Am 03.08.21 um 16:06 schrieb philip yang:
      <br>
      <blockquote type="cite">
        <br>
        ping?
        <br>
        <br>
        On 2021-07-29 10:13 p.m., Philip Yang wrote:
        <br>
        <blockquote type="cite">Get process vm root BO ref in case
          process is exiting and root BO is
          <br>
          freed, to avoid NULL pointer dereference backtrace:
          <br>
          <br>
          BUG: unable to handle kernel NULL pointer dereference at
          <br>
          0000000000000000
          <br>
          Call Trace:
          <br>
          amdgpu_show_fdinfo+0xfe/0x2a0 [amdgpu]
          <br>
          seq_show+0x12c/0x180
          <br>
          seq_read+0x153/0x410
          <br>
          vfs_read+0x91/0x140
          <br>
          ksys_read+0x4f/0xb0
          <br>
          do_syscall_64+0x5b/0x1a0
          <br>
          entry_SYSCALL_64_after_hwframe+0x65/0xca
          <br>
          <br>
          v2: rebase to staging
          <br>
          <br>
          Signed-off-by: Philip Yang<a class="moz-txt-link-rfc2396E" href="mailto:Philip.Yang@amd.com"><Philip.Yang@amd.com></a>
          <br>
          ---
          <br>
            drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 11 +++++++++--
          <br>
            1 file changed, 9 insertions(+), 2 deletions(-)
          <br>
          <br>
          diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
          b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
          <br>
          index d94c5419ec25..5a6857c44bb6 100644
          <br>
          --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
          <br>
          +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
          <br>
          @@ -59,6 +59,7 @@ void amdgpu_show_fdinfo(struct seq_file *m,
          struct file *f)
          <br>
                uint64_t vram_mem = 0, gtt_mem = 0, cpu_mem = 0;
          <br>
                struct drm_file *file = f->private_data;
          <br>
                struct amdgpu_device *adev =
          drm_to_adev(file->minor->dev);
          <br>
          +    struct amdgpu_bo *root;
          <br>
                int ret;
          <br>
                  ret = amdgpu_file_to_fpriv(f, &fpriv);
          <br>
          @@ -69,13 +70,19 @@ void amdgpu_show_fdinfo(struct seq_file
          *m, struct file *f)
          <br>
                dev = PCI_SLOT(adev->pdev->devfn);
          <br>
                fn = PCI_FUNC(adev->pdev->devfn);
          <br>
            -    ret = amdgpu_bo_reserve(fpriv->vm.root.bo, false);
          <br>
          +    root = amdgpu_bo_ref(fpriv->vm.root.bo);
          <br>
          +    if (!root)
          <br>
          +        return;
          <br>
          +
          <br>
          +    ret = amdgpu_bo_reserve(root, false);
          <br>
                if (ret) {
          <br>
                    DRM_ERROR("Fail to reserve bo\n");
          <br>
                    return;
          <br>
                }
          <br>
                amdgpu_vm_get_memory(&fpriv->vm, &vram_mem,
          &gtt_mem, &cpu_mem);
          <br>
          -    amdgpu_bo_unreserve(fpriv->vm.root.bo);
          <br>
          +    amdgpu_bo_unreserve(root);
          <br>
          +    amdgpu_bo_unref(&root);
          <br>
          +
          <br>
                seq_printf(m, "pdev:\t%04x:%02x:%02x.%d\npasid:\t%u\n",
          domain, bus,
          <br>
                        dev, fn, fpriv->vm.pasid);
          <br>
                seq_printf(m, "vram mem:\t%llu kB\n", vram_mem/1024UL);
          <br>
        </blockquote>
      </blockquote>
      <br>
    </blockquote>
  </body>
</html>