<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">Yeah, that is indeed partially true.<br>
      <br>
      But we already have the same logic in amdgpu_vm_bo_base_init() and
      amdgpu_vm_validate_pt_bos(). Only in amdgpu_vm_invalidate_pds() and
      amdgpu_vm_bo_invalidate() does the handling seem to be incorrect.<br>
      <br>
      Still sounds like a good idea to me to have this logic in a common
      place and not duplicated multiple times.<br>
      <br>
      And the function name is still correct if we think of it as a
      state of the bo_va instead of a helper to put things on a list.
      It's just that for the root PD we can skip this state change and
      go directly to the idle state.<br>
      <br>
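      To illustrate the idea, here is a minimal stand-alone sketch, not
      the actual driver code. The enum, the two structs and the name
      sketch_bo_relocated() are hypothetical simplifications; the real
      driver tracks these states through list membership rather than an
      enum. It only assumes that the root PD is the one page table BO
      without a parent, as the check in the patch below implies.<br>
      <br>
<pre>
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures (illustration only). */
enum vm_state { VM_STATE_IDLE, VM_STATE_RELOCATED, VM_STATE_MOVED };

struct pt_bo {
	struct pt_bo *parent;	/* assumed NULL for the root PD */
};

struct bo_base {
	struct pt_bo *bo;
	enum vm_state state;
};

/*
 * Sketch of the suggested behaviour: a page table BO with a parent
 * enters the relocated state, while the root PD skips that state
 * change and goes directly to idle.
 */
static void sketch_bo_relocated(struct bo_base *base)
{
	assert(base && base->bo);

	if (base->bo->parent)
		base->state = VM_STATE_RELOCATED;
	else
		base->state = VM_STATE_IDLE;
}
```
</pre>
      With something like this in one place, callers such as
      amdgpu_vm_bo_invalidate() would not need their own root PD check.<br>
      <br>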
      Regards,<br>
      Christian.<br>
      <br>
      On 10.02.20 at 01:59, Pan, Xinhui wrote:<br>
    </div>
    <blockquote type="cite" cite="mid:SN6PR12MB28004896C5D08FC3F4BBE2A787190@SN6PR12MB2800.namprd12.prod.outlook.com">
      
      <p style="font-family:Arial;font-size:10pt;color:#0078D7;margin:15pt;" align="Left">
        [AMD Official Use Only - Internal Distribution Only]<br>
      </p>
      <br>
      <div>
        <div dir="auto" style="direction:ltr; margin:0; padding:0;
          font-family:sans-serif; font-size:11pt; color:black">
          If so, the function name does not match its functionality. </div>
        <hr tabindex="-1" style="display:inline-block; width:98%">
        <div id="divRplyFwdMsg" dir="ltr"><font style="font-size:11pt" face="Calibri, sans-serif" color="#000000"><b>From:</b>
            Christian König <a class="moz-txt-link-rfc2396E" href="mailto:ckoenig.leichtzumerken@gmail.com"><ckoenig.leichtzumerken@gmail.com></a><br>
            <b>Sent:</b> Sunday, February 9, 2020 4:21:13 PM<br>
            <b>To:</b> Pan, Xinhui <a class="moz-txt-link-rfc2396E" href="mailto:Xinhui.Pan@amd.com"><Xinhui.Pan@amd.com></a>;
            <a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a>
            <a class="moz-txt-link-rfc2396E" href="mailto:amd-gfx@lists.freedesktop.org"><amd-gfx@lists.freedesktop.org></a><br>
            <b>Cc:</b> Deucher, Alexander
            <a class="moz-txt-link-rfc2396E" href="mailto:Alexander.Deucher@amd.com"><Alexander.Deucher@amd.com></a>; Koenig, Christian
            <a class="moz-txt-link-rfc2396E" href="mailto:Christian.Koenig@amd.com"><Christian.Koenig@amd.com></a><br>
            <b>Subject:</b> Re: [PATCH] drm/amdgpu: Do not move root PT
            bo to relocated list</font>
          <div> </div>
        </div>
        <div class="BodyFragment"><font size="2"><span style="font-size:11pt">
              <div class="PlainText">On 09.02.20 at 03:52, Pan,
                Xinhui wrote:<br>
                > hit panic when we update the page tables.<br>
                ><br>
                > <1>[  122.103290] BUG: kernel NULL pointer
                dereference, address: 0000000000000008<br>
                > <1>[  122.103348] #PF: supervisor read access
                in kernel mode<br>
                > <1>[  122.103376] #PF: error_code(0x0000) -
                not-present page<br>
                > <6>[  122.103403] PGD 0 P4D 0<br>
                > <4>[  122.103421] Oops: 0000 [#1] SMP PTI<br>
                > <4>[  122.103442] CPU: 13 PID: 2133 Comm:
                kfdtest Tainted: G           OE     5.4.0-rc7+ #7<br>
                > <4>[  122.103480] Hardware name: Supermicro
                SYS-7048GR-TR/X10DRG-Q, BIOS 3.0b 03/09/2018<br>
                > <4>[  122.103657] RIP:
                0010:amdgpu_vm_update_pdes+0x140/0x330 [amdgpu]<br>
                > <4>[  122.103689] Code: 03 4c 89 73 08 49 89
                9d c8 00 00 00 48 8b 7b f0 c6 43 10 00 45 31 c0 48 8b 87
                28 04 00 00 48 85 c0 74 07 4c 8b 80 20 04 00 00
                <4d> 8b 70 08 31 f6 49 8b 86 28 04 00 00 48 85 c0
                74 0f 48 8b 80 28<br>
                > <4>[  122.103769] RSP: 0018:ffffb49a0a6a3a98
                EFLAGS: 00010246<br>
                > <4>[  122.103797] RAX: 0000000000000000 RBX:
                ffff9020f823c148 RCX: dead000000000122<br>
                > <4>[  122.103831] RDX: ffff9020ece70018 RSI:
                ffff9020f823c0c8 RDI: ffff9010ca31c800<br>
                > <4>[  122.103865] RBP: ffffb49a0a6a3b38 R08:
                0000000000000000 R09: 0000000000000001<br>
                > <4>[  122.103899] R10: 000000006044f994 R11:
                00000000df57fb58 R12: ffff9020f823c000<br>
                > <4>[  122.103933] R13: ffff9020f823c000 R14:
                ffff9020f823c0c8 R15: ffff9010d5d20000<br>
                > <4>[  122.103968] FS:  00007f32c83dc780(0000)
                GS:ffff9020ff380000(0000) knlGS:0000000000000000<br>
                > <4>[  122.104006] CS:  0010 DS: 0000 ES: 0000
                CR0: 0000000080050033<br>
                > <4>[  122.104035] CR2: 0000000000000008 CR3:
                0000002036bba005 CR4: 00000000003606e0<br>
                > <4>[  122.104069] DR0: 0000000000000000 DR1:
                0000000000000000 DR2: 0000000000000000<br>
                > <4>[  122.104103] DR3: 0000000000000000 DR6:
                00000000fffe0ff0 DR7: 0000000000000400<br>
                > <4>[  122.104137] Call Trace:<br>
                > <4>[  122.104241]  vm_update_pds+0x31/0x50
                [amdgpu]<br>
                > <4>[  122.104347] 
                amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x2ef/0x690
                [amdgpu]<br>
                > <4>[  122.104466] 
                kfd_process_alloc_gpuvm+0x98/0x190 [amdgpu]<br>
                > <4>[  122.104576] 
                kfd_process_device_init_vm.part.8+0xf3/0x1f0 [amdgpu]<br>
                > <4>[  122.104688] 
                kfd_process_device_init_vm+0x24/0x30 [amdgpu]<br>
                > <4>[  122.104794] 
                kfd_ioctl_acquire_vm+0xa4/0xc0 [amdgpu]<br>
                > <4>[  122.104900]  kfd_ioctl+0x277/0x500
                [amdgpu]<br>
                > <4>[  122.105001]  ?
                kfd_ioctl_free_memory_of_gpu+0xc0/0xc0 [amdgpu]<br>
                > <4>[  122.105039]  ?
                rcu_read_lock_sched_held+0x4f/0x80<br>
                > <4>[  122.105068]  ?
                kmem_cache_free+0x2ba/0x300<br>
                > <4>[  122.105093]  ? vm_area_free+0x18/0x20<br>
                > <4>[  122.105117]  ? find_held_lock+0x35/0xa0<br>
                > <4>[  122.105143]  do_vfs_ioctl+0xa9/0x6f0<br>
                > <4>[  122.106001]  ksys_ioctl+0x75/0x80<br>
                > <4>[  122.106802]  ? do_syscall_64+0x17/0x230<br>
                > <4>[  122.107605]  __x64_sys_ioctl+0x1a/0x20<br>
                > <4>[  122.108378]  do_syscall_64+0x5f/0x230<br>
                > <4>[  122.109118] 
                entry_SYSCALL_64_after_hwframe+0x49/0xbe<br>
                > <4>[  122.109842] RIP: 0033:0x7f32c6b495d7<br>
                ><br>
                > Signed-off-by: xinhui pan
                <a class="moz-txt-link-rfc2396E" href="mailto:xinhui.pan@amd.com"><xinhui.pan@amd.com></a><br>
                > ---<br>
                >   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-<br>
                >   1 file changed, 1 insertion(+), 1 deletion(-)<br>
                ><br>
                > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
                b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c<br>
                > index 3195bc90985a..3c388fdf335c 100644<br>
                > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c<br>
                > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c<br>
                > @@ -2619,7 +2619,7 @@ void
                amdgpu_vm_bo_invalidate(struct amdgpu_device *adev,<br>
                >                        continue;<br>
                >                bo_base->moved = true;<br>
                >   <br>
                > -             if (bo->tbo.type ==
                ttm_bo_type_kernel)<br>
                > +             if (bo->tbo.type ==
                ttm_bo_type_kernel && bo->parent)<br>
                <br>
                Good catch, but that would mean that we move the root PD
                to the moved <br>
                state, which in turn is illegal as well.<br>
                <br>
                Maybe better adjust amdgpu_vm_bo_relocated() to move the
                root PD to the <br>
                idle state instead.<br>
                <br>
                Christian.<br>
                <br>
                <br>
                >                       
                amdgpu_vm_bo_relocated(bo_base);<br>
                >                else if (bo->tbo.base.resv ==
                vm->root.base.bo->tbo.base.resv)<br>
                >                        amdgpu_vm_bo_moved(bo_base);<br>
                <br>
              </div>
            </span></font></div>
      </div>
    </blockquote>
    <br>
  </body>
</html>