<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><div><blockquote type="cite"><div>On Nov 6, 2023, at 22:44, Christian König <christian.koenig@amd.com> wrote:</div><br class="Apple-interchange-newline"><div><meta charset="UTF-8"><span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;"> </span><span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;">On 02.11.23 at 15:04, Tatsuyuki Ishi wrote:</span><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><blockquote type="cite" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">In Vulkan, it is the application's responsibility to perform adequate<br>synchronization before a sparse unmap, replace or BO destroy operation.<br>Until now, the kernel 
applied the same rule as implicitly-synchronized<br>APIs like OpenGL, which with per-VM BOs made page table updates stall the<br>queue completely. The newly added AMDGPU_VM_EXPLICIT_SYNC flag allows<br>drivers to opt out of this behavior, while still ensuring adequate implicit<br>sync happens for kernel-initiated updates (e.g. BO moves).<br><br>We record whether to use implicit sync or not for each freed mapping. To<br>avoid increasing the mapping struct's size, this is union-ized with the<br>interval tree field which is unused after the unmap.<br><br>The reason this is done with a GEM ioctl flag, instead of being a VM /<br>context global setting, is that the current libdrm implementation shares<br>the DRM handle even between different kinds of drivers (radeonsi vs radv).<br></blockquote><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;">It would be nice if we could make this more future-proof by using a drm_syncobj rather than a flag.</span></div></blockquote><div><br></div><div>There are asynchronous and synchronous VM_BIND operations. Using syncobjs addresses asynchronous binds, but what this patch set adds is an explicitly synced synchronous bind.</div><div><br></div><div>Even within Vulkan, there are use cases for synchronous binds. This is the case when a non-sparse BO is destroyed (or created, but that is not synchronized). 
Such operations should still be explicitly synced, unlike in OpenGL, where they sync to previous submissions. So adding asynchronous binds doesn’t supersede this need.</div><div><br></div><div>I’ve also considered whether we can just make the unmap asynchronous, since the spec requires that destroyed objects are not accessed in any way, but I think it would complicate the behavior when destruction of the BO immediately follows.</div><div><br></div><div>We should implement asynchronous bind someday to make vkQueueBindSparse work (even) better, but that will likely involve a larger scope including the scheduler. Getting synchronous but explicitly synced binds should be simpler and a good incremental step.</div><div><br></div><div>Tatsuyuki.</div><br><blockquote type="cite"><div><span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;">You can extend the drm_amdgpu_gem_va structure by adding a drm_syncobj handle and timeline point at the end.</span><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; 
font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;">If the syncobj/timeline point results in a fence, we give that as an input dependency the operation has to wait for.</span><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;">An output fence can come later on as well, but that one is much harder to handle.</span><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; 
text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;">Regards,</span><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;">Christian.</span><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><blockquote type="cite" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; 
font-weight: 400; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><br>Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com><br>---<br> .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  2 +-<br> drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c       |  2 +-<br> drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c       | 14 ++++--<br> drivers/gpu/drm/amd/amdgpu/amdgpu_object.h    |  7 ++-<br> drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h     |  6 ++-<br> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        | 47 +++++++++++--------<br> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h        | 23 +++++----<br> drivers/gpu/drm/amd/amdkfd/kfd_svm.c          | 18 +++----<br> include/uapi/drm/amdgpu_drm.h                 |  2 +<br> 9 files changed, 71 insertions(+), 50 deletions(-)<br><br>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c<br>index 7d6daf8d2bfa..10e129bff977 100644<br>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c<br>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c<br>@@ -1196,7 +1196,7 @@ static void unmap_bo_from_gpuvm(struct kgd_mem *mem,<br> <span class="Apple-tab-span" style="white-space: pre;">     </span>struct amdgpu_device *adev = entry->adev;<br> <span class="Apple-tab-span" style="white-space: pre;">      </span>struct amdgpu_vm *vm = bo_va->base.vm;<br> -<span class="Apple-tab-span" style="white-space: pre;">        </span>amdgpu_vm_bo_unmap(adev, bo_va, entry->va);<br>+<span class="Apple-tab-span" style="white-space: pre;"> </span>amdgpu_vm_bo_unmap(adev, bo_va, entry->va, true);<br>   <span class="Apple-tab-span" style="white-space: pre;">  </span>amdgpu_vm_clear_freed(adev, vm, &bo_va->last_pt_update);<br> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c<br>index 720011019741..612279e65bff 100644<br>--- 
a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c<br>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c<br>@@ -122,7 +122,7 @@ int amdgpu_unmap_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm,<br> <span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span>}<br> <span class="Apple-tab-span" style="white-space: pre;"> </span>}<br> -<span class="Apple-tab-span" style="white-space: pre;">        </span>r = amdgpu_vm_bo_unmap(adev, bo_va, csa_addr);<br>+<span class="Apple-tab-span" style="white-space: pre;"> </span>r = amdgpu_vm_bo_unmap(adev, bo_va, csa_addr, true);<br> <span class="Apple-tab-span" style="white-space: pre;">      </span>if (r) {<br> <span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>DRM_ERROR("failed to do bo_unmap on static CSA, err=%d\n", r);<br> <span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>goto error;<br>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c<br>index a1b15d0d6c48..cca68b89754e 100644<br>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c<br>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c<br>@@ -667,9 +667,9 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,<br> <span class="Apple-tab-span" style="white-space: pre;">       </span>const uint32_t valid_flags = AMDGPU_VM_DELAY_UPDATE |<br> <span class="Apple-tab-span" style="white-space: pre;">     </span><span class="Apple-tab-span" style="white-space: pre;">  </span>AMDGPU_VM_PAGE_READABLE | AMDGPU_VM_PAGE_WRITEABLE |<br> <span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span>AMDGPU_VM_PAGE_EXECUTABLE | AMDGPU_VM_MTYPE_MASK |<br>-<span class="Apple-tab-span" style="white-space: pre;">     </span><span class="Apple-tab-span" 
style="white-space: pre;">  </span>AMDGPU_VM_PAGE_NOALLOC;<br>+<span class="Apple-tab-span" style="white-space: pre;">        </span><span class="Apple-tab-span" style="white-space: pre;">  </span>AMDGPU_VM_PAGE_NOALLOC | AMDGPU_VM_EXPLICIT_SYNC;<br> <span class="Apple-tab-span" style="white-space: pre;"> </span>const uint32_t prt_flags = AMDGPU_VM_DELAY_UPDATE |<br>-<span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span>AMDGPU_VM_PAGE_PRT;<br>+<span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span>AMDGPU_VM_PAGE_PRT | AMDGPU_VM_EXPLICIT_SYNC;<br>   <span class="Apple-tab-span" style="white-space: pre;"> </span>struct drm_amdgpu_gem_va *args = data;<br> <span class="Apple-tab-span" style="white-space: pre;">    </span>struct drm_gem_object *gobj;<br>@@ -680,6 +680,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,<br> <span class="Apple-tab-span" style="white-space: pre;">     </span>struct drm_exec exec;<br> <span class="Apple-tab-span" style="white-space: pre;">     </span>uint64_t va_flags;<br> <span class="Apple-tab-span" style="white-space: pre;">        </span>uint64_t vm_size;<br>+<span class="Apple-tab-span" style="white-space: pre;">      </span>bool sync_unmap;<br> <span class="Apple-tab-span" style="white-space: pre;">  </span>int r = 0;<br>   <span class="Apple-tab-span" style="white-space: pre;">    </span>if (args->va_address < AMDGPU_VA_RESERVED_SIZE) {<br>@@ -715,6 +716,8 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,<br> <span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>return -EINVAL;<br> <span class="Apple-tab-span" style="white-space: pre;">   </span>}<br> +<span class="Apple-tab-span" style="white-space: pre;">        </span>sync_unmap = !(args->flags & AMDGPU_VM_EXPLICIT_SYNC);<br>+<br> 
<span class="Apple-tab-span" style="white-space: pre;">  </span>switch (args->operation) {<br> <span class="Apple-tab-span" style="white-space: pre;">     </span>case AMDGPU_VA_OP_MAP:<br> <span class="Apple-tab-span" style="white-space: pre;">    </span>case AMDGPU_VA_OP_UNMAP:<br>@@ -774,19 +777,20 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,<br> <span class="Apple-tab-span" style="white-space: pre;">       </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>    va_flags);<br> <span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;">  </span>break;<br> <span class="Apple-tab-span" style="white-space: pre;">    </span>case AMDGPU_VA_OP_UNMAP:<br>-<span class="Apple-tab-span" style="white-space: pre;">       </span><span class="Apple-tab-span" style="white-space: pre;">  </span>r = amdgpu_vm_bo_unmap(adev, bo_va, args->va_address);<br>+<span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span>r = amdgpu_vm_bo_unmap(adev, bo_va, args->va_address,<br>+<span class="Apple-tab-span" style="white-space: pre;">       </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>      sync_unmap);<br> <span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span>break;<br>   <span class="Apple-tab-span" style="white-space: pre;">        </span>case AMDGPU_VA_OP_CLEAR:<br> <span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  
</span>r = amdgpu_vm_bo_clear_mappings(adev, &fpriv->vm,<br> <span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>args->va_address,<br>-<span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>args->map_size);<br>+<span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>args->map_size, sync_unmap);<br> <span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span>break;<br> <span class="Apple-tab-span" style="white-space: pre;">    </span>case AMDGPU_VA_OP_REPLACE:<br> <span class="Apple-tab-span" style="white-space: pre;">        </span><span class="Apple-tab-span" style="white-space: pre;">  </span>va_flags = amdgpu_gem_va_map_flags(adev, args->flags);<br> <span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;">  </span>r = amdgpu_vm_bo_replace_map(adev, bo_va, args->va_address,<br> <span class="Apple-tab-span" style="white-space: pre;">    </span><span 
class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>    args->offset_in_bo, args->map_size,<br>-<span class="Apple-tab-span" style="white-space: pre;">       </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>    va_flags);<br>+<span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>    va_flags, sync_unmap);<br> <span class="Apple-tab-span" style="white-space: pre;">     </span><span class="Apple-tab-span" style="white-space: pre;">  </span>break;<br> <span class="Apple-tab-span" style="white-space: pre;">    </span>default:<br> <span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>break;<br>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h<br>index f3ee83cdf97e..28be03f1bbcf 100644<br>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h<br>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h<br>@@ -67,7 +67,12 @@ struct amdgpu_bo_va_mapping {<br> <span class="Apple-tab-span" style="white-space: pre;">       </span>struct rb_node<span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  
</span><span class="Apple-tab-span" style="white-space: pre;">  </span>rb;<br> <span class="Apple-tab-span" style="white-space: pre;">       </span>uint64_t<span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>start;<br> <span class="Apple-tab-span" style="white-space: pre;">    </span>uint64_t<span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>last;<br>-<span class="Apple-tab-span" style="white-space: pre;">  </span>uint64_t<span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>__subtree_last;<br>+<span class="Apple-tab-span" style="white-space: pre;">        </span>union {<br>+<span class="Apple-tab-span" style="white-space: pre;">        </span><span class="Apple-tab-span" style="white-space: pre;">  </span>/* BOs in interval tree only */<br>+<span class="Apple-tab-span" style="white-space: pre;">        </span><span class="Apple-tab-span" style="white-space: pre;">  </span>uint64_t<span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>__subtree_last;<br>+<span class="Apple-tab-span" style="white-space: pre;">        </span><span class="Apple-tab-span" style="white-space: pre;">  </span>/* Freed BOs only */<br>+<span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span>bool<span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>sync_unmap;<br>+<span class="Apple-tab-span" 
style="white-space: pre;">    </span>};<br> <span class="Apple-tab-span" style="white-space: pre;">        </span>uint64_t<span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>offset;<br> <span class="Apple-tab-span" style="white-space: pre;">   </span>uint64_t<span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>flags;<br> };<br>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h<br>index 2fd1bfb35916..e71443c8c59b 100644<br>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h<br>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h<br>@@ -276,6 +276,7 @@ TRACE_EVENT(amdgpu_vm_bo_unmap,<br> <span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>    __field(long, last)<br> <span class="Apple-tab-span" style="white-space: pre;">        </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>    __field(u64, offset)<br> <span class="Apple-tab-span" style="white-space: pre;">       </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>    __field(u64, flags)<br>+<span class="Apple-tab-span" style="white-space: pre;">     </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>    __field(bool, sync_unmap)<br> <span 
class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>    ),<br>   <span class="Apple-tab-span" style="white-space: pre;">     </span><span class="Apple-converted-space"> </span>   TP_fast_assign(<br>@@ -284,10 +285,11 @@ TRACE_EVENT(amdgpu_vm_bo_unmap,<br> <span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  __entry->last = mapping->last;<br> <span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  __entry->offset = mapping->offset;<br> <span class="Apple-tab-span" style="white-space: pre;">       </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  __entry->flags = mapping->flags;<br>+<span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  __entry->sync_unmap = mapping->sync_unmap;<br> <span class="Apple-tab-span" style="white-space: pre;">       </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  ),<br>-<span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>   TP_printk("bo=%p, start=%lx, last=%lx, offset=%010llx, flags=%llx",<br>+<span 
class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-converted-space"> </span>   TP_printk("bo=%p, start=%lx, last=%lx, offset=%010llx, flags=%llx, sync_unmap=%d",<br> <span class="Apple-tab-span" style="white-space: pre;">     </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>     __entry->bo, __entry->start, __entry->last,<br>-<span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>     __entry->offset, __entry->flags)<br>+<span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>     __entry->offset, __entry->flags, __entry->sync_unmap)<br> );<br>   DECLARE_EVENT_CLASS(amdgpu_vm_mapping,<br>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c<br>index 7b9762f1cddd..a74472e16952 100644<br>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c<br>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c<br>@@ -844,6 +844,7 @@ static void amdgpu_vm_tlb_seq_cb(struct dma_fence *fence,<br>  * @immediate: immediate submission in a page fault<br>  * @unlocked: unlocked invalidation during MM callback<br>  * @flush_tlb: trigger tlb invalidation after update completed<br>+ * @sync_unmap: wait for BO users before unmapping<br>  * @resv: fences we need to sync to<br>  * @start: start of mapped range<br>  * @last: last mapped entry<br>@@ -861,8 +862,9 @@ static void amdgpu_vm_tlb_seq_cb(struct dma_fence *fence,<br>  */<br> int amdgpu_vm_update_range(struct amdgpu_device *adev, struct amdgpu_vm *vm,<br> <span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  
bool immediate, bool unlocked, bool flush_tlb,<br>-<span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  struct dma_resv *resv, uint64_t start, uint64_t last,<br>-<span class="Apple-tab-span" style="white-space: pre;">       </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  uint64_t flags, uint64_t offset, uint64_t vram_base,<br>+<span class="Apple-tab-span" style="white-space: pre;">        </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  bool sync_unmap, struct dma_resv *resv,<br>+<span class="Apple-tab-span" style="white-space: pre;">     </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  uint64_t start, uint64_t last, uint64_t flags,<br>+<span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  uint64_t offset, uint64_t vram_base,<br> <span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  struct ttm_resource *res, dma_addr_t *pages_addr,<br> <span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  
</span><span class="Apple-converted-space"> </span>  struct dma_fence **fence)<br> {<br>@@ -902,7 +904,7 @@ int amdgpu_vm_update_range(struct amdgpu_device *adev, struct amdgpu_vm *vm,<br> <span class="Apple-tab-span" style="white-space: pre;">   </span>/* Implicitly sync to command submissions in the same VM before<br> <span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-converted-space"> </span>* unmapping. Sync to moving fences before mapping.<br> <span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-converted-space"> </span>*/<br>-<span class="Apple-tab-span" style="white-space: pre;">      </span>if (!(flags & (AMDGPU_PTE_VALID | AMDGPU_PTE_PRT)))<br>+<span class="Apple-tab-span" style="white-space: pre;">        </span>if (!(flags & (AMDGPU_PTE_VALID | AMDGPU_PTE_PRT)) && sync_unmap)<br> <span class="Apple-tab-span" style="white-space: pre;">     </span><span class="Apple-tab-span" style="white-space: pre;">  </span>sync_mode = AMDGPU_SYNC_EQ_OWNER;<br> <span class="Apple-tab-span" style="white-space: pre;"> </span>else<br> <span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span>sync_mode = AMDGPU_SYNC_EXPLICIT;<br>@@ -1145,10 +1147,10 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va,<br> <span class="Apple-tab-span" style="white-space: pre;">        </span><span class="Apple-tab-span" style="white-space: pre;">  </span>trace_amdgpu_vm_bo_update(mapping);<br>   <span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span>r = amdgpu_vm_update_range(adev, vm, false, false, flush_tlb,<br>-<span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: 
pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  resv, mapping->start, mapping->last,<br>-<span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  update_flags, mapping->offset,<br>-<span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  vram_base, mem, pages_addr,<br>-<span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  last_update);<br>+<span class="Apple-tab-span" style="white-space: pre;">       </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  true, resv, mapping->start,<br>+<span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: 
pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  mapping->last, update_flags,<br>+<span class="Apple-tab-span" style="white-space: pre;">     </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  mapping->offset, vram_base, mem,<br>+<span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  pages_addr, last_update);<br> <span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span>if (r)<br> <span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>return r;<br> <span class="Apple-tab-span" style="white-space: pre;"> </span>}<br>@@ -1340,7 +1342,8 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,<br> <span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>   mapping->start < AMDGPU_GMC_HOLE_START)<br> <span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>init_pte_value = AMDGPU_PTE_DEFAULT_ATC;<br> -<span class="Apple-tab-span" style="white-space: pre;"> </span><span 
class="Apple-tab-span" style="white-space: pre;">  </span>r = amdgpu_vm_update_range(adev, vm, false, false, true, resv,<br>+<span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;">  </span>r = amdgpu_vm_update_range(adev, vm, false, false, true,<br>+<span class="Apple-tab-span" style="white-space: pre;">       </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  mapping->sync_unmap, resv,<br> <span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  mapping->start, mapping->last,<br> <span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  init_pte_value, 0, 0, NULL, NULL,<br> <span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  &f);<br>@@ -1572,6 +1575,7 @@ int amdgpu_vm_bo_map(struct amdgpu_device *adev,<br>  * @offset: requested 
offset in the BO<br>  * @size: BO size in bytes<br>  * @flags: attributes of pages (read/write/valid/etc.)<br>+ * @sync_unmap: wait for BO users before replacing existing mapping<br>  *<br>  * Add a mapping of the BO at the specefied addr into the VM. Replace existing<br>  * mappings as we do so.<br>@@ -1582,9 +1586,9 @@ int amdgpu_vm_bo_map(struct amdgpu_device *adev,<br>  * Object has to be reserved and unreserved outside!<br>  */<br> int amdgpu_vm_bo_replace_map(struct amdgpu_device *adev,<br>-<span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>    struct amdgpu_bo_va *bo_va,<br>-<span class="Apple-tab-span" style="white-space: pre;">     </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>    uint64_t saddr, uint64_t offset,<br>-<span class="Apple-tab-span" style="white-space: pre;">        </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>    uint64_t size, uint64_t flags)<br>+<span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>    struct amdgpu_bo_va *bo_va, uint64_t saddr,<br>+<span class="Apple-tab-span" style="white-space: pre;">     </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>    uint64_t offset, uint64_t size, uint64_t flags,<br>+<span class="Apple-tab-span" style="white-space: pre;"> </span><span 
class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>    bool sync_unmap)<br> {<br> <span class="Apple-tab-span" style="white-space: pre;">  </span>struct amdgpu_bo_va_mapping *mapping;<br> <span class="Apple-tab-span" style="white-space: pre;">     </span>struct amdgpu_bo *bo = bo_va->base.bo;<br>@@ -1608,7 +1612,7 @@ int amdgpu_vm_bo_replace_map(struct amdgpu_device *adev,<br> <span class="Apple-tab-span" style="white-space: pre;"> </span>if (!mapping)<br> <span class="Apple-tab-span" style="white-space: pre;">     </span><span class="Apple-tab-span" style="white-space: pre;">  </span>return -ENOMEM;<br> -<span class="Apple-tab-span" style="white-space: pre;">  </span>r = amdgpu_vm_bo_clear_mappings(adev, bo_va->base.vm, saddr, size);<br>+<span class="Apple-tab-span" style="white-space: pre;"> </span>r = amdgpu_vm_bo_clear_mappings(adev, bo_va->base.vm, saddr, size, sync_unmap);<br> <span class="Apple-tab-span" style="white-space: pre;">        </span>if (r) {<br> <span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>kfree(mapping);<br> <span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span>return r;<br>@@ -1633,6 +1637,7 @@ int amdgpu_vm_bo_replace_map(struct amdgpu_device *adev,<br>  * @adev: amdgpu_device pointer<br>  * @bo_va: bo_va to remove the address from<br>  * @saddr: where to the BO is mapped<br>+ * @sync_unmap: wait for BO users before unmapping<br>  *<br>  * Remove a mapping of the BO at the specefied addr from the VM.<br>  *<br>@@ -1641,9 +1646,8 @@ int amdgpu_vm_bo_replace_map(struct amdgpu_device *adev,<br>  *<br>  * Object has to be reserved and unreserved outside!<br>  */<br>-int amdgpu_vm_bo_unmap(struct amdgpu_device *adev,<br>-<span class="Apple-tab-span" style="white-space: pre;">    
</span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>      struct amdgpu_bo_va *bo_va,<br>-<span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>      uint64_t saddr)<br>+int amdgpu_vm_bo_unmap(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va,<br>+<span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>      uint64_t saddr, bool sync_unmap)<br> {<br> <span class="Apple-tab-span" style="white-space: pre;">      </span>struct amdgpu_bo_va_mapping *mapping;<br> <span class="Apple-tab-span" style="white-space: pre;">     </span>struct amdgpu_vm *vm = bo_va->base.vm;<br>@@ -1671,6 +1675,7 @@ int amdgpu_vm_bo_unmap(struct amdgpu_device *adev,<br> <span class="Apple-tab-span" style="white-space: pre;">       </span>list_del(&mapping->list);<br> <span class="Apple-tab-span" style="white-space: pre;">  </span>amdgpu_vm_it_remove(mapping, &vm->va);<br> <span class="Apple-tab-span" style="white-space: pre;">     </span>mapping->bo_va = NULL;<br>+<span class="Apple-tab-span" style="white-space: pre;">      </span>mapping->sync_unmap = sync_unmap;<br> <span class="Apple-tab-span" style="white-space: pre;">      </span>trace_amdgpu_vm_bo_unmap(bo_va, mapping);<br>   <span class="Apple-tab-span" style="white-space: pre;">     </span>if (valid)<br>@@ -1689,6 +1694,7 @@ int amdgpu_vm_bo_unmap(struct amdgpu_device *adev,<br>  * @vm: VM structure to use<br>  * @saddr: start of the range<br>  * @size: size of the range<br>+ * @sync_unmap: wait for BO users before unmapping<br>  *<br>  * Remove all mappings in a range, split them as appropriate.<br>  *<br>@@ -1696,8 +1702,8 @@ int amdgpu_vm_bo_unmap(struct amdgpu_device *adev,<br>  * 0 for success, error for failure.<br>  */<br> int 
amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev,<br>-<span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>struct amdgpu_vm *vm,<br>-<span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>uint64_t saddr, uint64_t size)<br>+<span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>struct amdgpu_vm *vm, uint64_t saddr,<br>+<span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>uint64_t size, bool sync_unmap)<br> {<br> <span class="Apple-tab-span" style="white-space: pre;">  </span>struct amdgpu_bo_va_mapping *before, *after, *tmp, *next;<br> <span class="Apple-tab-span" style="white-space: pre;"> </span>LIST_HEAD(removed);<br>@@ -1761,6 +1767,7 @@ int amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev,<br> <span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>   tmp->last = eaddr;<br>   <span class="Apple-tab-span" style="white-space: pre;">        </span><span class="Apple-tab-span" style="white-space: pre;">  </span>tmp->bo_va = NULL;<br>+<span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" 
style="white-space: pre;">  </span>tmp->sync_unmap = sync_unmap;<br> <span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>list_add(&tmp->list, &vm->freed);<br> <span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span>trace_amdgpu_vm_bo_unmap(NULL, tmp);<br> <span class="Apple-tab-span" style="white-space: pre;">      </span>}<br>@@ -1889,6 +1896,7 @@ void amdgpu_vm_bo_del(struct amdgpu_device *adev,<br> <span class="Apple-tab-span" style="white-space: pre;">        </span><span class="Apple-tab-span" style="white-space: pre;">  </span>list_del(&mapping->list);<br> <span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>amdgpu_vm_it_remove(mapping, &vm->va);<br> <span class="Apple-tab-span" style="white-space: pre;">     </span><span class="Apple-tab-span" style="white-space: pre;">  </span>mapping->bo_va = NULL;<br>+<span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span>mapping->sync_unmap = true;<br> <span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span>trace_amdgpu_vm_bo_unmap(bo_va, mapping);<br> <span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;">  </span>list_add(&mapping->list, &vm->freed);<br> <span class="Apple-tab-span" style="white-space: pre;">       </span>}<br>@@ -2617,8 +2625,9 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, u32 pasid,<br> <span class="Apple-tab-span" style="white-space: pre;">       </span><span class="Apple-tab-span" style="white-space: pre;">  </span>goto error_unlock;<br> <span class="Apple-tab-span" style="white-space: pre;">        </span>}<br> -<span class="Apple-tab-span" 
style="white-space: pre;">        </span>r = amdgpu_vm_update_range(adev, vm, true, false, false, NULL, addr,<br>-<span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  addr, flags, value, 0, NULL, NULL, NULL);<br>+<span class="Apple-tab-span" style="white-space: pre;">   </span>r = amdgpu_vm_update_range(adev, vm, true, false, false, true, NULL,<br>+<span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  addr, addr, flags, value, 0, NULL, NULL,<br>+<span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  NULL);<br> <span class="Apple-tab-span" style="white-space: pre;"> </span>if (r)<br> <span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span>goto error_unlock;<br> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h<br>index 204ab13184ed..73b7b49fdb2e 100644<br>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h<br>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h<br>@@ -423,12 +423,12 @@ void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base *base,<br> <span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  
</span><span class="Apple-converted-space"> </span>   struct amdgpu_vm *vm, struct amdgpu_bo *bo);<br> int amdgpu_vm_update_range(struct amdgpu_device *adev, struct amdgpu_vm *vm,<br> <span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  bool immediate, bool unlocked, bool flush_tlb,<br>-<span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  struct dma_resv *resv, uint64_t start, uint64_t last,<br>-<span class="Apple-tab-span" style="white-space: pre;">       </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  uint64_t flags, uint64_t offset, uint64_t vram_base,<br>+<span class="Apple-tab-span" style="white-space: pre;">        </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  bool sync_unmap, struct dma_resv *resv,<br>+<span class="Apple-tab-span" style="white-space: pre;">     </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  uint64_t start, uint64_t last, uint64_t flags,<br>+<span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  uint64_t offset, uint64_t vram_base,<br> <span class="Apple-tab-span" style="white-space: pre;">   
</span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  struct ttm_resource *res, dma_addr_t *pages_addr,<br> <span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  struct dma_fence **fence);<br>-int amdgpu_vm_bo_update(struct amdgpu_device *adev,<br>-<span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>struct amdgpu_bo_va *bo_va,<br>+int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va,<br> <span class="Apple-tab-span" style="white-space: pre;">     </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>bool clear);<br> bool amdgpu_vm_evictable(struct amdgpu_bo *bo);<br> void amdgpu_vm_bo_invalidate(struct amdgpu_device *adev,<br>@@ -444,15 +444,14 @@ int amdgpu_vm_bo_map(struct amdgpu_device *adev,<br> <span class="Apple-tab-span" style="white-space: pre;">       </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>    uint64_t addr, uint64_t offset,<br> <span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>    uint64_t size, uint64_t flags);<br> int amdgpu_vm_bo_replace_map(struct amdgpu_device *adev,<br>-<span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> 
</span>    struct amdgpu_bo_va *bo_va,<br>-<span class="Apple-tab-span" style="white-space: pre;">     </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>    uint64_t addr, uint64_t offset,<br>-<span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>    uint64_t size, uint64_t flags);<br>-int amdgpu_vm_bo_unmap(struct amdgpu_device *adev,<br>-<span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>      struct amdgpu_bo_va *bo_va,<br>-<span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>      uint64_t addr);<br>+<span class="Apple-tab-span" style="white-space: pre;">     </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>    struct amdgpu_bo_va *bo_va, uint64_t addr,<br>+<span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>    uint64_t offset, uint64_t size, uint64_t flags,<br>+<span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>    bool sync_unmap);<br>+int amdgpu_vm_bo_unmap(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va,<br>+<span 
class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>      uint64_t addr, bool sync_unmap);<br> int amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev,<br>-<span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>struct amdgpu_vm *vm,<br>-<span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>uint64_t saddr, uint64_t size);<br>+<span class="Apple-tab-span" style="white-space: pre;">        </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>struct amdgpu_vm *vm, uint64_t saddr,<br>+<span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>uint64_t size, bool sync_unmap);<br> struct amdgpu_bo_va_mapping *amdgpu_vm_bo_lookup_mapping(struct amdgpu_vm *vm,<br> <span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span 
class="Apple-converted-space"> </span>uint64_t addr);<br> void amdgpu_vm_bo_trace_cs(struct amdgpu_vm *vm, struct ww_acquire_ctx *ticket);<br>diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c<br>index bb16b795d1bc..6eb4a0a4bc84 100644<br>--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c<br>+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c<br>@@ -1291,9 +1291,9 @@ svm_range_unmap_from_gpu(struct amdgpu_device *adev, struct amdgpu_vm *vm,<br>   <span class="Apple-tab-span" style="white-space: pre;">   </span>pr_debug("[0x%llx 0x%llx]\n", start, last);<br> -<span class="Apple-tab-span" style="white-space: pre;">    </span>return amdgpu_vm_update_range(adev, vm, false, true, true, NULL, start,<br>-<span class="Apple-tab-span" style="white-space: pre;">        </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>     last, init_pte_value, 0, 0, NULL, NULL,<br>-<span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>     fence);<br>+<span class="Apple-tab-span" style="white-space: pre;">   </span>return amdgpu_vm_update_range(adev, vm, false, true, true, true, NULL,<br>+<span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>     start, last, init_pte_value, 0, 0, NULL,<br>+<span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" 
style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>     NULL, fence);<br> }<br>   static int<br>@@ -1398,12 +1398,12 @@ svm_range_map_to_gpu(struct kfd_process_device *pdd, struct svm_range *prange,<br> <span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>* different memory partition based on fpfn/lpfn, we should use<br> <span class="Apple-tab-span" style="white-space: pre;">     </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>* same vm_manager.vram_base_offset regardless memory partition.<br> <span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>*/<br>-<span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span>r = amdgpu_vm_update_range(adev, vm, false, false, flush_tlb, NULL,<br>-<span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  last_start, prange->start + i,<br>-<span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  
pte_flags,<br>-<span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  (last_start - prange->start) << PAGE_SHIFT,<br>-<span class="Apple-tab-span" style="white-space: pre;">        </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  bo_adev ? bo_adev->vm_manager.vram_base_offset : 0,<br>-<span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>  NULL, dma_addr, &vm->last_update);<br>+<span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span>r = amdgpu_vm_update_range(<br>+<span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>adev, vm, false, false, flush_tlb, true, NULL,<br>+<span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>last_start, prange->start + i, pte_flags,<br>+<span class="Apple-tab-span" style="white-space: pre;">   </span><span 
class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>(last_start - prange->start) << PAGE_SHIFT,<br>+<span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>bo_adev ? bo_adev->vm_manager.vram_base_offset : 0,<br>+<span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>NULL, dma_addr, &vm->last_update);<br>   <span class="Apple-tab-span" style="white-space: pre;">     </span><span class="Apple-tab-span" style="white-space: pre;">  </span>for (j = last_start - prange->start; j <= i; j++)<br> <span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>dma_addr[j] |= last_domain;<br>diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h<br>index f477eda6a2b8..3cdcc299956e 100644<br>--- a/include/uapi/drm/amdgpu_drm.h<br>+++ b/include/uapi/drm/amdgpu_drm.h<br>@@ -556,6 +556,8 @@ struct drm_amdgpu_gem_op {<br> #define AMDGPU_VM_MTYPE_RW<span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span>(5 << 5)<br> /* don't allocate MALL */<br> #define AMDGPU_VM_PAGE_NOALLOC<span class="Apple-tab-span" style="white-space: pre;">     </span><span class="Apple-tab-span" style="white-space: pre;">  </span>(1 << 9)<br>+/* don't sync on unmap */<br>+#define AMDGPU_VM_EXPLICIT_SYNC<span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span>(1 << 10)<br>   struct drm_amdgpu_gem_va {<br> <span class="Apple-tab-span" style="white-space: pre;">     </span>/** GEM 
object handle */</blockquote></div></blockquote></div><br></body></html>