[PATCH 00/34] Add HMM-based SVM memory manager to KFD v4

Felix Kuehling Felix.Kuehling at amd.com
Tue Apr 6 01:45:55 UTC 2021


Rebased on upstream. Dropped the patch
"drm/amdgpu: reserve fence slot to update page table", which is already
upstream.

Added more fixes:
- Fixed kernel test robot warnings about static functions
- Fixed a kernel test robot warning about an unused variable
- Fixed a kernel test robot warning about select DEVICE_PRIVATE.
  Using "depends on" now. (see patch 34)
- Proportionally longer timeout for hmm_range_fault on large address ranges
  (see patch 6)
- Fixed PTE flags for XGMI mappings on Arcturus and Aldebaran (see patch 17)
- Fixed range-list cleanup on process termination to avoid BUGs from dangling
  interval notifiers (see patch 16)
- Fixed SVM range locking and interval notifier sequence update
  (see patch 8 and related tweaks in patches 10, 11, 21)
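
To illustrate the hmm_range_fault timeout item above: a minimal sketch of
scaling a timeout with the size of the address range. This is not the code
from patch 6. The function name, the base timeout, and the chunk size are
hypothetical values chosen for the example, and it is written as plain
userspace C rather than kernel code.

```c
#include <stdint.h>

/* Hypothetical values for illustration only; not taken from the patch. */
#define BASE_TIMEOUT_MS  1000ULL        /* assumed budget for one chunk */
#define PAGES_PER_CHUNK  (1ULL << 17)   /* 512 MiB worth of 4 KiB pages */

/*
 * Return a timeout in milliseconds that grows with the number of pages
 * in the range. Ranges smaller than one chunk still get the full base
 * timeout, so small ranges behave as with a fixed timeout.
 */
static uint64_t scaled_timeout_ms(uint64_t npages)
{
	uint64_t chunks = npages / PAGES_PER_CHUNK;

	if (chunks < 1)
		chunks = 1;

	return chunks * BASE_TIMEOUT_MS;
}
```

With these assumed constants, a range of 2^18 pages (1 GiB) gets twice the
base timeout, while any range below 2^17 pages keeps the base timeout.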

Added my Reviewed-by to all patches primarily authored by Philip and Alex.
I believe this patch series is nearly ready to go.

This series and the corresponding ROCm Thunk and KFDTest changes are also
available on GitHub and patchwork.

Link: https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/tree/fxkamd/hmm-wip
Link: https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/tree/fxkamd/hmm-wip
Link: https://patchwork.freedesktop.org/series/85563/
CC: Jérôme Glisse <jglisse at redhat.com>
CC: Jason Gunthorpe <jgg at ziepe.ca>

Alex Sierra (9):
  drm/amdkfd: helper to convert gpu id and idx
  drm/amdkfd: add xnack enabled flag to kfd_process
  drm/amdkfd: add ioctl to configure and query xnack retries
  drm/amdgpu: enable 48-bit IH timestamp counter
  drm/amdkfd: SVM API call to restore page tables
  drm/amdkfd: add svm_bo reference for eviction fence
  drm/amdgpu: add param bit flag to create SVM BOs
  drm/amdgpu: svm bo enable_signal call condition
  drm/amdgpu: add svm_bo eviction to enable_signal cb

Felix Kuehling (13):
  drm/amdkfd: map svm range to GPUs
  drm/amdkfd: svm range eviction and restore
  drm/amdgpu: Enable retry faults unconditionally on Aldebaran
  drm/amdkfd: validate vram svm range from TTM
  drm/amdkfd: HMM migrate ram to vram
  drm/amdkfd: HMM migrate vram to ram
  drm/amdkfd: invalidate tables on page retry fault
  drm/amdkfd: page table restore through svm API
  drm/amdkfd: add svm_bo eviction mechanism support
  drm/amdkfd: refine migration policy with xnack on
  drm/amdkfd: add svm range validate timestamp
  drm/amdkfd: multiple gpu migrate vram to vram
  drm/amdkfd: Add CONFIG_HSA_AMD_SVM

Philip Yang (12):
  drm/amdkfd: add svm ioctl API
  drm/amdkfd: register svm range
  drm/amdkfd: add svm ioctl GET_ATTR op
  drm/amdgpu: add common HMM get pages function
  drm/amdkfd: support larger svm range allocation
  drm/amdkfd: validate svm range system memory
  drm/amdkfd: deregister svm range
  drm/amdgpu: export vm update mapping interface
  drm/amdkfd: register HMM device private zone
  drm/amdkfd: support xgmi same hive mapping
  drm/amdkfd: copy memory through gart table
  drm/amdkfd: Add SVM API support capability bits

 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h    |    4 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c  |   16 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   13 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c       |    3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c        |   86 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h        |    7 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h    |    4 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c       |   90 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |   38 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h        |   11 +
 drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c      |    8 +-
 drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c       |    6 +-
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c        |    1 +
 drivers/gpu/drm/amd/amdkfd/Kconfig            |   13 +
 drivers/gpu/drm/amd/amdkfd/Makefile           |    5 +
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c      |   64 +
 drivers/gpu/drm/amd/amdkfd/kfd_device.c       |    4 +
 .../amd/amdkfd/kfd_device_queue_manager_v9.c  |   13 +-
 drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c  |    4 +
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c      |  922 ++++++
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.h      |   64 +
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h         |   36 +
 drivers/gpu/drm/amd/amdkfd/kfd_process.c      |   82 +
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c          | 2906 +++++++++++++++++
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h          |  205 ++
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c     |    6 +
 drivers/gpu/drm/amd/amdkfd/kfd_topology.h     |   10 +-
 include/uapi/linux/kfd_ioctl.h                |  171 +-
 28 files changed, 4686 insertions(+), 106 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
 create mode 100644 drivers/gpu/drm/amd/amdkfd/kfd_migrate.h
 create mode 100644 drivers/gpu/drm/amd/amdkfd/kfd_svm.c
 create mode 100644 drivers/gpu/drm/amd/amdkfd/kfd_svm.h

-- 
2.31.1
