[RFC PATCH 00/28] Introduce GPU SVM and Xe SVM implementation
Matthew Brost
matthew.brost at intel.com
Wed Aug 28 02:48:33 UTC 2024
Continuation of SVM work by Oak Zeng [1][2] based on community feedback.
Introduces GPU SVM layer and new Xe uAPI. Supports GPU page faults for
system allocations (e.g., malloc), runtime allocations (e.g., binding a
BO), migration to and from VRAM, and unified eviction (BO and SVM VRAM
allocations can evict each other). Fully tested; more on this below.
The patch breakdown is as follows:
1. Preparation patches already on the list [3].
- Patches 1-3.
- Please refrain from reviewing these here.
2. New migrate layer functionality
- Patch 4.
- Required for eviction to avoid locking inversion between
dma-resv and mmap lock.
3. GPU SVM.
- Patch 5.
- This is what needs community review.
- Inspired by GPUVM.
- Kernel doc should explain design principles.
- There is certainly room for optimization of the implementation
and improvements with existing core MM interaction. Pulling in
pending DMA mapping work [4] and additional core MM support
for SVM is also likely desired. However, this serves as a good
starting point for any SVM discussions and could be used as a
stepping stone to future core MM work.
3. Basic SVM support in Xe (i.e., SRAM backing only).
- Patches 6-15.
- The uAPI in the patch could benefit from community input.
4. SVM VRAM migration support in Xe.
- Patches 16-23.
- Using TMM BOs for SVM VRAM allocations could use community
input. Patch 23 has a detailed explaination of this design
choice in the commit message.
5. SVM eviction support in Xe.
- Patch 24.
- Should work with exhaustive eviction [5] when it merges.
6. Xe SVM debug / tuning.
- Patch 25-28.
Kernel documentation and commit messages are relatively light, aside
from GPU SVM and uAPI patches as this is an RFC.
Testing has been conducted quite thoroughly with new IGT [6]. Various
system allocation types (malloc, mmap, mmap flags, huge pages, different
sizes, different alignments), mixing runtime allocations, unmapping
corners, invalid faults, and eviction have been tested. Testing scales
from single thread to multiple threads and multiple processes. Tests
pass on LNL, BMG, PVC 1 tile, and PVC 2 tile.
1. Multiple GPU support.
- This is likely to follow or occur in parallel to this work.
2. Userptr unification with GPU SVM.
- This is essentially designed in my head (likely involving a
few new GPU SVM layer functions) but would require some fairly
invasive changes to Xe KMD to test out. Therefore, I would
like GPU SVM to be reviewed first before proceeding with these
changes.
3. Madvise and prefetch IOCTLs
- This is likely to follow or occur in parallel to this work.
Given the size of the series, I have pushed a GitLab branch for
reference [7].
Matt
[1] https://patchwork.freedesktop.org/series/128910/
[2] https://patchwork.freedesktop.org/series/132229/
[3] https://patchwork.freedesktop.org/series/137805/
[4] https://lore.kernel.org/linux-rdma/cover.1709635535.git.leon@kernel.org/
[5] https://patchwork.freedesktop.org/series/133643/
[6] https://patchwork.freedesktop.org/patch/610942/?series=137545&rev=2
[7] https://gitlab.freedesktop.org/mbrost/xe-kernel-driver-svm-post-8-27-24/-/tree/post?ref_type=heads
Matthew Brost (28):
dma-buf: Split out dma fence array create into alloc and arm functions
drm/xe: Invalidate media_gt TLBs in PT code
drm/xe: Retry BO allocation
mm/migrate: Add migrate_device_vma_range
drm/gpusvm: Add support for GPU Shared Virtual Memory
drm/xe/uapi: Add DRM_XE_VM_BIND_FLAG_SYSTEM_ALLOCATON flag
drm/xe: Add SVM init / fini to faulting VMs
drm/xe: Add dma_addr res cursor
drm/xe: Add SVM range invalidation
drm/gpuvm: Add DRM_GPUVA_OP_USER
drm/xe: Add (re)bind to SVM page fault handler
drm/xe: Add SVM garbage collector
drm/xe: Add unbind to SVM garbage collector
drm/xe: Do not allow system allocator VMA unbind if the GPU has
bindings
drm/xe: Enable system allocator uAPI
drm/xe: Add migrate layer functions for SVM support
drm/xe: Add SVM device memory mirroring
drm/xe: Add GPUSVM copy SRAM / VRAM vfunc functions
drm/xe: Update PT layer to understand ranges in VRAM
drm/xe: Add Xe SVM populate_vram_pfn vfunc
drm/xe: Add Xe SVM vram_release vfunc
drm/xe: Add BO flags required for SVM
drm/xe: Add SVM VRAM migration
drm/xe: Basic SVM BO eviction
drm/xe: Add SVM debug
drm/xe: Add modparam for SVM notifier size
drm/xe: Add modparam for SVM prefault
drm/gpusvm: Ensure all pages migrated upon eviction
drivers/dma-buf/dma-fence-array.c | 78 +-
drivers/gpu/drm/xe/Makefile | 4 +-
drivers/gpu/drm/xe/drm_gpusvm.c | 2213 ++++++++++++++++++++++++++
drivers/gpu/drm/xe/drm_gpusvm.h | 415 +++++
drivers/gpu/drm/xe/xe_bo.c | 54 +-
drivers/gpu/drm/xe/xe_bo.h | 2 +
drivers/gpu/drm/xe/xe_bo_types.h | 3 +
drivers/gpu/drm/xe/xe_device_types.h | 8 +
drivers/gpu/drm/xe/xe_gt_pagefault.c | 17 +-
drivers/gpu/drm/xe/xe_migrate.c | 150 ++
drivers/gpu/drm/xe/xe_migrate.h | 10 +
drivers/gpu/drm/xe/xe_module.c | 7 +
drivers/gpu/drm/xe/xe_module.h | 2 +
drivers/gpu/drm/xe/xe_pt.c | 456 +++++-
drivers/gpu/drm/xe/xe_pt.h | 3 +
drivers/gpu/drm/xe/xe_pt_types.h | 2 +
drivers/gpu/drm/xe/xe_res_cursor.h | 50 +-
drivers/gpu/drm/xe/xe_svm.c | 775 +++++++++
drivers/gpu/drm/xe/xe_svm.h | 70 +
drivers/gpu/drm/xe/xe_tile.c | 5 +
drivers/gpu/drm/xe/xe_vm.c | 286 +++-
drivers/gpu/drm/xe/xe_vm.h | 15 +-
drivers/gpu/drm/xe/xe_vm_types.h | 44 +
include/drm/drm_gpuvm.h | 5 +
include/linux/dma-fence-array.h | 6 +
include/linux/migrate.h | 3 +
include/uapi/drm/xe_drm.h | 19 +-
mm/migrate_device.c | 53 +
28 files changed, 4615 insertions(+), 140 deletions(-)
create mode 100644 drivers/gpu/drm/xe/drm_gpusvm.c
create mode 100644 drivers/gpu/drm/xe/drm_gpusvm.h
create mode 100644 drivers/gpu/drm/xe/xe_svm.c
create mode 100644 drivers/gpu/drm/xe/xe_svm.h
--
2.34.1
More information about the dri-devel
mailing list