[PATCH v4 00/40] drm/msm: sparse / "VM_BIND" support
Rob Clark
robdclark at gmail.com
Wed May 14 17:13:22 UTC 2025
hmm, looks like git-send-email died with a TLS error a quarter of the
way thru this series.. I'll try to resend later
BR,
-R
On Wed, May 14, 2025 at 10:03 AM Rob Clark <robdclark at gmail.com> wrote:
>
> From: Rob Clark <robdclark at chromium.org>
>
> Conversion to DRM GPU VA Manager[1], and adding support for Vulkan Sparse
> Memory[2] in the form of:
>
> 1. A new VM_BIND submitqueue type for executing VM MSM_SUBMIT_BO_OP_MAP/
> MAP_NULL/UNMAP commands
>
> 2. A new VM_BIND ioctl to allow submitting batches of one or more
> MAP/MAP_NULL/UNMAP commands to a VM_BIND submitqueue
>
> I did not implement support for synchronous VM_BIND commands. Since
> userspace could just immediately wait for the `SUBMIT` to complete, I don't
> think we need this extra complexity in the kernel. Synchronous/immediate
> VM_BIND operations could be implemented with a 2nd VM_BIND submitqueue.
>
> The corresponding mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533
>
> Changes in v4:
> - Various locking/etc fixes
> - Optimize the pgtable preallocation. If userspace sorts the VM_BIND ops
> then the kernel detects ops that fall into the same 2MB last level PTD
> to avoid duplicate page preallocation.
> - Add way to throttle pushing jobs to the scheduler, to cap the amount of
> potentially temporary prealloc'd pgtable pages.
> - Add vm_log to devcoredump for debugging. If the vm_log_shift module
> param is set, keep a log of the last 1<<vm_log_shift VM updates for
> easier debugging of faults/crashes.
> - Link to v3: https://lore.kernel.org/all/20250428205619.227835-1-robdclark@gmail.com/
>
> Changes in v3:
> - Switched to seperate VM_BIND ioctl. This makes the UABI a bit
> cleaner, but OTOH the userspace code was cleaner when the end result
> of either type of VkQueue lead to the same ioctl. So I'm a bit on
> the fence.
> - Switched to doing the gpuvm bookkeeping synchronously, and only
> deferring the pgtable updates. This avoids needing to hold any resv
> locks in the fence signaling path, resolving the last shrinker related
> lockdep complaints. OTOH it means userspace can trigger invalid
> pgtable updates with multiple VM_BIND queues. In this case, we ensure
> that unmaps happen completely (to prevent userspace from using this to
> access free'd pages), mark the context as unusable, and move on with
> life.
> - Link to v2: https://lore.kernel.org/all/20250319145425.51935-1-robdclark@gmail.com/
>
> Changes in v2:
> - Dropped Bibek Kumar Patro's arm-smmu patches[3], which have since been
> merged.
> - Pre-allocate all the things, and drop HACK patch which disabled shrinker.
> This includes ensuring that vm_bo objects are allocated up front, pre-
> allocating VMA objects, and pre-allocating pages used for pgtable updates.
> The latter utilizes io_pgtable_cfg callbacks for pgtable alloc/free, that
> were initially added for panthor.
> - Add back support for BO dumping for devcoredump.
> - Link to v1 (RFC): https://lore.kernel.org/dri-devel/20241207161651.410556-1-robdclark@gmail.com/T/#t
>
> [1] https://www.kernel.org/doc/html/next/gpu/drm-mm.html#drm-gpuvm
> [2] https://docs.vulkan.org/spec/latest/chapters/sparsemem.html
> [3] https://patchwork.kernel.org/project/linux-arm-kernel/list/?series=909700
>
> Rob Clark (40):
> drm/gpuvm: Don't require obj lock in destructor path
> drm/gpuvm: Allow VAs to hold soft reference to BOs
> drm/gem: Add ww_acquire_ctx support to drm_gem_lru_scan()
> drm/sched: Add enqueue credit limit
> iommu/io-pgtable-arm: Add quirk to quiet WARN_ON()
> drm/msm: Rename msm_file_private -> msm_context
> drm/msm: Improve msm_context comments
> drm/msm: Rename msm_gem_address_space -> msm_gem_vm
> drm/msm: Remove vram carveout support
> drm/msm: Collapse vma allocation and initialization
> drm/msm: Collapse vma close and delete
> drm/msm: Don't close VMAs on purge
> drm/msm: drm_gpuvm conversion
> drm/msm: Convert vm locking
> drm/msm: Use drm_gpuvm types more
> drm/msm: Split out helper to get iommu prot flags
> drm/msm: Add mmu support for non-zero offset
> drm/msm: Add PRR support
> drm/msm: Rename msm_gem_vma_purge() -> _unmap()
> drm/msm: Drop queued submits on lastclose()
> drm/msm: Lazily create context VM
> drm/msm: Add opt-in for VM_BIND
> drm/msm: Mark VM as unusable on GPU hangs
> drm/msm: Add _NO_SHARE flag
> drm/msm: Crashdump prep for sparse mappings
> drm/msm: rd dumping prep for sparse mappings
> drm/msm: Crashdec support for sparse
> drm/msm: rd dumping support for sparse
> drm/msm: Extract out syncobj helpers
> drm/msm: Use DMA_RESV_USAGE_BOOKKEEP/KERNEL
> drm/msm: Add VM_BIND submitqueue
> drm/msm: Support IO_PGTABLE_QUIRK_NO_WARN_ON
> drm/msm: Support pgtable preallocation
> drm/msm: Split out map/unmap ops
> drm/msm: Add VM_BIND ioctl
> drm/msm: Add VM logging for VM_BIND updates
> drm/msm: Add VMA unmap reason
> drm/msm: Add mmu prealloc tracepoint
> drm/msm: use trylock for debugfs
> drm/msm: Bump UAPI version
>
> drivers/gpu/drm/drm_gem.c | 14 +-
> drivers/gpu/drm/drm_gpuvm.c | 15 +-
> drivers/gpu/drm/msm/Kconfig | 1 +
> drivers/gpu/drm/msm/Makefile | 1 +
> drivers/gpu/drm/msm/adreno/a2xx_gpu.c | 25 +-
> drivers/gpu/drm/msm/adreno/a2xx_gpummu.c | 5 +-
> drivers/gpu/drm/msm/adreno/a3xx_gpu.c | 17 +-
> drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 17 +-
> drivers/gpu/drm/msm/adreno/a5xx_debugfs.c | 4 +-
> drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 22 +-
> drivers/gpu/drm/msm/adreno/a5xx_power.c | 2 +-
> drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 10 +-
> drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 32 +-
> drivers/gpu/drm/msm/adreno/a6xx_gmu.h | 2 +-
> drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 49 +-
> drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 6 +-
> drivers/gpu/drm/msm/adreno/a6xx_preempt.c | 10 +-
> drivers/gpu/drm/msm/adreno/adreno_device.c | 4 -
> drivers/gpu/drm/msm/adreno/adreno_gpu.c | 99 +-
> drivers/gpu/drm/msm/adreno/adreno_gpu.h | 23 +-
> .../drm/msm/disp/dpu1/dpu_encoder_phys_wb.c | 14 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c | 18 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h | 2 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 18 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 14 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h | 4 +-
> drivers/gpu/drm/msm/disp/mdp4/mdp4_crtc.c | 6 +-
> drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c | 28 +-
> drivers/gpu/drm/msm/disp/mdp4/mdp4_plane.c | 12 +-
> drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c | 4 +-
> drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c | 19 +-
> drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c | 12 +-
> drivers/gpu/drm/msm/dsi/dsi_host.c | 14 +-
> drivers/gpu/drm/msm/msm_drv.c | 184 +--
> drivers/gpu/drm/msm/msm_drv.h | 35 +-
> drivers/gpu/drm/msm/msm_fb.c | 18 +-
> drivers/gpu/drm/msm/msm_fbdev.c | 2 +-
> drivers/gpu/drm/msm/msm_gem.c | 494 +++---
> drivers/gpu/drm/msm/msm_gem.h | 247 ++-
> drivers/gpu/drm/msm/msm_gem_prime.c | 15 +
> drivers/gpu/drm/msm/msm_gem_shrinker.c | 104 +-
> drivers/gpu/drm/msm/msm_gem_submit.c | 295 ++--
> drivers/gpu/drm/msm/msm_gem_vma.c | 1471 ++++++++++++++++-
> drivers/gpu/drm/msm/msm_gpu.c | 214 ++-
> drivers/gpu/drm/msm/msm_gpu.h | 144 +-
> drivers/gpu/drm/msm/msm_gpu_trace.h | 14 +
> drivers/gpu/drm/msm/msm_iommu.c | 302 +++-
> drivers/gpu/drm/msm/msm_kms.c | 18 +-
> drivers/gpu/drm/msm/msm_kms.h | 2 +-
> drivers/gpu/drm/msm/msm_mmu.h | 38 +-
> drivers/gpu/drm/msm/msm_rd.c | 62 +-
> drivers/gpu/drm/msm/msm_ringbuffer.c | 10 +-
> drivers/gpu/drm/msm/msm_submitqueue.c | 96 +-
> drivers/gpu/drm/msm/msm_syncobj.c | 172 ++
> drivers/gpu/drm/msm/msm_syncobj.h | 37 +
> drivers/gpu/drm/scheduler/sched_entity.c | 16 +-
> drivers/gpu/drm/scheduler/sched_main.c | 3 +
> drivers/iommu/io-pgtable-arm.c | 27 +-
> include/drm/drm_gem.h | 10 +-
> include/drm/drm_gpuvm.h | 12 +-
> include/drm/gpu_scheduler.h | 13 +-
> include/linux/io-pgtable.h | 8 +
> include/uapi/drm/msm_drm.h | 149 +-
> 63 files changed, 3484 insertions(+), 1251 deletions(-)
> create mode 100644 drivers/gpu/drm/msm/msm_syncobj.c
> create mode 100644 drivers/gpu/drm/msm/msm_syncobj.h
>
> --
> 2.49.0
>
More information about the dri-devel
mailing list