[PATCH v4 00/40] drm/msm: sparse / "VM_BIND" support

Rob Clark robdclark at gmail.com
Wed May 14 17:13:22 UTC 2025


hmm, looks like git-send-email died with a TLS error a quarter of the
way thru this series.. I'll try to resend later

BR,
-R

On Wed, May 14, 2025 at 10:03 AM Rob Clark <robdclark at gmail.com> wrote:
>
> From: Rob Clark <robdclark at chromium.org>
>
> Conversion to DRM GPU VA Manager[1], and adding support for Vulkan Sparse
> Memory[2] in the form of:
>
> 1. A new VM_BIND submitqueue type for executing VM MSM_SUBMIT_BO_OP_MAP/
>    MAP_NULL/UNMAP commands
>
> 2. A new VM_BIND ioctl to allow submitting batches of one or more
>    MAP/MAP_NULL/UNMAP commands to a VM_BIND submitqueue
>
> I did not implement support for synchronous VM_BIND commands.  Since
> userspace could just immediately wait for the `SUBMIT` to complete, I don't
> think we need this extra complexity in the kernel.  Synchronous/immediate
> VM_BIND operations could be implemented with a 2nd VM_BIND submitqueue.
>
> The corresponding mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533
>
> Changes in v4:
> - Various locking/etc fixes
> - Optimize the pgtable preallocation.  If userspace sorts the VM_BIND ops
>   then the kernel detects ops that fall into the same 2MB last level PTD
>   to avoid duplicate page preallocation.
> - Add way to throttle pushing jobs to the scheduler, to cap the amount of
>   potentially temporary prealloc'd pgtable pages.
> - Add vm_log to devcoredump for debugging.  If the vm_log_shift module
>   param is set, keep a log of the last 1<<vm_log_shift VM updates for
>   easier debugging of faults/crashes.
> - Link to v3: https://lore.kernel.org/all/20250428205619.227835-1-robdclark@gmail.com/
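
To illustrate the pgtable preallocation point in the v4 changes, here is a
minimal sketch of why sorted ops make duplicate detection cheap.  It is not
the driver code: struct bind_op and count_last_level_tables() are
hypothetical names, only last-level tables are counted (upper levels are
ignored), and the ops are assumed to be sorted by ascending address,
non-overlapping, and non-empty.  The only figure taken from the cover
letter is the 2MB span of a last-level PTD.

```c
#include <stddef.h>
#include <stdint.h>

/* VA span covered by one last-level page table, per the cover letter. */
#define PTD_SPAN (2ull << 20)

/* Hypothetical op description, not the msm UAPI struct. */
struct bind_op {
	uint64_t iova;   /* start of the VA range */
	uint64_t range;  /* length in bytes, assumed nonzero */
};

/*
 * Count the last-level tables that need a preallocated page for a batch
 * of ops sorted by ascending iova.  Because the ops are sorted, it is
 * enough to remember the highest 2MB slot already counted and skip any
 * slots at or below it.
 */
static size_t count_last_level_tables(const struct bind_op *ops, size_t nr)
{
	size_t count = 0;
	int have_prev = 0;
	uint64_t prev_last = 0;  /* highest 2MB slot counted so far */

	for (size_t i = 0; i < nr; i++) {
		uint64_t first = ops[i].iova / PTD_SPAN;
		uint64_t last = (ops[i].iova + ops[i].range - 1) / PTD_SPAN;

		/* Skip slots that an earlier op already accounted for. */
		if (have_prev && first <= prev_last)
			first = prev_last + 1;

		if (first <= last)
			count += last - first + 1;

		if (!have_prev || last > prev_last)
			prev_last = last;
		have_prev = 1;
	}

	return count;
}
```

With unsorted ops the same check would need a lookup structure instead of a
single remembered slot, which is presumably why the optimization is tied to
userspace sorting the ops.
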
>
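
The vm_log item in the v4 changes describes keeping the last
1<<vm_log_shift VM updates.  A power-of-two ring buffer is one
straightforward way to get that behaviour; the sketch below shows the idea
in userspace C.  The struct layout and the vm_log_*() helper names are
assumptions made for illustration, and the cover letter does not say how
the driver stores its entries.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical log entry: which op touched which VA range. */
struct vm_log_entry {
	const char *op;
	uint64_t iova;
	uint64_t range;
};

struct vm_log {
	struct vm_log_entry *entries;
	unsigned shift;   /* log holds 1 << shift entries */
	uint64_t count;   /* total number of updates ever logged */
};

/* Allocate storage for 1 << shift entries; returns 0 on success. */
static int vm_log_init(struct vm_log *log, unsigned shift)
{
	log->entries = calloc((size_t)1 << shift, sizeof(*log->entries));
	log->shift = shift;
	log->count = 0;
	return log->entries ? 0 : -1;
}

/* Record an update, overwriting the oldest entry once the log is full. */
static void vm_log_add(struct vm_log *log, const char *op,
		       uint64_t iova, uint64_t range)
{
	uint64_t mask = ((uint64_t)1 << log->shift) - 1;
	struct vm_log_entry *e = &log->entries[log->count++ & mask];

	e->op = op;
	e->iova = iova;
	e->range = range;
}

/* Number of entries currently retained. */
static uint64_t vm_log_len(const struct vm_log *log)
{
	uint64_t cap = (uint64_t)1 << log->shift;

	return log->count < cap ? log->count : cap;
}

/* Fetch a retained entry; index 0 is the oldest one still in the log. */
static const struct vm_log_entry *vm_log_get(const struct vm_log *log,
					     uint64_t i)
{
	uint64_t mask = ((uint64_t)1 << log->shift) - 1;
	uint64_t start = log->count - vm_log_len(log);

	return &log->entries[(start + i) & mask];
}
```

Using a shift rather than an arbitrary size keeps the wrap-around a single
mask operation, and walking from index 0 to vm_log_len() - 1 yields the
updates oldest first, which is the order one would want in a crash dump.
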
> Changes in v3:
> - Switched to separate VM_BIND ioctl.  This makes the UABI a bit
>   cleaner, but OTOH the userspace code was cleaner when the end result
>   of either type of VkQueue led to the same ioctl.  So I'm a bit on
>   the fence.
> - Switched to doing the gpuvm bookkeeping synchronously, and only
>   deferring the pgtable updates.  This avoids needing to hold any resv
>   locks in the fence signaling path, resolving the last shrinker related
>   lockdep complaints.  OTOH it means userspace can trigger invalid
>   pgtable updates with multiple VM_BIND queues.  In this case, we ensure
>   that unmaps happen completely (to prevent userspace from using this to
>   access free'd pages), mark the context as unusable, and move on with
>   life.
> - Link to v2: https://lore.kernel.org/all/20250319145425.51935-1-robdclark@gmail.com/
>
> Changes in v2:
> - Dropped Bibek Kumar Patro's arm-smmu patches[3], which have since been
>   merged.
> - Pre-allocate all the things, and drop HACK patch which disabled shrinker.
>   This includes ensuring that vm_bo objects are allocated up front, pre-
>   allocating VMA objects, and pre-allocating pages used for pgtable updates.
>   The latter utilizes io_pgtable_cfg callbacks for pgtable alloc/free, that
>   were initially added for panthor.
> - Add back support for BO dumping for devcoredump.
> - Link to v1 (RFC): https://lore.kernel.org/dri-devel/20241207161651.410556-1-robdclark@gmail.com/T/#t
>
> [1] https://www.kernel.org/doc/html/next/gpu/drm-mm.html#drm-gpuvm
> [2] https://docs.vulkan.org/spec/latest/chapters/sparsemem.html
> [3] https://patchwork.kernel.org/project/linux-arm-kernel/list/?series=909700
>
> Rob Clark (40):
>   drm/gpuvm: Don't require obj lock in destructor path
>   drm/gpuvm: Allow VAs to hold soft reference to BOs
>   drm/gem: Add ww_acquire_ctx support to drm_gem_lru_scan()
>   drm/sched: Add enqueue credit limit
>   iommu/io-pgtable-arm: Add quirk to quiet WARN_ON()
>   drm/msm: Rename msm_file_private -> msm_context
>   drm/msm: Improve msm_context comments
>   drm/msm: Rename msm_gem_address_space -> msm_gem_vm
>   drm/msm: Remove vram carveout support
>   drm/msm: Collapse vma allocation and initialization
>   drm/msm: Collapse vma close and delete
>   drm/msm: Don't close VMAs on purge
>   drm/msm: drm_gpuvm conversion
>   drm/msm: Convert vm locking
>   drm/msm: Use drm_gpuvm types more
>   drm/msm: Split out helper to get iommu prot flags
>   drm/msm: Add mmu support for non-zero offset
>   drm/msm: Add PRR support
>   drm/msm: Rename msm_gem_vma_purge() -> _unmap()
>   drm/msm: Drop queued submits on lastclose()
>   drm/msm: Lazily create context VM
>   drm/msm: Add opt-in for VM_BIND
>   drm/msm: Mark VM as unusable on GPU hangs
>   drm/msm: Add _NO_SHARE flag
>   drm/msm: Crashdump prep for sparse mappings
>   drm/msm: rd dumping prep for sparse mappings
>   drm/msm: Crashdec support for sparse
>   drm/msm: rd dumping support for sparse
>   drm/msm: Extract out syncobj helpers
>   drm/msm: Use DMA_RESV_USAGE_BOOKKEEP/KERNEL
>   drm/msm: Add VM_BIND submitqueue
>   drm/msm: Support IO_PGTABLE_QUIRK_NO_WARN_ON
>   drm/msm: Support pgtable preallocation
>   drm/msm: Split out map/unmap ops
>   drm/msm: Add VM_BIND ioctl
>   drm/msm: Add VM logging for VM_BIND updates
>   drm/msm: Add VMA unmap reason
>   drm/msm: Add mmu prealloc tracepoint
>   drm/msm: use trylock for debugfs
>   drm/msm: Bump UAPI version
>
>  drivers/gpu/drm/drm_gem.c                     |   14 +-
>  drivers/gpu/drm/drm_gpuvm.c                   |   15 +-
>  drivers/gpu/drm/msm/Kconfig                   |    1 +
>  drivers/gpu/drm/msm/Makefile                  |    1 +
>  drivers/gpu/drm/msm/adreno/a2xx_gpu.c         |   25 +-
>  drivers/gpu/drm/msm/adreno/a2xx_gpummu.c      |    5 +-
>  drivers/gpu/drm/msm/adreno/a3xx_gpu.c         |   17 +-
>  drivers/gpu/drm/msm/adreno/a4xx_gpu.c         |   17 +-
>  drivers/gpu/drm/msm/adreno/a5xx_debugfs.c     |    4 +-
>  drivers/gpu/drm/msm/adreno/a5xx_gpu.c         |   22 +-
>  drivers/gpu/drm/msm/adreno/a5xx_power.c       |    2 +-
>  drivers/gpu/drm/msm/adreno/a5xx_preempt.c     |   10 +-
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c         |   32 +-
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.h         |    2 +-
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c         |   49 +-
>  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c   |    6 +-
>  drivers/gpu/drm/msm/adreno/a6xx_preempt.c     |   10 +-
>  drivers/gpu/drm/msm/adreno/adreno_device.c    |    4 -
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c       |   99 +-
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h       |   23 +-
>  .../drm/msm/disp/dpu1/dpu_encoder_phys_wb.c   |   14 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c   |   18 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h   |    2 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c       |   18 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c     |   14 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h     |    4 +-
>  drivers/gpu/drm/msm/disp/mdp4/mdp4_crtc.c     |    6 +-
>  drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c      |   28 +-
>  drivers/gpu/drm/msm/disp/mdp4/mdp4_plane.c    |   12 +-
>  drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c     |    4 +-
>  drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c      |   19 +-
>  drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c    |   12 +-
>  drivers/gpu/drm/msm/dsi/dsi_host.c            |   14 +-
>  drivers/gpu/drm/msm/msm_drv.c                 |  184 +--
>  drivers/gpu/drm/msm/msm_drv.h                 |   35 +-
>  drivers/gpu/drm/msm/msm_fb.c                  |   18 +-
>  drivers/gpu/drm/msm/msm_fbdev.c               |    2 +-
>  drivers/gpu/drm/msm/msm_gem.c                 |  494 +++---
>  drivers/gpu/drm/msm/msm_gem.h                 |  247 ++-
>  drivers/gpu/drm/msm/msm_gem_prime.c           |   15 +
>  drivers/gpu/drm/msm/msm_gem_shrinker.c        |  104 +-
>  drivers/gpu/drm/msm/msm_gem_submit.c          |  295 ++--
>  drivers/gpu/drm/msm/msm_gem_vma.c             | 1471 ++++++++++++++++-
>  drivers/gpu/drm/msm/msm_gpu.c                 |  214 ++-
>  drivers/gpu/drm/msm/msm_gpu.h                 |  144 +-
>  drivers/gpu/drm/msm/msm_gpu_trace.h           |   14 +
>  drivers/gpu/drm/msm/msm_iommu.c               |  302 +++-
>  drivers/gpu/drm/msm/msm_kms.c                 |   18 +-
>  drivers/gpu/drm/msm/msm_kms.h                 |    2 +-
>  drivers/gpu/drm/msm/msm_mmu.h                 |   38 +-
>  drivers/gpu/drm/msm/msm_rd.c                  |   62 +-
>  drivers/gpu/drm/msm/msm_ringbuffer.c          |   10 +-
>  drivers/gpu/drm/msm/msm_submitqueue.c         |   96 +-
>  drivers/gpu/drm/msm/msm_syncobj.c             |  172 ++
>  drivers/gpu/drm/msm/msm_syncobj.h             |   37 +
>  drivers/gpu/drm/scheduler/sched_entity.c      |   16 +-
>  drivers/gpu/drm/scheduler/sched_main.c        |    3 +
>  drivers/iommu/io-pgtable-arm.c                |   27 +-
>  include/drm/drm_gem.h                         |   10 +-
>  include/drm/drm_gpuvm.h                       |   12 +-
>  include/drm/gpu_scheduler.h                   |   13 +-
>  include/linux/io-pgtable.h                    |    8 +
>  include/uapi/drm/msm_drm.h                    |  149 +-
>  63 files changed, 3484 insertions(+), 1251 deletions(-)
>  create mode 100644 drivers/gpu/drm/msm/msm_syncobj.c
>  create mode 100644 drivers/gpu/drm/msm/msm_syncobj.h
>
> --
> 2.49.0
>

