[PATCH v3 0/8] drm: Introduce sparse GEM shmem

Thu Apr 10 14:48:09 UTC 2025

+Christian, Alyssa and Faith, as suggested by Sima. I'll make sure to
Cc you on v4, but before that, I'd like to get your opinion on the
drm-gem/drm-gem-shmem changes to see if sending a v4 is actually
desirable or if I should go back to the drawing board.

On Fri,  4 Apr 2025 11:26:26 +0200
Boris Brezillon <boris.brezillon at collabora.com> wrote:

> This patch series is a proposal for implementing sparse page allocations
> for shmem objects. It was initially motivated by a kind of BO managed by
> the Panfrost/Panthor and Lima drivers, the tiler heap, which grows on
> demand every time the GPU faults on a virtual address within the heap BO
> range.
> 
> Because keeping a struct page pointer array that can describe the entire
> virtual range is wasteful when only a few backing pages have been
> allocated, we thought a sparse allocation approach with xarrays was a
> more efficient choice.
> 
> This sparse GEM shmem features takes the form of a new optional
> drm_gem_shmem_sparse_backing object that can be attached to a
> drm_gem_shmem_object at creation time if the driver wants. When this
> sparse GEM shmem backing mode is enabled, the driver is allow to
> partially populate the GEM pages, either at use time, or at fault
> time. The page population can be done progressively as the need for
> more memory arises. The new APIs takes explicit gfp_t flags to solve
> a long standing issue reported by Sima a while ago: all allocations
> happening in the fault handler path shouldn't block.
> 
> We also introduce two new helpers at the drm_gem.{c,h} level to
> automate the partial xarray population which the GEM-SHMEM logic
> relies on to populate its sparse page array.
> 
> A few details about the implementation now:
> 
> - Sparse GEM backing locking is deferred to the driver, because
>   we can't take the resv locks in the GPU fault handler path, and
>   since the page population can come from there, we have to let
>   the driver decide how to lock.
> - The pin/get semantics for sparse GEM shmem objects is different,
>   in that it doesn't populate the pages, but just flag them as
>   being used/required by some GPU component. The page population
>   will happen explicitly at GEM creation time or when a GPU fault
>   happens. Once pages have been populated, they won't disappear
>   until the GEM object is destroyed, purged or evicted. This means
>   you can grow, but not shrink. If we ever need to discard
>   BO ranges, the API can be extended to allow it, but it's not
>   something we needed for the Lima/Panthor/Panfrost case.
> - Panthor and Panfrost changes have been tested, but not extensively.
>   Lima changes have only been compile-tested.
> 
> Changes from v2:
> - We decided to focus more on the DRM aspects and forget about the
>   sgt building optimizations (sgt helpers taking an xarray instead of
>   a plain array). If the xarray -> array copies happening in that
>   path ever become the bottleneck, we can resurrect those changes.
> - We decided to make the sparse GEM changes less invasive by making
>   them an extra layer on top of drm_gem_shmem_object. What this means
>   is that sparse GEM shmem can now be used as regular GEM shmem
>   objects (vmap, pin, export, ... all work as they would on a regular
>   GEM). When that happens, we just force all pages to be populated,
>   so we kinda lose the benefit of sparse GEMs, but that's fine, because
>   in practice, such objects shouldn't be manipulated as regular GEMs.
>   It just serves as a safety guard to limit the risk of regressions
>   introduced by these sparse GEM shmem changes.
> 
> Discussion of previus revision can be found here[2][3].
> 
> For those used to review gitlab MRs, here's a link [1] to a Draft
> MR grouping those changes, but I'm in no way asking that the review
> happens on gitlab.
> 
> Regards,
> 
> Boris
> 
> [1]https://gitlab.freedesktop.org/panfrost/linux/-/merge_requests/16
> [2]https://lore.kernel.org/lkml/20250326021433.772196-1-adrian.larumbe@collabora.com/T/
> [3]https://lore.kernel.org/dri-devel/20250218232552.3450939-1-adrian.larumbe@collabora.com/
> 
> Adrián Larumbe (1):
>   drm/gem: Add helpers to request a range of pages on a GEM
> 
> Boris Brezillon (7):
>   drm/gem-shmem: Support sparse backing
>   drm/panfrost: Switch to sparse gem shmem to implement our
>     alloc-on-fault
>   drm/panthor: Add support for alloc-on-fault buffers
>   drm/panthor: Allow kernel BOs to pass DRM_PANTHOR_BO_ALLOC_ON_FAULT
>   drm/panthor: Add a panthor_vm_pre_fault_range() helper
>   drm/panthor: Make heap chunk allocation non-blocking
>   drm/lima: Use drm_gem_shmem_sparse_backing for heap buffers
> 
>  drivers/gpu/drm/drm_gem.c               | 134 +++++++
>  drivers/gpu/drm/drm_gem_shmem_helper.c  | 404 +++++++++++++++++++-
>  drivers/gpu/drm/lima/lima_gem.c         |  89 ++---
>  drivers/gpu/drm/lima/lima_gem.h         |   1 +
>  drivers/gpu/drm/lima/lima_vm.c          |  48 ++-
>  drivers/gpu/drm/panfrost/panfrost_drv.c |   2 +-
>  drivers/gpu/drm/panfrost/panfrost_gem.c |  37 +-
>  drivers/gpu/drm/panfrost/panfrost_gem.h |   8 +-
>  drivers/gpu/drm/panfrost/panfrost_mmu.c |  98 ++---
>  drivers/gpu/drm/panfrost/panfrost_mmu.h |   2 +
>  drivers/gpu/drm/panthor/panthor_drv.c   |  20 +-
>  drivers/gpu/drm/panthor/panthor_fw.c    |   6 +-
>  drivers/gpu/drm/panthor/panthor_gem.c   |  18 +-
>  drivers/gpu/drm/panthor/panthor_gem.h   |  12 +-
>  drivers/gpu/drm/panthor/panthor_heap.c  | 222 ++++++-----
>  drivers/gpu/drm/panthor/panthor_mmu.c   | 481 ++++++++++++++++++------
>  drivers/gpu/drm/panthor/panthor_mmu.h   |   2 +
>  drivers/gpu/drm/panthor/panthor_sched.c |   6 +-
>  include/drm/drm_gem.h                   |  14 +
>  include/drm/drm_gem_shmem_helper.h      | 285 +++++++++++++-
>  include/uapi/drm/panthor_drm.h          |  19 +-
>  21 files changed, 1492 insertions(+), 416 deletions(-)
>