[RFC PATCH 00/29] UMD direct submission in Xe
Matthew Brost
matthew.brost at intel.com
Mon Nov 18 23:37:28 UTC 2024
This is an RFC, or possibly even a proof of concept, for UMD (User Mode
Driver) direct submission in Xe. It is similar to AMD's design [1] [2]
or ARM's design [3], utilizing a uAPI to convert user-space syncs
(memory writes) to kernel-space syncs (DMA fences). It is built around
the existing Xe preemption fences for dynamic memory management, such as
userptr invalidation and buffer object (BO) eviction.
The series also enables mapping a PPGTT-bound submission ring in
non-privileged mode, as well as exposing indirect ring state (such as
ring head, tail, etc.) and the doorbell to user space, enabling UMD
direct submission.
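As a rough illustration of what direct submission means for user space, the sketch below models a ring, a tail pointer, and a doorbell in ordinary memory. Everything in it is hypothetical: `struct umd_queue`, `umd_queue_init()`, `umd_submit()`, and the 64-byte ring size are invented for the example and are not the Xe uAPI, the real indirect ring state layout, or the real doorbell protocol. It only shows the general order of operations (write commands, publish the tail, ring the doorbell).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout for illustration only; not the Xe uAPI. */
#define RING_SIZE 64u /* bytes, must be a power of two */

struct umd_queue {
	uint8_t ring[RING_SIZE];   /* stands in for the mmapped submission ring */
	_Atomic uint32_t tail;     /* stands in for the tail in indirect ring state */
	uint32_t head;             /* consumer offset, advanced by the "GPU" */
	_Atomic uint32_t doorbell; /* stands in for the mmapped doorbell */
};

static void umd_queue_init(struct umd_queue *q)
{
	memset(q->ring, 0, sizeof(q->ring));
	atomic_init(&q->tail, 0);
	atomic_init(&q->doorbell, 0);
	q->head = 0;
}

/*
 * Copy commands into the ring (handling wrap-around), publish the new
 * tail, then ring the doorbell. Returns 0 on success, -1 if the ring
 * does not have enough free space.
 */
static int umd_submit(struct umd_queue *q, const void *cmds, uint32_t len)
{
	assert(q && cmds);

	uint32_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
	/* One byte is kept unused so that head == tail always means "empty". */
	uint32_t free_bytes = (q->head - tail - 1u) & (RING_SIZE - 1u);

	if (len > free_bytes)
		return -1;

	uint32_t first = RING_SIZE - tail;
	if (first > len)
		first = len;

	memcpy(q->ring + tail, cmds, first);
	memcpy(q->ring, (const uint8_t *)cmds + first, len - first);

	/* Release ordering: the commands must be visible before the new tail. */
	atomic_store_explicit(&q->tail, (tail + len) & (RING_SIZE - 1u),
			      memory_order_release);
	/* The doorbell write is modelled as a simple counter bump. */
	atomic_fetch_add_explicit(&q->doorbell, 1u, memory_order_release);
	return 0;
}
```

In this model no system call is involved in the submission path, which is the property the series is after; how the hardware and GuC actually consume the tail and doorbell is outside the scope of the sketch.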
The target for this series is Mesa, with the goal of enabling UMD direct
submission and removing the submission thread that currently handles
future fences. I've discussed this with Sima and the Intel Mesa team,
and it seems like a reachable target. Most synchronization will be
handled in user space via memory writes and semaphore wait ring
instructions, with only legacy cross-process synchronization (e.g.,
compositors) requiring kernel synchronization (DMA fences).
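To make "synchronization via memory writes" concrete, here is a minimal user-space sketch of a user fence: a memory location plus a target value, considered signaled once the location holds a value at least as large as the target. The names `struct user_fence`, `user_fence_signal()`, `user_fence_signaled()`, and `user_fence_wait()` are hypothetical, the greater-or-equal comparison is an assumption made for the example, and sequence number wrap-around is ignored. On real hardware the wait would be a semaphore wait instruction in the ring rather than a CPU polling loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user fence: signaled once *addr >= value. */
struct user_fence {
	_Atomic uint64_t *addr; /* memory location written by the producer */
	uint64_t value;         /* sequence number the waiter needs to see */
};

/* Producer side: a plain memory write of the new sequence number. */
static void user_fence_signal(_Atomic uint64_t *addr, uint64_t seqno)
{
	atomic_store_explicit(addr, seqno, memory_order_release);
}

/* Consumer side: a single check of the fence condition. */
static bool user_fence_signaled(const struct user_fence *f)
{
	assert(f && f->addr);
	return atomic_load_explicit(f->addr, memory_order_acquire) >= f->value;
}

/*
 * Poll up to max_polls times. Returns true if the fence signaled and
 * false if the poll budget ran out. The bound is there so that a fence
 * which never signals cannot block the caller forever.
 */
static bool user_fence_wait(const struct user_fence *f, uint64_t max_polls)
{
	for (uint64_t i = 0; i < max_polls; i++) {
		if (user_fence_signaled(f))
			return true;
	}
	return false;
}
```

A fence like this cannot be handed to another process as-is, because nothing guarantees the write will ever happen. That is the gap the user-to-kernel sync conversion described in this series is meant to close for the legacy cross-process cases.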
The series includes some common patches at the beginning to implement
preemption fences and user fences. The idea of preemption
DMA-reservation slots [4] has been dropped in favor of attaching the
last exported DMA fence to the preemption fence as suggested by AMD.
This is a public checkpoint on the KMD (Kernel Mode Driver) work, which
will be put on hold until Intel's Mesa team has the bandwidth to begin
the UMD work. That said, the uAPI is very preliminary and likely to change.
One idea that was discussed is a common user fence interface based
around DRM syncobjs, which will likely be explored further as UMD
engagement begins. Some work for syncing VM binds (kernel operation)
with UMD direct submission is also likely required.
Testing has been done with [5], and the main features (basic
submission, dynamic memory management, user-to-kernel sync conversion,
and protection against endless user fences) are working on BMG and LNL.
The GitLab branch [6] has also been pushed for reference.
Any early community feedback is always appreciated.
Matt
[1] https://patchwork.freedesktop.org/series/113675/
[2] https://patchwork.freedesktop.org/series/114385/
[3] https://patchwork.freedesktop.org/series/137924/
[4] https://patchwork.freedesktop.org/series/141129/
[5] https://patchwork.freedesktop.org/series/141518/
[6] https://gitlab.freedesktop.org/mbrost/xe-kernel-driver-umd-submission-post/-/tree/post-11-18-24?ref_type=heads
Matthew Brost (28):
dma-fence: Add dma_fence_preempt base class
dma-fence: Add dma_fence_user_fence
drm/xe: Use dma_fence_preempt base class
drm/xe: Allocate doorbells for UMD exec queues
drm/xe: Add doorbell ID to snapshot capture
drm/xe: Break submission ring out into its own BO
drm/xe: Break indirect ring state out into its own BO
drm/xe: Clear GGTT in xe_bo_restore_kernel
FIXME: drm/xe: Add pad to ring and indirect state
drm/xe: Enable indirect ring on media GT
drm/xe: Don't add pinned mappings to VM bulk move
drm/xe: Add exec queue post init extension processing
drm/xe: Add support for mmapping doorbells to user space
drm/xe: Add support for mmapping submission ring and indirect ring
state to user space
drm/xe/uapi: Define UMD exec queue mapping uAPI
drm/xe: Add usermap exec queue extension
drm/xe: Drop EXEC_QUEUE_FLAG_UMD_SUBMISSION flag
drm/xe: Do not allow usermap exec queues in exec IOCTL
drm/xe: Teach GuC backend to kill usermap queues
drm/xe: Enable preempt fences on usermap queues
drm/xe/uapi: Add uAPI to convert user semaphore to / from drm syncobj
drm/xe: Add user fence IRQ handler
drm/xe: Add xe_hw_fence_user_init
drm/xe: Add a message lock to the Xe GPU scheduler
drm/xe: Always wait on preempt fences in vma_check_userptr
drm/xe: Teach xe_sync layer about drm_xe_semaphore
drm/xe: Add VM convert fence IOCTL
drm/xe: Add user fence TDR
Tejas Upadhyay (1):
drm/xe/mmap: Add mmap support for PCI memory barrier
drivers/dma-buf/Makefile | 2 +-
drivers/dma-buf/dma-fence-preempt.c | 134 ++++++
drivers/dma-buf/dma-fence-user-fence.c | 73 ++++
drivers/gpu/drm/xe/xe_bo.c | 29 +-
drivers/gpu/drm/xe/xe_bo.h | 5 +
drivers/gpu/drm/xe/xe_bo_evict.c | 8 +-
drivers/gpu/drm/xe/xe_device.c | 181 +++++++-
drivers/gpu/drm/xe/xe_device_types.h | 3 +
drivers/gpu/drm/xe/xe_exec.c | 3 +-
drivers/gpu/drm/xe/xe_exec_queue.c | 175 +++++++-
drivers/gpu/drm/xe/xe_exec_queue.h | 5 +
drivers/gpu/drm/xe/xe_exec_queue_types.h | 13 +
drivers/gpu/drm/xe/xe_execlist.c | 2 +-
drivers/gpu/drm/xe/xe_ggtt.c | 19 +-
drivers/gpu/drm/xe/xe_ggtt.h | 2 +
drivers/gpu/drm/xe/xe_gpu_scheduler.c | 19 +-
drivers/gpu/drm/xe/xe_gpu_scheduler.h | 12 +-
drivers/gpu/drm/xe/xe_gpu_scheduler_types.h | 2 +
drivers/gpu/drm/xe/xe_guc_exec_queue_types.h | 9 +-
drivers/gpu/drm/xe/xe_guc_submit.c | 177 +++++++-
drivers/gpu/drm/xe/xe_guc_submit_types.h | 2 +
drivers/gpu/drm/xe/xe_hw_engine.c | 4 +-
drivers/gpu/drm/xe/xe_hw_engine_group.c | 4 +-
drivers/gpu/drm/xe/xe_hw_fence.c | 17 +
drivers/gpu/drm/xe/xe_hw_fence.h | 3 +
drivers/gpu/drm/xe/xe_lrc.c | 176 ++++++--
drivers/gpu/drm/xe/xe_lrc.h | 4 +-
drivers/gpu/drm/xe/xe_lrc_types.h | 16 +-
drivers/gpu/drm/xe/xe_pci.c | 1 +
drivers/gpu/drm/xe/xe_preempt_fence.c | 89 ++--
drivers/gpu/drm/xe/xe_preempt_fence.h | 2 +-
drivers/gpu/drm/xe/xe_preempt_fence_types.h | 11 +-
drivers/gpu/drm/xe/xe_pt.c | 5 +-
drivers/gpu/drm/xe/xe_sync.c | 90 ++++
drivers/gpu/drm/xe/xe_sync.h | 8 +
drivers/gpu/drm/xe/xe_sync_types.h | 5 +-
drivers/gpu/drm/xe/xe_vm.c | 423 ++++++++++++++++++-
drivers/gpu/drm/xe/xe_vm.h | 4 +-
drivers/gpu/drm/xe/xe_vm_types.h | 26 ++
include/linux/dma-fence-preempt.h | 56 +++
include/linux/dma-fence-user-fence.h | 31 ++
include/uapi/drm/xe_drm.h | 147 ++++++-
42 files changed, 1798 insertions(+), 199 deletions(-)
create mode 100644 drivers/dma-buf/dma-fence-preempt.c
create mode 100644 drivers/dma-buf/dma-fence-user-fence.c
create mode 100644 include/linux/dma-fence-preempt.h
create mode 100644 include/linux/dma-fence-user-fence.h
--
2.34.1