[RFC v1 0/9] Parallel submission of dma fence jobs and LR jobs with shared hardware resources
Francois Dugast
francois.dugast at intel.com
Wed Jul 17 13:07:21 UTC 2024
Currently the Xe KMD requires that either all VMs on the device are
page-faulting VMs, or none of them are. This restriction ensures that a
page-faulting workload never waits for a dma-fence in the fault handler: the
pending page fault would hold the execution resources, so the dma-fence would
never signal, resulting in a deadlock.

This limitation in the driver prevents mixing dma-fence jobs and long-running
faulting jobs, for example when an application submits 3D jobs for the
compositor as well as SVM compute jobs on the same device. To safely lift this
restriction, this series introduces a finer-grained approach.

Hardware engines that share resources and would block each other are assigned
to the same hardware engine group. The group enforces mutual exclusion between
dma-fence jobs and long-running jobs executing on the shared hardware
resources.

If a long-running job is executing when a dma-fence job is submitted, the
long-running job is preempted, the dma-fence job executes, and the long-running
job is then resumed. If a dma-fence job is executing when a long-running job is
submitted, we wait for the dma-fence job to complete before executing the
long-running job.

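The two transitions above can be modeled in a few lines of userspace C. This
is only a sketch of the policy as described in this cover letter, not the
driver code: the struct, the enum and all function names are hypothetical,
jobs are reduced to counters, and the blocking wait is modeled as a false
return value that the caller is expected to retry on.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one hw engine group; not the Xe driver's structures. */
enum exec_mode { MODE_DMA_FENCE, MODE_LR };

struct engine_group {
	enum exec_mode mode;	/* which kind of job currently owns the hardware */
	int dma_fence_active;	/* dma-fence jobs submitted and not yet completed */
	int lr_running;		/* long-running jobs currently executing */
	int lr_suspended;	/* long-running jobs preempted by dma-fence work */
};

void group_init(struct engine_group *g)
{
	g->mode = MODE_DMA_FENCE;
	g->dma_fence_active = 0;
	g->lr_running = 0;
	g->lr_suspended = 0;
}

/* A dma-fence job always gets in: running LR jobs are preempted first. */
void submit_dma_fence_job(struct engine_group *g)
{
	if (g->mode == MODE_LR) {
		g->lr_suspended += g->lr_running;
		g->lr_running = 0;
		g->mode = MODE_DMA_FENCE;
	}
	g->dma_fence_active++;
}

/* When the last dma-fence job completes, resume any preempted LR jobs. */
void complete_dma_fence_job(struct engine_group *g)
{
	assert(g->dma_fence_active > 0);
	g->dma_fence_active--;
	if (g->dma_fence_active == 0 && g->lr_suspended > 0) {
		g->lr_running = g->lr_suspended;
		g->lr_suspended = 0;
		g->mode = MODE_LR;
	}
}

/*
 * An LR job may only start once no dma-fence job is executing. Returns
 * false where a real implementation would block; the caller retries later.
 */
bool submit_lr_job(struct engine_group *g)
{
	if (g->dma_fence_active > 0)
		return false;
	g->mode = MODE_LR;
	g->lr_running++;
	return true;
}
```

In this model, submitting an LR job, then a dma-fence job, then completing the
dma-fence job takes the group from LR mode to dma-fence mode and back, and at
no point are dma_fence_active and lr_running both non-zero.
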
This has been tested on PVC with new IGT tests [1].
[1] https://patchwork.freedesktop.org/series/136191/
Francois Dugast (9):
  drm/xe/hw_engine_group: Introduce xe_hw_engine_group
  drm/xe/exec_queue: Add list link for the hw engine group
  drm/xe/hw_engine_group: Register hw engine group's exec queues
  drm/xe/hw_engine_group: Add helper to suspend LR jobs
  drm/xe/hw_engine_group: Add helper to wait for dma fence jobs
  drm/xe/hw_engine_group: Ensure safe transition between execution modes
  drm/xe/exec: Switch hw engine group execution mode upon job submission
  drm/xe/hw_engine_group: Resume LR exec queues suspended by dma fence
    jobs
  drm/xe/vm: Remove restriction that all VMs must be faulting if one is

 drivers/gpu/drm/xe/xe_device.h           |  10 -
 drivers/gpu/drm/xe/xe_exec.c             |  14 +-
 drivers/gpu/drm/xe/xe_exec_queue.c       |   7 +
 drivers/gpu/drm/xe/xe_exec_queue_types.h |   2 +
 drivers/gpu/drm/xe/xe_hw_engine.c        | 256 +++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_hw_engine.h        |   9 +
 drivers/gpu/drm/xe/xe_hw_engine_types.h  |  31 +++
 drivers/gpu/drm/xe/xe_vm.c               |   8 -
 8 files changed, 318 insertions(+), 19 deletions(-)
--
2.43.0