[PATCH 00/15] Intel Xe GPU Debug Support (eudebug) v4
Mika Kuoppala
mika.kuoppala at linux.intel.com
Fri Aug 8 10:43:34 UTC 2025
Hi,
This is the v4 patch series for Intel Xe GPU debug support (eudebug).
This series continues from the following previous submissions:
- v1: https://lists.freedesktop.org/archives/intel-xe/2024-July/043605.html
- v2: https://lists.freedesktop.org/archives/intel-xe/2024-October/052260.html
- v3: https://lists.freedesktop.org/archives/intel-xe/2024-December/061476.html
This is a major cleanup and rework of the eudebug patch series to address the
feedback on v3. Page fault handling is omitted until we receive an
ack on the core design, as there was no feedback on it in previous iterations.
### Major Changes
#### 1. Elimination of ptrace_may_access() and pid
In the previous series, the connection attempt was made using the process ID
(PID) as the target. Access was checked using the `ptrace_may_access()`
helper to achieve security parity with CPU-side debugging.
In v4, this has been changed to connect to a DRM client, using a file
descriptor as the target. This approach eliminates the need for the
`ptrace_may_access()` symbol export, as access control is now managed
through the debugger process's access to the file descriptor. For example,
accessing a remote DRM client requires the debugger process to
successfully call `pidfd_getfd()` to obtain a duplicate of the target
file descriptor. The 1:1 mapping between DRM clients and their debuggers
eliminates the need for `EVENT_OPEN` and simplifies overall connection
tracking.
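As a rough illustration of the access path described above, the sketch below
shows how a debugger process might duplicate a file descriptor from a target
process with `pidfd_open()` and `pidfd_getfd()`. The helper name
`dup_remote_fd` is hypothetical, and the target fd number is assumed to be
known to the debugger already (for example from /proc/<pid>/fd). The eudebug
connect call itself is not shown, since its uAPI is defined by the patches and
not by this cover letter. The raw syscall numbers are only a fallback for
older libc headers.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallbacks for libc headers that predate these syscalls. */
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif
#ifndef SYS_pidfd_getfd
#define SYS_pidfd_getfd 438
#endif

/*
 * Hypothetical helper: duplicate file descriptor `target_fd` of process
 * `pid` into the calling process. The kernel only allows this if the
 * caller passes a ptrace-attach style permission check on the target,
 * so the permission check happens here and not in the GPU driver.
 *
 * Returns the new local fd, or -1 with errno set.
 */
static int dup_remote_fd(pid_t pid, int target_fd)
{
	int pidfd, local_fd, saved_errno;

	pidfd = (int)syscall(SYS_pidfd_open, pid, 0);
	if (pidfd < 0)
		return -1;

	local_fd = (int)syscall(SYS_pidfd_getfd, pidfd, target_fd, 0);

	/* Preserve the errno of pidfd_getfd() across close(). */
	saved_errno = errno;
	close(pidfd);
	errno = saved_errno;

	return local_fd;
}
```

A debugger would then pass the returned descriptor as the connection target.
If the call fails with EPERM, the debugger lacks the rights to inspect the
target process and no connection is possible.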
#### 2. ELF binaries not held in kernel memory
In v4, debug data is delivered as a VM bind 'OP_ADD_DEBUG_DATA' extension.
The ELF binaries are no longer stored within the Xe KMD but are instead
kept in a file. The file path is passed as part of an extension in
the newly introduced 'OP_ADD_DEBUG_DATA' VM bind operation. Alternatively,
pseudo-paths can be used to annotate special address ranges, similar to
/proc/<pid>/maps.
#### 3. Debug metadata not carried in VMA struct
Instead of attaching debug data to the VMA created by 'OP_MAP',
we introduce separate ops for managing the metadata.
Debug data is no longer held in the VMA struct. xe_vm contains a
list of all associated debug data.
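To make the data structure change concrete, here is a minimal userspace
sketch of a per-VM list of debug data entries that is managed independently
of any mapping. All names (`struct vm`, `struct debug_data`, the
`vm_*_debug_data` helpers) and the "[scratch]" pseudo-path are hypothetical
and only illustrate the idea; the actual structures and semantics, such as
overlap handling, are defined in xe_debug_data.c and may differ.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One annotated address range; the path is a file path or a pseudo-path. */
struct debug_data {
	uint64_t addr;
	uint64_t range;
	char *pathname;
	struct debug_data *next;
};

/* Stand-in for the VM: it owns the list, mappings know nothing about it. */
struct vm {
	struct debug_data *debug_data;
};

/* Add an entry. Returns 0 on success, -1 on bad arguments or no memory. */
static int vm_add_debug_data(struct vm *vm, uint64_t addr, uint64_t range,
			     const char *pathname)
{
	struct debug_data *dd;
	size_t len;

	if (!range || !pathname)
		return -1;

	dd = malloc(sizeof(*dd));
	if (!dd)
		return -1;

	len = strlen(pathname) + 1;
	dd->pathname = malloc(len);
	if (!dd->pathname) {
		free(dd);
		return -1;
	}
	memcpy(dd->pathname, pathname, len);

	dd->addr = addr;
	dd->range = range;
	dd->next = vm->debug_data;
	vm->debug_data = dd;
	return 0;
}

/* Remove the entry that exactly matches addr and range. */
static int vm_remove_debug_data(struct vm *vm, uint64_t addr, uint64_t range)
{
	struct debug_data **p;

	for (p = &vm->debug_data; *p; p = &(*p)->next) {
		struct debug_data *dd = *p;

		if (dd->addr == addr && dd->range == range) {
			*p = dd->next;
			free(dd->pathname);
			free(dd);
			return 0;
		}
	}
	return -1;
}

/* Return the path annotating addr, or NULL if no entry covers it. */
static const char *vm_find_debug_data(const struct vm *vm, uint64_t addr)
{
	const struct debug_data *dd;

	for (dd = vm->debug_data; dd; dd = dd->next)
		if (addr >= dd->addr && addr - dd->addr < dd->range)
			return dd->pathname;
	return NULL;
}
```

Because the entries live on the VM and not on individual mappings, adding or
removing an annotation in this sketch never touches a mapping, which mirrors
the separation between 'OP_MAP' and the debug data ops described above.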
### Supported Hardware with v4
- Lunarlake (LNL)
- Battlemage (BMG)
- Pantherlake (PTL)
The code for this submission can be found at:
https://gitlab.freedesktop.org/miku/kernel/-/tree/eudebug-v4
Christoph Manszewski (5):
drm/xe: Introduce ADD_DEBUG_DATA and REMOVE_DEBUG_DATA vm bind ops
drm/xe/eudebug: Introduce vm bind and vm bind debug data events
drm/xe/eudebug_test: Introduce xe_eudebug wa kunit test
drm/xe: Implement SR-IOV and eudebug exclusivity
drm/xe: Add xe_client_debugfs and introduce debug_data file
Dominik Grzegorzek (5):
drm/xe/eudebug: Introduce exec_queue events
drm/xe: Add EUDEBUG_ENABLE exec queue property
drm/xe/eudebug: hw enablement for eudebug
drm/xe/eudebug: Introduce EU control interface
drm/xe/eudebug: Introduce per device attention scan worker
Mika Kuoppala (5):
drm/xe/eudebug: Introduce eudebug interface
drm/xe/eudebug: Introduce discovery for resources
drm/xe/eudebug: Add UFENCE events with acks
drm/xe/eudebug: vm open/pread/pwrite
drm/xe/eudebug: userptr vm pread/pwrite
drivers/gpu/drm/xe/Kconfig | 10 +
drivers/gpu/drm/xe/Makefile | 7 +-
drivers/gpu/drm/xe/regs/xe_engine_regs.h | 7 +
drivers/gpu/drm/xe/regs/xe_gt_regs.h | 43 +
drivers/gpu/drm/xe/tests/xe_eudebug.c | 189 ++
drivers/gpu/drm/xe/tests/xe_live_test_mod.c | 5 +
drivers/gpu/drm/xe/xe_client_debugfs.c | 118 +
drivers/gpu/drm/xe/xe_client_debugfs.h | 19 +
drivers/gpu/drm/xe/xe_debug_data.c | 279 +++
drivers/gpu/drm/xe/xe_debug_data.h | 22 +
drivers/gpu/drm/xe/xe_debug_data_types.h | 25 +
drivers/gpu/drm/xe/xe_device.c | 30 +-
drivers/gpu/drm/xe/xe_device.h | 42 +
drivers/gpu/drm/xe/xe_device_types.h | 40 +
drivers/gpu/drm/xe/xe_eudebug.c | 2309 +++++++++++++++++++
drivers/gpu/drm/xe/xe_eudebug.h | 116 +
drivers/gpu/drm/xe/xe_eudebug_hw.c | 730 ++++++
drivers/gpu/drm/xe/xe_eudebug_hw.h | 32 +
drivers/gpu/drm/xe/xe_eudebug_types.h | 174 ++
drivers/gpu/drm/xe/xe_eudebug_vm.c | 434 ++++
drivers/gpu/drm/xe/xe_eudebug_vm.h | 8 +
drivers/gpu/drm/xe/xe_exec.c | 2 +-
drivers/gpu/drm/xe/xe_exec_queue.c | 51 +-
drivers/gpu/drm/xe/xe_exec_queue.h | 2 +
drivers/gpu/drm/xe/xe_exec_queue_types.h | 7 +
drivers/gpu/drm/xe/xe_gt.c | 1 +
drivers/gpu/drm/xe/xe_gt_debug.c | 179 ++
drivers/gpu/drm/xe/xe_gt_debug.h | 41 +
drivers/gpu/drm/xe/xe_hw_engine.h | 14 +
drivers/gpu/drm/xe/xe_lrc.c | 10 +
drivers/gpu/drm/xe/xe_oa.c | 3 +-
drivers/gpu/drm/xe/xe_pci_sriov.c | 10 +
drivers/gpu/drm/xe/xe_reg_sr.c | 21 +-
drivers/gpu/drm/xe/xe_reg_sr.h | 4 +-
drivers/gpu/drm/xe/xe_reg_whitelist.c | 2 +-
drivers/gpu/drm/xe/xe_rtp.c | 2 +-
drivers/gpu/drm/xe/xe_sync.c | 45 +-
drivers/gpu/drm/xe/xe_sync.h | 8 +-
drivers/gpu/drm/xe/xe_sync_types.h | 28 +-
drivers/gpu/drm/xe/xe_vm.c | 186 +-
drivers/gpu/drm/xe/xe_vm.h | 26 +
drivers/gpu/drm/xe/xe_vm_types.h | 38 +
drivers/gpu/drm/xe/xe_wa_oob.rules | 4 +
include/uapi/drm/xe_drm.h | 59 +
include/uapi/drm/xe_drm_eudebug.h | 217 ++
45 files changed, 5552 insertions(+), 47 deletions(-)
create mode 100644 drivers/gpu/drm/xe/tests/xe_eudebug.c
create mode 100644 drivers/gpu/drm/xe/xe_client_debugfs.c
create mode 100644 drivers/gpu/drm/xe/xe_client_debugfs.h
create mode 100644 drivers/gpu/drm/xe/xe_debug_data.c
create mode 100644 drivers/gpu/drm/xe/xe_debug_data.h
create mode 100644 drivers/gpu/drm/xe/xe_debug_data_types.h
create mode 100644 drivers/gpu/drm/xe/xe_eudebug.c
create mode 100644 drivers/gpu/drm/xe/xe_eudebug.h
create mode 100644 drivers/gpu/drm/xe/xe_eudebug_hw.c
create mode 100644 drivers/gpu/drm/xe/xe_eudebug_hw.h
create mode 100644 drivers/gpu/drm/xe/xe_eudebug_types.h
create mode 100644 drivers/gpu/drm/xe/xe_eudebug_vm.c
create mode 100644 drivers/gpu/drm/xe/xe_eudebug_vm.h
create mode 100644 drivers/gpu/drm/xe/xe_gt_debug.c
create mode 100644 drivers/gpu/drm/xe/xe_gt_debug.h
create mode 100644 include/uapi/drm/xe_drm_eudebug.h
--
2.43.0