[PATCH v3 0/9] Add support for Mesa GPU hang replay tool

Matthew Brost matthew.brost at intel.com
Thu Mar 20 19:28:22 UTC 2025


Add support for the Mesa GPU hang replay tool, which is already supported
on i915.

The main changes are as follows:

- Update devcoredump to include additional information, allowing the
  Mesa tool to extract everything it needs to replay a GPU hang. These
  updates are designed to remain compatible with the existing Mesa
  devcoredump parser.
- Introduce the DRM_XE_EXEC_QUEUE_SET_HANG_REPLAY_STATE extension, which
  allows an execution queue's state to be set to that of the hung
  execution queue.

v2:
- Enable the flag DRM_XE_EXEC_QUEUE_SET_HANG_REPLAY_STATE
- Fix the page math to avoid a crash
v3:
- Add pat_index and cpu_caching to properties
- Add VM.uapi_flags

Development of the Mesa tool that uses this uAPI is a WIP. The tool is a
prerequisite for merging this series.

Matt

Matthew Brost (9):
  drm/xe: Add properties line to VM snapshot capture
  drm/xe: Add "null_sparse" type to VM snap properties
  drm/xe: Add mem_region to properties line in VM snapshot capture
  drm/xe: Add pat_index to properties line in VM snapshot capture
  drm/xe: Add cpu_caching to properties line in VM snapshot capture
  drm/xe: Add VM.uapi_flags to VM snapshot capture
  drm/xe/uapi: Add DRM_XE_EXEC_QUEUE_SET_HANG_REPLAY_STATE
  drm/xe: Add replay_offset and replay_length lines to LRC HWCTX
    snapshot
  drm/xe: Implement DRM_XE_EXEC_QUEUE_SET_HANG_REPLAY_STATE

 drivers/gpu/drm/xe/xe_exec_queue.c       | 32 +++++++++++++-
 drivers/gpu/drm/xe/xe_exec_queue_types.h |  3 ++
 drivers/gpu/drm/xe/xe_execlist.c         |  2 +-
 drivers/gpu/drm/xe/xe_lrc.c              | 44 +++++++++++++++----
 drivers/gpu/drm/xe/xe_lrc.h              |  4 +-
 drivers/gpu/drm/xe/xe_lrc_types.h        |  3 ++
 drivers/gpu/drm/xe/xe_vm.c               | 55 +++++++++++++++++++++++-
 include/uapi/drm/xe_drm.h                |  9 +++-
 8 files changed, 137 insertions(+), 15 deletions(-)

-- 
2.34.1


