[Intel-xe] [PATCH 00/14] Introduce xe_devcoredump.
Matthew Brost
matthew.brost at intel.com
Tue May 2 08:11:32 UTC 2023
On Wed, Apr 26, 2023 at 04:56:59PM -0400, Rodrigo Vivi wrote:
> Xe needs to align with other drivers on the way that the error states are
> dumped, avoiding a Xe only error_state solution. The goal is to use devcoredump
> infrastructure to report error states, since it produces a standardized way
> by exposing a virtual and temporary /sys/class/devcoredump device.
>
> The initial goal is to have the simple_error_state in the devcoredump
> so we start using the infrastructure.
>
> But this is just a start point to start building a useful and
> organized crash dump, using standard infrastructure. Later this
> will be changed to have output that can be parsed by tools and
> used for error replay.
We are certainly missing the GuC log. It would also be really nice to
get the ftrace included too. Not sure if the latter is easy; I looked
into this on i915 and couldn't figure it out, but that was a while ago
and admittedly I didn't try all that hard.
Matt
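
For reference, a rough sketch of how userspace could pick up a dump from
the temporary /sys/class/devcoredump device mentioned in the cover letter.
It assumes the usual devcoredump layout, where each pending dump shows up
as a devcdN directory with a "data" file, and that writing anything to
"data" dismisses the dump. The function names and the output file name are
made up for illustration; nothing here is part of the series.

```python
import glob
import os


def list_devcoredumps(sysfs_root="/sys/class/devcoredump"):
    """Return the 'data' file paths of all pending device coredumps.

    Assumes each dump is exposed as <sysfs_root>/devcdN/data.
    """
    return sorted(glob.glob(os.path.join(sysfs_root, "devcd*", "data")))


def save_devcoredump(data_path, out_path, dismiss=False):
    """Copy one dump to out_path and return the number of bytes saved.

    If dismiss is True, write to 'data' afterwards so the kernel frees
    the dump instead of waiting for its timeout.
    """
    with open(data_path, "rb") as src, open(out_path, "wb") as dst:
        while True:
            chunk = src.read(65536)
            if not chunk:
                break
            dst.write(chunk)
    if dismiss:
        # Any write to the 'data' file tells devcoredump to drop the dump.
        with open(data_path, "wb") as f:
            f.write(b"1")
    return os.path.getsize(out_path)


if __name__ == "__main__":
    # Hypothetical output naming; adjust to taste.
    for i, path in enumerate(list_devcoredumps()):
        size = save_devcoredump(path, f"xe_devcoredump_{i}.txt")
        print(f"{path}: saved {size} bytes")
```

Taking the sysfs root as a parameter keeps the helpers easy to try against
a scratch directory on a machine without any pending dumps.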
>
> Later, when we are in-tree, the goal is to collaborate with devcoredump
> infrastructure with overall possible improvements, like multiple file support
> for better organization of the dumps, snapshot support, dmesg extra print,
> and whatever may make sense and help the overall infrastructure.
>
> Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
>
> Rodrigo Vivi (14):
> drm/xe: Fix print of RING_EXECLIST_SQ_CONTENTS_HI
> drm/xe: Introduce the dev_coredump infrastructure.
> drm/xe: Do not take any action if our device was removed.
> drm/xe: Extract non mapped regions out of GuC CTB into its own struct.
> drm/xe: Convert GuC CT print to snapshot capture and print.
> drm/xe: Add GuC CT snapshot to xe_devcoredump.
> drm/xe: Introduce guc_submit_types.h with relevant structs.
> drm/xe: Convert GuC Engine print to snapshot capture and print.
> drm/xe: Add GuC Submit Engine snapshot to xe_devcoredump.
> drm/xe: Convert Xe HW Engine print to snapshot capture and print.
> drm/xe: Add HW Engine snapshot to xe_devcoredump.
> drm/xe: Limit CONFIG_DRM_XE_SIMPLE_ERROR_CAPTURE to itself.
> drm/xe: Convert VM print to snapshot capture and print.
> drm/xe: Add VM snapshot to xe_devcoredump.
>
> drivers/gpu/drm/xe/Kconfig | 1 +
> drivers/gpu/drm/xe/Makefile | 1 +
> drivers/gpu/drm/xe/regs/xe_engine_regs.h | 3 +-
> drivers/gpu/drm/xe/xe_devcoredump.c | 227 ++++++++++++++++++
> drivers/gpu/drm/xe/xe_devcoredump.h | 22 ++
> drivers/gpu/drm/xe/xe_devcoredump_types.h | 60 +++++
> drivers/gpu/drm/xe/xe_device_types.h | 4 +
> drivers/gpu/drm/xe/xe_execlist.c | 4 +-
> drivers/gpu/drm/xe/xe_gt_debugfs.c | 2 +-
> drivers/gpu/drm/xe/xe_guc_ct.c | 275 +++++++++++++++-------
> drivers/gpu/drm/xe/xe_guc_ct.h | 7 +-
> drivers/gpu/drm/xe/xe_guc_ct_types.h | 46 +++-
> drivers/gpu/drm/xe/xe_guc_fwif.h | 29 ---
> drivers/gpu/drm/xe/xe_guc_submit.c | 258 ++++++++++++++------
> drivers/gpu/drm/xe/xe_guc_submit.h | 10 +-
> drivers/gpu/drm/xe/xe_guc_submit_types.h | 155 ++++++++++++
> drivers/gpu/drm/xe/xe_hw_engine.c | 210 ++++++++++++-----
> drivers/gpu/drm/xe/xe_hw_engine.h | 8 +-
> drivers/gpu/drm/xe/xe_hw_engine_types.h | 78 ++++++
> drivers/gpu/drm/xe/xe_pci.c | 2 +
> drivers/gpu/drm/xe/xe_vm.c | 140 +++++++++--
> drivers/gpu/drm/xe/xe_vm.h | 6 +-
> drivers/gpu/drm/xe/xe_vm_types.h | 18 ++
> 23 files changed, 1288 insertions(+), 278 deletions(-)
> create mode 100644 drivers/gpu/drm/xe/xe_devcoredump.c
> create mode 100644 drivers/gpu/drm/xe/xe_devcoredump.h
> create mode 100644 drivers/gpu/drm/xe/xe_devcoredump_types.h
> create mode 100644 drivers/gpu/drm/xe/xe_guc_submit_types.h
>
> --
> 2.39.2