[v8 PATCH 00/13] drm/msm: Capture and dump the GPU crash state
Jordan Crouse
jcrouse at codeaurora.org
Tue Jul 24 16:33:18 UTC 2018
This is revision 8 of the series implementing GPU crash state capture for
drm/msm (https://patchwork.freedesktop.org/series/36097/). This revision adds
better documentation and addresses comments from the mailing lists. I know we
will miss 4.19 at this point, but I think this is ready to soak in msm-next
for a while.
The goal of this code is to store and provide enough information to debug
software and hardware issues on the Adreno hardware in a semi-human-readable
format that can also be parsed by scripts.
The full set of changes here captures basic information about the GPU, the
status and contents of the ringbuffers, a snapshot of the current register state
and the active buffers from the hanging submit.
The data is exposed through devcoredump. For example, after a hang you can
read the data from /sys/class/devcoredump/devcdX/data, where X is a unique
number.
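A minimal sketch of collecting the dump from user space, assuming the
devcoredump class directory is at its usual sysfs location. The helper names
find_coredumps() and save_coredump() are illustrative only and are not part of
this series. The devcoredump framework discards a dump after a timeout (about
five minutes, as far as I know) or once something is written to its data file,
so a script like this should run soon after the hang.

```python
import glob
import os
import shutil


def find_coredumps(root="/sys/class/devcoredump"):
    """Return the data file of every pending device coredump under root."""
    return sorted(glob.glob(os.path.join(root, "devcd*", "data")))


def save_coredump(data_path, dest_dir):
    """Copy one dump out of sysfs, naming the copy after its devcdX directory."""
    name = os.path.basename(os.path.dirname(data_path)) + ".txt"
    dest = os.path.join(dest_dir, name)
    with open(data_path, "rb") as src, open(dest, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return dest
```

Calling save_coredump(path, ".") for each path returned by find_coredumps()
leaves one devcdX.txt file per dump in the current directory.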
v8: Add documentation and consolidate puts/printf code per review comments
v7: Add EXPORT_SYMBOL for __drm_puts_coredump and use %zd to print a size_t
variable for the bo dump thanks to the ever vigilant zero one bot.
v6: Add drm_puts() and use it in the appropriate place. Clean up a few minor
bugs here and there.
v5: Fix symbol error in i915_gpu_error.c thanks to 01 dot org bot. Added
open/release functions for the show debugfs file to get the state per Chris
Wilson. Slightly modified the register output format to be more YAML friendly
also per Chris.
v4: Add buffer dump for the active submit. Fix refcount issue with devcoredump.
Change header for a5xx registers to registers-hlsq because I'm told YAML
requires unique tags.
v3: Make recommended changes to ascii85 per Chris Wilson. Use devcoredump to
dump crash states as suggested by Bjorn Andersson and add a new drm_print
facility to facilitate that. Remove the now obsolete 'crash' debugfs node.
Add documentation for the crash dump output.
v2: Convert output to yaml, use ascii85 to dump ringbuffer contents.
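For scripts that consume the dump, the ascii85 data mentioned in v2 can be
turned back into 32-bit words along these lines. This is a sketch under the
assumption that the data is encoded one 32-bit word at a time, five characters
per word with 'z' standing in for an all-zero word, which is how the
i915-derived helper moved into linux/ascii85.h behaves as far as I can tell.
The name decode_ascii85_words() is hypothetical and not part of this series.

```python
def decode_ascii85_words(text):
    """Decode word-oriented ascii85 text into a list of 32-bit integers."""
    text = "".join(text.split())  # ignore whitespace and line breaks
    words = []
    i = 0
    while i < len(text):
        if text[i] == "z":
            # Assumed shorthand: a single 'z' encodes one all-zero word.
            words.append(0)
            i += 1
            continue
        group = text[i:i + 5]
        if len(group) < 5:
            raise ValueError("truncated ascii85 group at offset %d" % i)
        value = 0
        for ch in group:
            digit = ord(ch) - ord("!")
            if not 0 <= digit < 85:
                raise ValueError("invalid ascii85 character %r" % ch)
            # Most significant base-85 digit comes first.
            value = value * 85 + digit
        if value > 0xFFFFFFFF:
            raise ValueError("ascii85 group does not fit in 32 bits")
        words.append(value)
        i += 5
    return words
```

Returning integers rather than bytes avoids guessing at the byte order of the
original buffer; a caller that needs raw bytes can pack the words with
whatever endianness matches the target.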
Jordan Crouse (13):
  include: Move ascii85 functions from i915 to linux/ascii85.h
  drm: drm_printer: Add printer for devcoredump
  drm: Add drm_puts() to complement drm_printf()
  drm: Add a -puts() function for the seq_file printer
  drm: Add puts callback for the coredump printer
  drm/msm/gpu: Capture the state of the GPU
  drm/msm/gpu: Convert the GPU show function to use the GPU state
  drm/msm/gpu: Rearrange the code that collects the task during a hang
  drm/msm/gpu: Capture the GPU state on a GPU hang
  drm/msm/adreno: Convert the show/crash file format
  drm/msm/adreno: Add ringbuffer data to the GPU state
  drm/msm/adreno: Add a5xx specific registers for the GPU state
  drm/msm/gpu: Add the buffer objects from the submit to the crash dump
Documentation/gpu/msm-crash-dump.rst | 96 ++++++++++
drivers/gpu/drm/drm_print.c | 111 +++++++++++
drivers/gpu/drm/i915/i915_gpu_error.c | 34 +---
drivers/gpu/drm/msm/Kconfig | 1 +
drivers/gpu/drm/msm/adreno/a3xx_gpu.c | 30 +--
drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 22 ++-
drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 242 ++++++++++++++++++++++--
drivers/gpu/drm/msm/adreno/adreno_gpu.c | 184 ++++++++++++++++--
drivers/gpu/drm/msm/adreno/adreno_gpu.h | 10 +-
drivers/gpu/drm/msm/msm_debugfs.c | 93 ++++++++-
drivers/gpu/drm/msm/msm_gpu.c | 145 +++++++++++++-
drivers/gpu/drm/msm/msm_gpu.h | 68 ++++++-
include/drm/drm_print.h | 71 +++++++
include/linux/ascii85.h | 38 ++++
14 files changed, 1044 insertions(+), 101 deletions(-)
create mode 100644 Documentation/gpu/msm-crash-dump.rst
create mode 100644 include/linux/ascii85.h
--
2.18.0