[v4 00/10] drm/msm: GPU crash state

Jordan Crouse jcrouse at codeaurora.org
Thu Apr 5 22:00:46 UTC 2018


This is revision 4 of the series implementing a GPU crash state for drm/msm
(https://patchwork.freedesktop.org/series/36097/).  I think it's mature enough
to pull out of RFC status and think about merging.

The goal is to store and provide enough information to debug software
and hardware issues on the Adreno hardware in a semi-human-readable
format that can also be parsed by scripts.

The full set of changes here captures basic information about the GPU, the
status and contents of the ringbuffers, a snapshot of the current register
state, and the active buffers from the hanging submit.

The data is printed with devcoredump.  For example, after a hang you can get
the data from /sys/class/devcoredump/devcdX/data where X is a unique number.
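As a rough illustration of pulling the data from a script, here is a minimal
Python sketch that collects every dump found under the devcoredump class
directory.  The function name read_crash_dumps and its base parameter are
made up for this example and are not part of the series; reading the files
typically requires root.

```python
from pathlib import Path


def read_crash_dumps(base="/sys/class/devcoredump"):
    """Return {entry_name: dump_text} for every devcdX entry under base.

    'base' is a parameter only so the sketch can be pointed at a test
    directory; on a real system the default path is the one to use.
    """
    dumps = {}
    for data_file in sorted(Path(base).glob("devcd*/data")):
        # The entry name (devcdX) is the directory that holds the data file.
        dumps[data_file.parent.name] = data_file.read_text(errors="replace")
    return dumps


if __name__ == "__main__":
    for name, text in read_crash_dumps().items():
        print(f"--- {name}: {len(text)} characters")
        print(text[:400])  # show only the start of each dump
```

Running it right after a hang should print the beginning of each available
dump; it prints nothing if no devcdX entries exist.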

You can see an example of the output for a simple invalid opcode error on the
db820c here: https://hastebin.com/yivozimoki.bash

v4: Add buffer dump for the active submit. Fix refcount issue with devcoredump.
Change header for a5xx registers to registers-hlsq because I'm told YAML
requires unique tags.
v3: Make recommended changes to ascii85 per Chris Wilson. Use devcoredump to
dump crash states as suggested by Bjorn Andersson and add a new drm_print
facility to facilitate that. Remove the now obsolete 'crash' debugfs node.
Add documentation for the crash dump output.

v2: Convert output to yaml, use ascii85 to dump ringbuffer contents.
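
For readers unfamiliar with ascii85, the sketch below shows the general idea
in Python: each 32-bit word becomes five printable characters in base 85,
with 'z' as the conventional shorthand for an all-zero word.  This is an
illustration of standard Ascii85 only, on the assumption that the helper
works one word at a time; it is not the code in linux/ascii85.h, and the
function names are invented for the example.

```python
def ascii85_encode_word(value):
    """Encode one 32-bit word as Ascii85 text (illustrative helper)."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value must fit in 32 bits")
    if value == 0:
        return "z"  # shorthand for an all-zero word
    digits = []
    for _ in range(5):
        # Peel off base-85 digits, least significant first.
        value, rem = divmod(value, 85)
        digits.append(chr(ord("!") + rem))
    return "".join(reversed(digits))


def ascii85_encode_words(words):
    """Encode a sequence of 32-bit words, e.g. ringbuffer contents."""
    return "".join(ascii85_encode_word(w) for w in words)


if __name__ == "__main__":
    print(ascii85_encode_words([0x70000000, 0x00000000, 0xFFFFFFFF]))
```

Five characters per four bytes keeps large buffers noticeably smaller than a
hex dump while staying printable.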

Jordan Crouse (10):
  include: Move ascii85 functions from i915 to linux/ascii85.h
  drm: drm_printer: Add printer for devcoredump
  drm/msm/gpu: Capture the state of the GPU
  drm/msm/gpu: Convert the GPU show function to use the GPU state
  drm/msm/gpu: Rearrange the code that collects the task during a hang
  drm/msm/gpu: Capture the GPU state on a GPU hang
  drm/msm/adreno: Convert the show/crash file format
  drm/msm/adreno: Add ringbuffer data to the GPU state
  drm/msm/adreno: Add a5xx specific registers for the GPU state
  drm/msm/gpu: Add the buffer objects from the submit to the crash dump

 Documentation/gpu/drm-msm-crash-dump.txt |  46 ++++++
 drivers/gpu/drm/drm_print.c              |  54 +++++++
 drivers/gpu/drm/i915/i915_gpu_error.c    |  35 +----
 drivers/gpu/drm/msm/Kconfig              |   1 +
 drivers/gpu/drm/msm/adreno/a3xx_gpu.c    |  30 ++--
 drivers/gpu/drm/msm/adreno/a4xx_gpu.c    |  22 ++-
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c    | 243 +++++++++++++++++++++++++++++--
 drivers/gpu/drm/msm/adreno/adreno_gpu.c  | 181 ++++++++++++++++++++---
 drivers/gpu/drm/msm/adreno/adreno_gpu.h  |   7 +-
 drivers/gpu/drm/msm/msm_debugfs.c        |  24 ++-
 drivers/gpu/drm/msm/msm_gpu.c            | 143 ++++++++++++++++--
 drivers/gpu/drm/msm/msm_gpu.h            |  67 ++++++++-
 include/drm/drm_print.h                  |  27 ++++
 include/linux/ascii85.h                  |  39 +++++
 14 files changed, 821 insertions(+), 98 deletions(-)
 create mode 100644 Documentation/gpu/drm-msm-crash-dump.txt
 create mode 100644 include/linux/ascii85.h

-- 
2.16.1


