[PATCH 00/20] RAS support for amdgpu

Alex Deucher alexdeucher at gmail.com
Tue Mar 5 20:41:16 UTC 2019


This patch set adds initial RAS (Reliability, Availability, Serviceability)
support to amdgpu on supported boards.  Features include SRAM and VRAM ECC,
bad page tracking, and error containment.

Eric Huang (2):
  drm/amdkfd: add RAS capabilities in topology for Vega20 (v2)
  drm/amdkfd: add RAS ECC event support (v2)

Feifei Xu (1):
  drm/amdgpu: enable ras on gfx9 (v2)

xinhui pan (17):
  drm/amdgpu: add ta ras fw info (v2)
  drm/amdgpu: export ta fw info
  drm/amdgpu: add module parameters for ras
  drm/amdgpu: add ta_ras_if.h
  drm/amdgpu: add psp ras callback func and macro
  drm/amdgpu: add psp ras subsystem infrastructure (v2)
  drm/amdgpu: add psp v11 ras callback
  drm/amdgpu: add psp cmd submit timeout
  drm/amdgpu: add amdgpu_ras.c to support ras (v2)
  drm/amdgpu: add debugfs ctrl node
  drm/amdgpu: reserve bad pages during recovery
  drm/amdgpu: enable ras on sdma4
  drm/amdgpu: enable ras on gmc9
  drm/amdgpu: Add a new flag to AMDGPU_CTX_OP_QUERY_STATE2
  drm/amdgpu: add ioctl query for enabled ras features (v2)
  drm/amdgpu: skip gpu reset when ras error occured
  drm/amdgpu: add human readable debugfs control support (v2)

 drivers/gpu/drm/amd/amdgpu/Makefile        |    2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        |    2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h |    1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c    |   17 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h    |    2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |   14 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    |   17 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h    |    3 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h    |    2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c    |   31 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c    |  220 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h    |   32 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c    | 1438 ++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h    |  229 ++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h   |    4 +
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c      |  174 +++
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c      |  277 ++++
 drivers/gpu/drm/amd/amdgpu/psp_v11_0.c     |   57 +
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c     |  186 ++-
 drivers/gpu/drm/amd/amdgpu/ta_ras_if.h     |  108 ++
 drivers/gpu/drm/amd/amdkfd/kfd_device.c    |   11 +
 drivers/gpu/drm/amd/amdkfd/kfd_events.c    |   18 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h      |    3 +
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c  |   16 +
 drivers/gpu/drm/amd/amdkfd/kfd_topology.h  |    4 +
 include/uapi/drm/amdgpu_drm.h              |   35 +
 include/uapi/linux/kfd_ioctl.h             |   12 +-
 27 files changed, 2910 insertions(+), 5 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
 create mode 100644 drivers/gpu/drm/amd/amdgpu/ta_ras_if.h

-- 
2.20.1


