[PATCH 0/4] reset ras error counters in initialization sequence

Hawking Zhang Hawking.Zhang at amd.com
Mon Mar 2 10:33:35 UTC 2020


The RAS hw error counters in most IP blocks may hold stale values
after a cold reboot. A read operation is required to reset those regs
to 0 so that users won't get random values when they query the
counters via sysfs nodes.

In addition, reset_ras_error_count is an important interface for the
amdgpu ras tool to force a reset of the hw register counters.
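
As a rough illustration of the read-to-clear behaviour described above,
here is a minimal, self-contained sketch. It is not the code from this
series: struct ras_counter_reg and ras_reg_read() are hypothetical
stand-ins that simulate a counter register in plain memory, and the
reset function only borrows its name from the patch titles. The real
driver goes through its own register accessors and per-IP callbacks.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one read-to-clear hw error counter register. */
struct ras_counter_reg {
	uint32_t value;
};

/*
 * Simulated register read (assumption: the hw clears the counter on
 * read). Returns the count seen before clearing.
 */
static uint32_t ras_reg_read(struct ras_counter_reg *reg)
{
	uint32_t v = reg->value;

	reg->value = 0;
	return v;
}

/*
 * Reset all counters by reading each one once and discarding the
 * result, so that a later query starts from 0 instead of stale data.
 */
static void reset_ras_error_count(struct ras_counter_reg *regs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		(void)ras_reg_read(&regs[i]);
}
```

Calling such a reset once during initialization, before the counters
are exposed to userspace, means the first query reports only errors
counted since driver load.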

Hawking Zhang (4):
  drm/amdgpu: add reset_ras_error_count function for SDMA
  drm/amdgpu: add reset_ras_error_count function for MMHUB
  drm/amdgpu: add reset_ras_error_count function for GFX
  drm/amdgpu: add reset_ras_error_count function for HDP

 drivers/gpu/drm/amd/amdgpu/amdgpu.h       |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h   |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_mmhub.h |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h  |  1 +
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c     | 27 +++++++++--------------
 drivers/gpu/drm/amd/amdgpu/gfx_v9_4.c     |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v9_4.h     |  2 ++
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c     |  3 +++
 drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c   | 12 ++++++++++
 drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c   | 12 ++++++++++
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c    | 20 ++++++++++++-----
 drivers/gpu/drm/amd/amdgpu/soc15.c        | 14 ++++++++++++
 12 files changed, 72 insertions(+), 24 deletions(-)

-- 
2.17.1


