[PATCH] drm/amdgpu: Direct ret in ras_reset_err_cnt on VF

Rehman, Ahmad Ahmad.Rehman at amd.com
Thu Apr 10 20:50:01 UTC 2025


[AMD Official Use Only - AMD Internal Distribution Only]

Reviewed-by: Ahmad Rehman <Ahmad.Rehman at amd.com>

Thanks,
Ahmad

-----Original Message-----
From: Pan, Ellen <Yunru.Pan at amd.com>
Sent: Thursday, April 3, 2025 10:40 AM
To: amd-gfx at lists.freedesktop.org
Cc: Skvortsov, Victor <Victor.Skvortsov at amd.com>; Rehman, Ahmad <Ahmad.Rehman at amd.com>; Gande, Shravan kumar <Shravankumar.Gande at amd.com>; Pan, Ellen <Yunru.Pan at amd.com>
Subject: [PATCH] drm/amdgpu: Direct ret in ras_reset_err_cnt on VF

Add an amdgpu_sriov_vf() check so that amdgpu_ras_reset_error_count() returns -EOPNOTSUPP immediately on a VF, since a VF should not do anything to reset the RAS error count.

This also fixes the register violations that occur when the guest driver is loaded.

Signed-off-by: Ellen Pan <yunru.pan at amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index ebf1f63d0442..f8cf9621097f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1498,6 +1498,9 @@ int amdgpu_ras_reset_error_count(struct amdgpu_device *adev,
            !amdgpu_ras_get_aca_debug_mode(adev))
                return -EOPNOTSUPP;

+       if (amdgpu_sriov_vf(adev))
+               return -EOPNOTSUPP;
+
        /* skip ras error reset in gpu reset */
        if ((amdgpu_in_reset(adev) || amdgpu_ras_in_recovery(adev)) &&
            ((smu_funcs && smu_funcs->set_debug_mode) ||
--
2.34.1
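
The effect of the early return in the patch above can be illustrated with a small standalone sketch. It is not the driver code: struct mock_device, its fields, and mock_ras_reset_error_count() are hypothetical stand-ins invented for illustration, and the counter only models "the register-touching path was reached". The sketch assumes a POSIX errno.h that defines EOPNOTSUPP.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct amdgpu_device; not the real layout. */
struct mock_device {
	bool is_sriov_vf;   /* models the result of amdgpu_sriov_vf(adev) */
	int hw_reset_calls; /* counts entries into the mock register path */
};

/*
 * Models only the guard added by the patch: on a VF, return before any
 * code that would touch hardware registers runs.
 */
static int mock_ras_reset_error_count(struct mock_device *dev)
{
	if (dev->is_sriov_vf)
		return -EOPNOTSUPP;

	/* Mock of the path that would reset RAS error counters on bare metal/PF. */
	dev->hw_reset_calls++;
	return 0;
}

/* Small self-check showing both cases. */
static void mock_selfcheck(void)
{
	struct mock_device vf = { .is_sriov_vf = true, .hw_reset_calls = 0 };
	struct mock_device pf = { .is_sriov_vf = false, .hw_reset_calls = 0 };

	assert(mock_ras_reset_error_count(&vf) == -EOPNOTSUPP);
	assert(vf.hw_reset_calls == 0); /* VF never reaches the register path */

	assert(mock_ras_reset_error_count(&pf) == 0);
	assert(pf.hw_reset_calls == 1);
}
```

Calling mock_selfcheck() from a test harness exercises both paths. Placing the guard ahead of everything that touches hardware is what keeps the mock VF from ever incrementing the counter, which mirrors why the real check sits before the reset logic.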


