[PATCH V2] drm/amdgpu: Fix ras mode2 reset failure in ras aca mode

Wang, Yang(Kevin) KevinYang.Wang at amd.com
Thu Apr 25 03:19:44 UTC 2024


[AMD Official Use Only - General]

>> Alternatively, we need to explore the opportunity to centralize the legacy RAS and ACA RAS implementations behind the same API. Take the sysfs create/remove interface, for example: legacy RAS and ACA RAS share the same logic and differ only in the filesystem node.
>> For now, ACA RAS is trending back toward IP-specific ras late init. Let's revisit the code to see whether we can reuse the common ras_late_init or create an aca_ras_late_init API.

Sure, thanks.
We will make improvements in this direction.

Best Regards,
Kevin

-----Original Message-----
From: Zhang, Hawking <Hawking.Zhang at amd.com>
Sent: Thursday, April 25, 2024 10:46 AM
To: Chai, Thomas <YiPeng.Chai at amd.com>; amd-gfx at lists.freedesktop.org
Cc: Zhou1, Tao <Tao.Zhou1 at amd.com>; Li, Candice <Candice.Li at amd.com>; Wang, Yang(Kevin) <KevinYang.Wang at amd.com>; Yang, Stanley <Stanley.Yang at amd.com>
Subject: RE: [PATCH V2] drm/amdgpu: Fix ras mode2 reset failure in ras aca mode

[AMD Official Use Only - General]

The patch is Reviewed-by: Hawking Zhang <Hawking.Zhang at amd.com>

Kevin, Thomas,

Alternatively, we need to explore the opportunity to centralize the legacy RAS and ACA RAS implementations behind the same API. Take the sysfs create/remove interface, for example: legacy RAS and ACA RAS share the same logic and differ only in the filesystem node.

For now, ACA RAS is trending back toward IP-specific ras late init. Let's revisit the code to see whether we can reuse the common ras_late_init or create an aca_ras_late_init API.

Regards,
Hawking

-----Original Message-----
From: Chai, Thomas <YiPeng.Chai at amd.com>
Sent: Wednesday, April 24, 2024 13:52
To: amd-gfx at lists.freedesktop.org
Cc: Chai, Thomas <YiPeng.Chai at amd.com>; Zhang, Hawking <Hawking.Zhang at amd.com>; Zhou1, Tao <Tao.Zhou1 at amd.com>; Li, Candice <Candice.Li at amd.com>; Wang, Yang(Kevin) <KevinYang.Wang at amd.com>; Yang, Stanley <Stanley.Yang at amd.com>; Chai, Thomas <YiPeng.Chai at amd.com>
Subject: [PATCH V2] drm/amdgpu: Fix ras mode2 reset failure in ras aca mode

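Fix the ras mode2 reset failure in ras aca mode. In the resume phase there is no need to create the aca fs node, so amdgpu_ras_bind_aca() now returns early when the device is in suspend or in reset.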

Signed-off-by: YiPeng Chai <YiPeng.Chai at amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index edb3cd0cef96..11a70991152c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1254,6 +1254,10 @@ int amdgpu_ras_bind_aca(struct amdgpu_device *adev, enum amdgpu_ras_block blk,
 {
        struct ras_manager *obj;

+       /* in resume phase, no need to create aca fs node */
+       if (adev->in_suspend || amdgpu_in_reset(adev))
+               return 0;
+
        obj = get_ras_manager(adev, blk);
        if (!obj)
                return -EINVAL;
--
2.34.1
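
For readers who want to see the guard in isolation, here is a minimal user-space sketch of the early-return pattern the hunk above adds. It is not the driver code: `struct mock_device`, `mock_bind_aca()` and `mock_bind_aca_demo()` are hypothetical names invented for illustration, the `in_reset` flag stands in for `amdgpu_in_reset(adev)`, and the node creation is simulated with a counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct amdgpu_device: only what the guard reads. */
struct mock_device {
	bool in_suspend;    /* mirrors adev->in_suspend */
	bool in_reset;      /* stands in for amdgpu_in_reset(adev) */
	bool has_manager;   /* whether a RAS manager exists for the block */
	int nodes_created;  /* counts simulated aca fs node creations */
};

/* Simplified model of the patched bind path; not the real driver function. */
static int mock_bind_aca(struct mock_device *dev)
{
	/* in resume phase, no need to create aca fs node */
	if (dev->in_suspend || dev->in_reset)
		return 0;

	/* Same error path as the diff: no manager object means -EINVAL. */
	if (!dev->has_manager)
		return -EINVAL;

	dev->nodes_created++;  /* simulated node creation */
	return 0;
}

/* Usage: the first bind creates a node, a bind during reset does not. */
static void mock_bind_aca_demo(void)
{
	struct mock_device dev = { .has_manager = true };

	assert(mock_bind_aca(&dev) == 0 && dev.nodes_created == 1);

	dev.in_reset = true;
	assert(mock_bind_aca(&dev) == 0 && dev.nodes_created == 1);
}
```

Note that the guard returns 0 rather than an error, so in this model a bind attempted during suspend or reset is reported to the caller as a success with no side effects.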