[PATCH] drm/amdgpu: Reset error code for 'no handler' case

Chen, Guchun Guchun.Chen at amd.com
Mon Mar 29 04:05:59 UTC 2021


[AMD Public Use]

Reviewed-and-tested-by: Guchun Chen <guchun.chen at amd.com>

Regards,
Guchun

From: Lazar, Lijo <Lijo.Lazar at amd.com>
Sent: Monday, March 29, 2021 12:04 PM
To: amd-gfx at lists.freedesktop.org
Cc: Zhang, Hawking <Hawking.Zhang at amd.com>; Xu, Feifei <Feifei.Xu at amd.com>; Chen, Guchun <Guchun.Chen at amd.com>
Subject: [PATCH] drm/amdgpu: Reset error code for 'no handler' case


If the reset handler is not implemented (-ENOSYS), clear the error code
before proceeding with the default reset method.

Fixes the issue seen in the following trace:
[  106.508592] amdgpu 0000:b1:00.0: amdgpu: ASIC reset failed with error, -38 for drm dev, 0000:b1:00.0
[  106.508972] amdgpu 0000:b1:00.0: amdgpu: GPU reset succeeded, trying to resume
[  106.509116] [drm] PCIE GART of 512M enabled.
[  106.509120] [drm] PTB located at 0x0000008000000000
[  106.509136] [drm] VRAM is lost due to GPU reset!
[  106.509332] [drm] PSP is resuming...

Signed-off-by: Lijo Lazar <lijo.lazar at amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
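
For readers following the trace: -38 is -ENOSYS on Linux. The sketch below is
a minimal, standalone illustration of why the stale value matters. It is not
the driver code. reset_old(), reset_new(), default_reset() and the has_work
flag are hypothetical names invented for this example, and it assumes a
default path that only assigns the result when it has work to do.

```c
#include <errno.h>

/* Hypothetical default reset path: it only assigns the result when it
 * actually has work to do, so a stale value passes through untouched. */
static int default_reset(int r, int has_work)
{
	if (has_work)
		r = 0;	/* pretend the work succeeded */
	return r;
}

/* Old flow: on -ENOSYS we fall through with r still set to -ENOSYS. */
static int reset_old(int handler_ret, int has_work)
{
	int r = handler_ret;

	if (r != -ENOSYS)
		return r;

	return default_reset(r, has_work);
}

/* Patched flow: -ENOSYS is cleared before the default path runs;
 * any other handler result (success or failure) is returned as-is. */
static int reset_new(int handler_ret, int has_work)
{
	int r = handler_ret;

	if (r == -ENOSYS)
		r = 0;
	else
		return r;

	return default_reset(r, has_work);
}
```

Under these assumptions, reset_old(-ENOSYS, 0) hands -ENOSYS back to the
caller even though nothing failed, while reset_new(-ENOSYS, 0) returns 0.
A real handler error such as -EIO is still returned unchanged by both.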

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 319d69646a13..a501d1a4d000 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4281,7 +4281,10 @@ int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
 		drm_sched_increase_karma(&job->base);

 	r = amdgpu_reset_prepare_hwcontext(adev, reset_context);
-	if (r != -ENOSYS)
+	/* If reset handler not implemented, continue; otherwise return */
+	if (r == -ENOSYS)
+		r = 0;
+	else
 		return r;

 	/* Don't suspend on bare metal if we are not going to HW reset the ASIC */
@@ -4323,8 +4326,10 @@ int amdgpu_do_asic_reset(struct list_head *device_list_handle,
 	tmp_adev = list_first_entry(device_list_handle, struct amdgpu_device,
 				    reset_list);
 	r = amdgpu_reset_perform_reset(tmp_adev, reset_context);
-
-	if (r != -ENOSYS)
+	/* If reset handler not implemented, continue; otherwise return */
+	if (r == -ENOSYS)
+		r = 0;
+	else
 		return r;

 	/* Reset handler not implemented, use the default method */
--
2.17.1
