[PATCH 4/4] drm/amdgpu/nv: add navi GPU reset handler

Sharma, Shashank shashank.sharma at amd.com
Sat Feb 5 07:00:15 UTC 2022


Hey Alex,
Agreed, we are moving it up into the callers; Christian gave the same feedback.

- Shashank

On 2/4/2022 7:44 PM, Deucher, Alexander wrote:
> [Public]
> 
> 
> Seems like this functionality should be moved up into the callers.  
> Maybe add new IP callbacks (dump_reset_registers()) so that each IP can 
> specify which registers are relevant for reset debugging, and then we 
> can walk the IP list and call the callback before we call the asic_reset 
> callbacks.
> 
> Alex
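
A rough userspace sketch of the callback idea above, to make the call order concrete. Everything prefixed with sketch_ is a hypothetical stand-in, not the real amdgpu structures; only the dump_reset_registers() name comes from the suggestion itself, and its signature here is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct sketch_device;

/* Hypothetical per-IP callback table (not the real amd_ip_funcs). */
struct sketch_ip_funcs {
	const char *name;
	/* Optional: report the registers this IP wants captured before a reset. */
	void (*dump_reset_registers)(struct sketch_device *dev);
};

struct sketch_device {
	const struct sketch_ip_funcs *const *ip_blocks;
	size_t num_ip_blocks;
	int (*asic_reset)(struct sketch_device *dev);
	unsigned int dumps_done;  /* bookkeeping for this example only */
	unsigned int resets_done; /* bookkeeping for this example only */
};

/* Walk the IP list, let each IP dump its registers, then reset the ASIC. */
static int sketch_reset_with_dump(struct sketch_device *dev)
{
	size_t i;

	assert(dev && dev->asic_reset);
	for (i = 0; i < dev->num_ip_blocks; i++) {
		const struct sketch_ip_funcs *funcs = dev->ip_blocks[i];

		/* IPs with nothing to report simply leave the callback NULL. */
		if (funcs && funcs->dump_reset_registers)
			funcs->dump_reset_registers(dev);
	}
	return dev->asic_reset(dev);
}

/* Example IPs: one provides a dump callback, one does not. */
static void sketch_gfx_dump(struct sketch_device *dev)
{
	dev->dumps_done++;
	printf("gfx: dumping reset-relevant registers\n");
}

static int sketch_asic_reset(struct sketch_device *dev)
{
	dev->resets_done++;
	return 0;
}

static const struct sketch_ip_funcs sketch_gfx = { "gfx", sketch_gfx_dump };
static const struct sketch_ip_funcs sketch_sdma = { "sdma", NULL };
static const struct sketch_ip_funcs *const sketch_ips[] = {
	&sketch_gfx, &sketch_sdma,
};
```

Keeping the callback optional means an IP with no interesting registers needs no changes, and the caller decides when the dump happens relative to the reset.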
> 
> ------------------------------------------------------------------------
> *From:* amd-gfx <amd-gfx-bounces at lists.freedesktop.org> on behalf of 
> Deucher, Alexander <Alexander.Deucher at amd.com>
> *Sent:* Friday, February 4, 2022 1:41 PM
> *To:* Sharma, Shashank <Shashank.Sharma at amd.com>; Lazar, Lijo 
> <Lijo.Lazar at amd.com>; amd-gfx at lists.freedesktop.org 
> <amd-gfx at lists.freedesktop.org>
> *Cc:* Somalapuram, Amaranath <Amaranath.Somalapuram at amd.com>; Koenig, 
> Christian <Christian.Koenig at amd.com>
> *Subject:* Re: [PATCH 4/4] drm/amdgpu/nv: add navi GPU reset handler
> 
> [Public]
> 
> 
> In the suspend and hibernate cases, we don't care.  In most cases the 
> power rail will be cut once the system enters suspend so it doesn't 
> really matter.  That's why we call the asic reset callback directly 
> rather than going through the whole recovery process. The device is 
> already quiescent at this point; we just want to make sure the device is 
> in a known state when we come out of suspend (in case suspend overall 
> fails).
> 
> Alex
> 
> 
> ------------------------------------------------------------------------
> *From:* Sharma, Shashank <Shashank.Sharma at amd.com>
> *Sent:* Friday, February 4, 2022 12:22 PM
> *To:* Lazar, Lijo <Lijo.Lazar at amd.com>; amd-gfx at lists.freedesktop.org 
> <amd-gfx at lists.freedesktop.org>
> *Cc:* Deucher, Alexander <Alexander.Deucher at amd.com>; Somalapuram, 
> Amaranath <Amaranath.Somalapuram at amd.com>; Koenig, Christian 
> <Christian.Koenig at amd.com>
> *Subject:* Re: [PATCH 4/4] drm/amdgpu/nv: add navi GPU reset handler
> 
> 
> On 2/4/2022 6:20 PM, Lazar, Lijo wrote:
>> [AMD Official Use Only]
>> 
>> One more thing
>>        In the suspend-reset case, won't this cause a work item to be scheduled on suspend? I don't know if that is a good idea; ideally we would like to clean up all work items before going to suspend.
>> 
>> Thanks,
>> Lijo
> 
> Again, this opens scope for discussion. What if there is a GPU error
> during suspend-reset, which is a very probable case?
> 
> - Shashank
> 
>> 
>> -----Original Message-----
>> From: Sharma, Shashank <Shashank.Sharma at amd.com>
>> Sent: Friday, February 4, 2022 10:47 PM
>> To: Lazar, Lijo <Lijo.Lazar at amd.com>; amd-gfx at lists.freedesktop.org
>> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>; Somalapuram, Amaranath <Amaranath.Somalapuram at amd.com>; Koenig, Christian <Christian.Koenig at amd.com>
>> Subject: Re: [PATCH 4/4] drm/amdgpu/nv: add navi GPU reset handler
>> 
>> 
>> 
>> On 2/4/2022 6:11 PM, Lazar, Lijo wrote:
>>> BTW, since this is already providing a set of values, it would be useful to provide one more field as the reset reason: RAS error recovery, GPU hang recovery, or something else.
>> 
>> Adding this additional parameter, instead of blocking something in the kernel, seems like a better idea. The app can filter on it and read only what it is interested in.
>> 
>> - Shashank
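
A minimal sketch of what the extra reset-reason field and the app-side filtering could look like. The enum values, struct layout, and function names are all assumptions made up for illustration; none of them come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reset reasons, following the examples named in the thread. */
enum sketch_reset_reason {
	SKETCH_RESET_UNKNOWN = 0,
	SKETCH_RESET_GPU_HANG,
	SKETCH_RESET_RAS_ERROR,
};

/* Hypothetical payload: the existing set of values plus the reason field. */
struct sketch_reset_info {
	enum sketch_reset_reason reason;
	uint32_t num_regs;
	const uint32_t *reg_values;
};

static const char *sketch_reset_reason_name(enum sketch_reset_reason reason)
{
	switch (reason) {
	case SKETCH_RESET_GPU_HANG:
		return "GPU hang recovery";
	case SKETCH_RESET_RAS_ERROR:
		return "RAS error recovery";
	default:
		return "unknown";
	}
}

/*
 * App-side filter: interest_mask has bit N set for every reason N the
 * application wants to handle. Returns nonzero when the event matches.
 */
static int sketch_app_is_interested(const struct sketch_reset_info *info,
				    uint32_t interest_mask)
{
	assert(info != NULL);
	return (interest_mask & (UINT32_C(1) << info->reason)) != 0;
}
```

With this shape the kernel side always reports the event, and an app that only cares about RAS recovery passes a mask with just that bit set and ignores the rest.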

