[PATCH 4/4] drm/amdgpu/nv: add navi GPU reset handler
Sharma, Shashank
shashank.sharma at amd.com
Sat Feb 5 07:00:15 UTC 2022
Hey Alex,
Agreed, we are moving it up into the callers. Christian had the same feedback.
- Shashank
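
For reference, the callback approach suggested in the quoted message below could look roughly like this. It is a minimal, self-contained sketch in plain C, not amdgpu code: only the names dump_reset_registers() and asic_reset come from this thread, while struct ip_funcs, struct ip_block, struct device, the example_* helpers and reset_with_register_dump() are hypothetical stand-ins made up for illustration.

```c
#include <stddef.h>
#include <stdio.h>

struct ip_block;

/* Hypothetical per-IP function table (stand-in, not the driver's own). */
struct ip_funcs {
	const char *name;
	/* Proposed callback; NULL when the IP has nothing to dump. */
	void (*dump_reset_registers)(struct ip_block *ip);
};

struct ip_block {
	const struct ip_funcs *funcs;
	unsigned int dump_count;	/* bookkeeping for this example only */
};

/* Hypothetical device: a list of IP blocks plus an ASIC reset hook. */
struct device {
	struct ip_block *ip_blocks;
	size_t num_ip_blocks;
	int (*asic_reset)(struct device *dev);
	unsigned int reset_count;	/* bookkeeping for this example only */
};

/* Example IP that has registers worth dumping before a reset. */
static void example_gfx_dump(struct ip_block *ip)
{
	printf("%s: dumping reset-relevant registers\n", ip->funcs->name);
	ip->dump_count++;
}

static const struct ip_funcs example_gfx_funcs = { "gfx", example_gfx_dump };
/* Example IP that does not implement the callback. */
static const struct ip_funcs example_sdma_funcs = { "sdma", NULL };

static int example_asic_reset(struct device *dev)
{
	dev->reset_count++;
	return 0;
}

/* Walk the IP list and call the callback wherever it is implemented. */
static size_t dump_all_reset_registers(struct device *dev)
{
	size_t i, called = 0;

	for (i = 0; i < dev->num_ip_blocks; i++) {
		struct ip_block *ip = &dev->ip_blocks[i];

		if (ip->funcs && ip->funcs->dump_reset_registers) {
			ip->funcs->dump_reset_registers(ip);
			called++;
		}
	}
	return called;
}

/* Caller-side flow: dump first, then invoke the asic_reset callback. */
static int reset_with_register_dump(struct device *dev)
{
	dump_all_reset_registers(dev);
	return dev->asic_reset(dev);
}
```

A caller would build an array of struct ip_block entries, point a struct device at it, and call reset_with_register_dump(). Keeping the dump in the caller means the per-ASIC reset handlers stay unchanged, and IPs without the callback are simply skipped.
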
On 2/4/2022 7:44 PM, Deucher, Alexander wrote:
> [Public]
>
>
> Seems like this functionality should be moved up into the callers.
> Maybe add new IP callbacks (dump_reset_registers()) so that each IP can
> specify which registers are relevant for reset debugging, and then we
> can walk the IP list and call the callback before we call the asic_reset
> callbacks.
>
> Alex
>
> ------------------------------------------------------------------------
> *From:* amd-gfx <amd-gfx-bounces at lists.freedesktop.org> on behalf of
> Deucher, Alexander <Alexander.Deucher at amd.com>
> *Sent:* Friday, February 4, 2022 1:41 PM
> *To:* Sharma, Shashank <Shashank.Sharma at amd.com>; Lazar, Lijo
> <Lijo.Lazar at amd.com>; amd-gfx at lists.freedesktop.org
> <amd-gfx at lists.freedesktop.org>
> *Cc:* Somalapuram, Amaranath <Amaranath.Somalapuram at amd.com>; Koenig,
> Christian <Christian.Koenig at amd.com>
> *Subject:* Re: [PATCH 4/4] drm/amdgpu/nv: add navi GPU reset handler
>
> [Public]
>
>
> In the suspend and hibernate cases, we don't care. In most cases the
> power rail will be cut once the system enters suspend so it doesn't
> really matter. That's why we call the asic reset callback directly
> rather than going through the whole recovery process. The device is
> already quiescent at this point; we just want to make sure the device is
> in a known state when we come out of suspend (in case suspend overall
> fails).
>
> Alex
>
>
> ------------------------------------------------------------------------
> *From:* Sharma, Shashank <Shashank.Sharma at amd.com>
> *Sent:* Friday, February 4, 2022 12:22 PM
> *To:* Lazar, Lijo <Lijo.Lazar at amd.com>; amd-gfx at lists.freedesktop.org
> <amd-gfx at lists.freedesktop.org>
> *Cc:* Deucher, Alexander <Alexander.Deucher at amd.com>; Somalapuram,
> Amaranath <Amaranath.Somalapuram at amd.com>; Koenig, Christian
> <Christian.Koenig at amd.com>
> *Subject:* Re: [PATCH 4/4] drm/amdgpu/nv: add navi GPU reset handler
>
>
> On 2/4/2022 6:20 PM, Lazar, Lijo wrote:
>> [AMD Official Use Only]
>>
>> One more thing
>> In the suspend-reset case, won't this schedule a work item on suspend? I don't know if that is a good idea; ideally we would like to clean up all work items before going to suspend.
>>
>> Thanks,
>> Lijo
>
> Again, this opens scope for discussion. What if there is a GPU error
> during suspend-reset, which is a very probable case?
>
> - Shashank
>
>>
>> -----Original Message-----
>> From: Sharma, Shashank <Shashank.Sharma at amd.com>
>> Sent: Friday, February 4, 2022 10:47 PM
>> To: Lazar, Lijo <Lijo.Lazar at amd.com>; amd-gfx at lists.freedesktop.org
>> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>; Somalapuram, Amaranath <Amaranath.Somalapuram at amd.com>; Koenig, Christian <Christian.Koenig at amd.com>
>> Subject: Re: [PATCH 4/4] drm/amdgpu/nv: add navi GPU reset handler
>>
>>
>>
>> On 2/4/2022 6:11 PM, Lazar, Lijo wrote:
>>> BTW, since this already provides a set of values, it would be useful to add one more field for the reset reason: RAS error recovery, GPU hang recovery, or something else.
>>
>> Adding this additional parameter, instead of blocking something in the kernel, seems like a better idea. The app can filter the events and read what it is interested in.
>>
>> - Shashank
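
The reset-reason field discussed above could be sketched as follows. This is only an illustration under stated assumptions: the enum values mirror the reasons named in the thread (RAS error recovery, GPU hang recovery, something else), but every identifier here (enum reset_reason, struct reset_event, reset_reason_name(), app_wants_event()) is hypothetical and not part of the patch or of any existing amdgpu interface.

```c
/* Hypothetical reset reasons, following the cases named in the thread. */
enum reset_reason {
	RESET_REASON_OTHER = 0,
	RESET_REASON_RAS_RECOVERY,
	RESET_REASON_GPU_HANG,
};

/* Hypothetical event record; the reason travels with the other values. */
struct reset_event {
	enum reset_reason reason;
};

/* Human-readable name, e.g. for logging on the application side. */
static const char *reset_reason_name(enum reset_reason reason)
{
	switch (reason) {
	case RESET_REASON_RAS_RECOVERY:
		return "ras-recovery";
	case RESET_REASON_GPU_HANG:
		return "gpu-hang";
	default:
		return "other";
	}
}

/*
 * Application-side filter: interest_mask has bit (1u << reason) set for
 * each reason the application cares about. Nothing is blocked by the
 * producer; the consumer decides what to read.
 */
static int app_wants_event(const struct reset_event *ev,
			   unsigned int interest_mask)
{
	return (interest_mask & (1u << ev->reason)) != 0;
}
```

An application interested only in hang recovery would pass a mask of (1u << RESET_REASON_GPU_HANG) and ignore everything else, which matches the idea of filtering in the app rather than suppressing events in the kernel.
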