[PATCH v2] drm/amdgpu: reset asic after system-wide suspend aborted (v2)
Liang, Prike
Prike.Liang at amd.com
Thu Nov 25 04:58:43 UTC 2021
[Public]
> -----Original Message-----
> From: Lazar, Lijo <Lijo.Lazar at amd.com>
> Sent: Wednesday, November 24, 2021 9:30 PM
> To: Liang, Prike <Prike.Liang at amd.com>; amd-gfx at lists.freedesktop.org
> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>; Huang, Ray
> <Ray.Huang at amd.com>
> Subject: Re: [PATCH v2] drm/amdgpu: reset asic after system-wide suspend
> aborted (v2)
>
>
>
> On 11/24/2021 6:13 PM, Prike Liang wrote:
> > Do an ASIC reset in amdgpu resume when an Sx suspend is aborted after
> > amdgpu has already suspended. This keeps AMDGPU in a clean reset state
> > and avoids errors from re-initializing the device improperly. Currently,
> > we always do the ASIC reset in amdgpu resume until the PM abort case is
> > sorted out.
> >
> > v2: Remove incomplete PM abort flag and add GPU hive case check for
> > GPU reset.
> >
> > Signed-off-by: Prike Liang <Prike.Liang at amd.com>
> > ---
> > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++++++++
> > 1 file changed, 8 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index 7d4115d..3fcd90d 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -3983,6 +3983,14 @@ int amdgpu_device_resume(struct drm_device *dev, bool fbcon)
> > if (adev->in_s0ix)
> > amdgpu_gfx_state_change_set(adev, sGpuChangeState_D0Entry);
> >
> > + /* TODO: To keep an unconditional ASIC reset from hurting resume latency,
> > + * identify the cases that really need an ASIC reset during resume.
> > + * For the known issue, where system suspend is aborted after AMDGPU has
> > + * suspended, this might be detected by checking struct suspend_stats,
> > + * which would need to be exported first.
> > + */
> > + if (adev->gmc.xgmi.num_physical_nodes <= 1)
> > + amdgpu_asic_reset(adev);
>
> Newer dGPUs depend on PMFW to do reset and that is not loaded at this
> point. For some, there is a mini FW available which could technically handle a
> reset and some of the older ones depend on PSP. Strongly suggest to check
> all such cases before doing a reset here.
>
> Or, the safest at this point could be to do the reset only for APUs.
>
> Thanks,
> Lijo
>
Thanks for the input. Sorting out the reset method for the many dGPUs may take a lot of effort,
so for now let's handle only APUs first.
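As a rough illustration of the APU-only condition, here is a standalone C model of the decision, not driver code. `struct mock_adev`, `MOCK_IS_APU`, and `resume_should_reset()` are hypothetical stand-ins introduced only for this sketch; the actual change would presumably test the driver's APU flag on `adev->flags` next to the existing XGMI node check, which is an assumption and not part of the posted patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's APU chip flag (assumption). */
#define MOCK_IS_APU (1u << 0)

/* Hypothetical, reduced model of the device state the check looks at. */
struct mock_adev {
    uint32_t flags;              /* models adev->flags */
    unsigned num_physical_nodes; /* models adev->gmc.xgmi.num_physical_nodes */
};

/*
 * Decide whether resume should reset the ASIC:
 * - skip devices that are part of an XGMI hive (more than one node),
 *   as the v2 patch already does;
 * - otherwise reset only APUs, since dGPU reset paths may depend on
 *   firmware that is not loaded at this point in resume.
 */
static bool resume_should_reset(const struct mock_adev *adev)
{
    if (adev->num_physical_nodes > 1)
        return false;
    return (adev->flags & MOCK_IS_APU) != 0;
}
```

With this model, an APU outside a hive is reset on resume, while a dGPU or any device in a multi-node hive is left alone.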
> > /* post card */
> > if (amdgpu_device_need_post(adev)) {
> > r = amdgpu_device_asic_init(adev);
> >