[PATCH 1/2] drm/amdgpu: correct the S3 abort check condition

Liang, Prike Prike.Liang at amd.com
Sat Oct 12 08:03:48 UTC 2024


[AMD Official Use Only - AMD Internal Distribution Only]

> From: Lazar, Lijo <Lijo.Lazar at amd.com>
> Sent: Friday, October 11, 2024 5:21 PM
> To: Liang, Prike <Prike.Liang at amd.com>; amd-gfx at lists.freedesktop.org
> Subject: Re: [PATCH 1/2] drm/amdgpu: correct the S3 abort check condition
>
>
>
> On 10/11/2024 11:31 AM, Prike Liang wrote:
> > In the normal S3 entry, the TOS cycle counter is not reset during BIOS
> > execution the _S3 method, so it doesn't determine whether the _S3
> > method is executed exactly.
> > Howerver, the PM core performs the S3 suspend will set the
> > PM_SUSPEND_FLAG_FW_RESUME bit if all the devices suspend successfully.
> > Therefore, drivers can check the pm_suspend_global_flags bit(1) to
> > detect the S3 suspend abort event.
> >
> > Fixes: 4d58c599df75 ("drm/amdgpu: update suspend status for aborting
> > from deeper suspend")
> > Signed-off-by: Prike Liang <Prike.Liang at amd.com>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/soc15.c | 26 +++++++++++++++++---------
> >  1 file changed, 17 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c
> > b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > index 619933f252aa..d07cf8137bd7 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > @@ -578,16 +578,18 @@ soc15_asic_reset_method(struct amdgpu_device
> > *adev)
> >
> >  static bool soc15_need_reset_on_resume(struct amdgpu_device *adev)  {
> > -   u32 sol_reg;
> > -
> > -   sol_reg = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_81);
> > +   bool suspend_complete;
> >
> >     /* Will reset for the following suspend abort cases.
> > -    * 1) Only reset limit on APU side, dGPU hasn't checked yet.
> > -    * 2) S3 suspend abort and TOS already launched.
> > +    * 1) Only reset on APU side, dGPU hasn't checked yet.
> > +    * 2) S3 suspend aborted in the normal S3 suspend or
> > +    *    performing pm core test.
> >      */
> > +   suspend_complete = !!(pm_suspend_global_flags &
> > +                           PM_SUSPEND_FLAG_FW_RESUME);
> > +
>
> Is this expected to be done after hibernate also? There is a function to check this -
> pm_resume_via_firmware()
>
Yes, it also calls pm_set_resume_via_firmware() in hibernate, but we haven't received any S4 suspend abort issues yet,
and this quirk is currently limited to S3. Initially, I thought of using pm_suspend_global_flags to help track the flag value,
but now it seems that using pm_resume_via_firmware() will be neater.

>
> >     if (adev->flags & AMD_IS_APU && adev->in_s3 &&
> > -                   sol_reg) {
> > +                   !suspend_complete) {
> >             adev->suspend_complete = false;
> >             return true;
> >     } else {
> > @@ -603,11 +605,17 @@ static int soc15_asic_reset(struct amdgpu_device
> *adev)
> >      * successfully. So now, temporarily enable it for the
> >      * S3 suspend abort case.
> >      */
> > -   if (((adev->apu_flags & AMD_APU_IS_RAVEN) ||
> > -       (adev->apu_flags & AMD_APU_IS_RAVEN2)) &&
> > -           !soc15_need_reset_on_resume(adev))
> > +
> > +   if ((adev->apu_flags & AMD_APU_IS_PICASSO ||
> > +                   !(adev->apu_flags & AMD_APU_IS_RAVEN)) &&
> > +                   soc15_need_reset_on_resume(adev))
> > +           goto asic_reset;
> > +
> > +   if ((adev->apu_flags & AMD_APU_IS_RAVEN) ||
> > +                   (adev->apu_flags & AMD_APU_IS_RAVEN2))
> >             return 0;
>
> If this check happens first, then !(adev->apu_flags & AMD_APU_IS_RAVEN) and
> goto/label can be avoided.
>
> Thanks,
> Lijo

There are various Raven variants such as Raven2 and PICASSO, all of which are marked as Raven.
However, here we only want to filter out the original Raven device that does not support mode2 reset.
Apart from the original Raven device, other gfx9 APU devices require a mode2 reset in S3 abort resume
cases.

Thanks,
Prike
> >
> > +asic_reset:
> >     switch (soc15_asic_reset_method(adev)) {
> >     case AMD_RESET_METHOD_PCI:
> >             dev_info(adev->dev, "PCI reset\n");


More information about the amd-gfx mailing list