[PATCH] drm/amdgpu: update suspend status for aborting from deeper suspend

Alex Deucher alexdeucher at gmail.com
Mon Sep 9 15:23:24 UTC 2024


On Mon, Sep 9, 2024 at 8:58 AM Liang, Prike <Prike.Liang at amd.com> wrote:
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
>
> Previously, the S3 process aborted before calling the noirq suspend, and this issue was successfully sorted by checking the suspend_complete flag. However, there are now some S3 suspend cases, such as pm_test platform/core mode, which abort the S3 process after the noirq suspend. In these cases of abort, the issue cannot be sorted out by setting the suspend_complete flag in the noirq suspend callback, and it’s fine to use the MP0 SOL register directly to determine whether to reset the GPU on resume. However, on the GFX9 series, the driver still needs to rely on the suspend_complete flag to determine whether it needs to skip reprogramming the clear state register values during resume from suspend abort cases, so now it sounds that the suspend_complete flag cannot be completely removed.

Can we just set the suspend_complete flag based on the SOL register
rather than based on what functions have been called?  Maybe as a
future cleanup?  This logic seems fragile and I'm worried it will get
accidently broken.  For now the patch is:
Acked-by: Alex Deucher <alexander.deucher at amd.com>

Alex

>
>
>
> Thanks,
>
> Prikec
>
>
>
> From: Deucher, Alexander <Alexander.Deucher at amd.com>
> Sent: Saturday, September 7, 2024 1:34 AM
> To: Liang, Prike <Prike.Liang at amd.com>; amd-gfx at lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: update suspend status for aborting from deeper suspend
>
>
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
>
>
> Can you elaborate on how this fails?  Seems like maybe we should just get rid of adev->suspend_complete and just check the MP0 SOL register to determine whether or not we need to reset the GPU on resume.
>
>
>
> Alex
>
>
>
> ________________________________
>
> From: Liang, Prike <Prike.Liang at amd.com>
> Sent: Thursday, September 5, 2024 3:36 AM
> To: amd-gfx at lists.freedesktop.org <amd-gfx at lists.freedesktop.org>
> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>
> Subject: RE: [PATCH] drm/amdgpu: update suspend status for aborting from deeper suspend
>
>
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> According to the ChromeOS team test, this patch can resolve the S3 suspend abort from deeper sleep, which occurs when suspension aborts after calling the noirq suspend and before executing the _S3 and turning off the power rail.
>
> Could this patch get a review or acknowledgment?
>
> Thanks,
> Prike
>
> > -----Original Message-----
> > From: Liang, Prike <Prike.Liang at amd.com>
> > Sent: Monday, September 2, 2024 4:13 PM
> > To: amd-gfx at lists.freedesktop.org
> > Cc: Deucher, Alexander <Alexander.Deucher at amd.com>; Liang, Prike
> > <Prike.Liang at amd.com>
> > Subject: [PATCH] drm/amdgpu: update suspend status for aborting from
> > deeper suspend
> >
> > There're some other suspend abort cases which can call the noirq suspend
> > except for executing _S3 method. In those cases need to process as
> > incomplete suspendsion.
> >
> > Signed-off-by: Prike Liang <Prike.Liang at amd.com>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/soc15.c | 10 ++++++----
> >  1 file changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c
> > b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > index 8d16dacdc172..cf701bb8fc79 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > @@ -587,11 +587,13 @@ static bool soc15_need_reset_on_resume(struct
> > amdgpu_device *adev)
> >        * 2) S3 suspend abort and TOS already launched.
> >        */
> >       if (adev->flags & AMD_IS_APU && adev->in_s3 &&
> > -                     !adev->suspend_complete &&
> > -                     sol_reg)
> > +                     sol_reg) {
> > +             adev->suspend_complete = false;
> >               return true;
> > -
> > -     return false;
> > +     } else {
> > +             adev->suspend_complete = true;
> > +             return false;
> > +     }
> >  }
> >
> >  static int soc15_asic_reset(struct amdgpu_device *adev)
> > --
> > 2.34.1


More information about the amd-gfx mailing list