[PATCH v2 2/2] drm/amdgpu: clean up the suspend_complete

Liang, Prike Prike.Liang at amd.com
Thu Oct 24 02:54:00 UTC 2024


[Public]

> From: Lazar, Lijo <Lijo.Lazar at amd.com>
> Sent: Wednesday, October 23, 2024 6:55 PM
> To: Liang, Prike <Prike.Liang at amd.com>; amd-gfx at lists.freedesktop.org
> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>
> Subject: Re: [PATCH v2 2/2] drm/amdgpu: clean up the suspend_complete
>
>
>
> On 10/14/2024 1:19 PM, Prike Liang wrote:
> > To check the status of S3 suspend completion, use the PM core
> > pm_suspend_global_flags bit(1) to detect S3 abort events. Therefore,
> > clean up the AMDGPU driver's private flag suspend_complete.
> >
> > Signed-off-by: Prike Liang <Prike.Liang at amd.com>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu.h     | 2 --
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 --
> >  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   | 4 ++--
> >  drivers/gpu/drm/amd/amdgpu/soc15.c      | 7 ++-----
> >  drivers/gpu/drm/amd/amdgpu/soc21.c      | 2 +-
> >  5 files changed, 5 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > index 48c9b9b06905..9b35763ae0a7 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > @@ -1111,8 +1111,6 @@ struct amdgpu_device {
> >     bool                            in_s3;
> >     bool                            in_s4;
> >     bool                            in_s0ix;
> > -   /* indicate amdgpu suspension status */
> > -   bool                            suspend_complete;
> >
> >     enum pp_mp1_state               mp1_state;
> >     struct amdgpu_doorbell_index doorbell_index; diff --git
> > a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > index 680e44fdee6e..78972151b970 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > @@ -2501,7 +2501,6 @@ static int amdgpu_pmops_suspend(struct device *dev)
> >     struct drm_device *drm_dev = dev_get_drvdata(dev);
> >     struct amdgpu_device *adev = drm_to_adev(drm_dev);
> >
> > -   adev->suspend_complete = false;
> >     if (amdgpu_acpi_is_s0ix_active(adev))
> >             adev->in_s0ix = true;
> >     else if (amdgpu_acpi_is_s3_active(adev)) @@ -2516,7 +2515,6 @@
> > static int amdgpu_pmops_suspend_noirq(struct device *dev)
> >     struct drm_device *drm_dev = dev_get_drvdata(dev);
> >     struct amdgpu_device *adev = drm_to_adev(drm_dev);
> >
> > -   adev->suspend_complete = true;
> >     if (amdgpu_acpi_should_gpu_reset(adev))
> >             return amdgpu_asic_reset(adev);
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > index be320d753507..ba8e66744376 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > @@ -3276,8 +3276,8 @@ static int gfx_v9_0_cp_gfx_start(struct amdgpu_device
> *adev)
> >      * confirmed that the APU gfx10/gfx11 needn't such update.
> >      */
> >     if (adev->flags & AMD_IS_APU &&
> > -                   adev->in_s3 && !adev->suspend_complete) {
> > -           DRM_INFO(" Will skip the CSB packet resubmit\n");
> > +                   adev->in_s3 && !pm_resume_via_firmware()) {
> > +           DRM_INFO("Will skip the CSB packet resubmit\n");
> >             return 0;
> >     }
> >     r = amdgpu_ring_alloc(ring, gfx_v9_0_get_csb_size(adev) + 4 + 3);
> > diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c
> > b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > index 12ff6cf568dc..d9d11131a744 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > @@ -584,13 +584,10 @@ static bool soc15_need_reset_on_resume(struct
> amdgpu_device *adev)
> >      *    performing pm core test.
> >      */
> >     if (adev->flags & AMD_IS_APU && adev->in_s3 &&
> > -                   !pm_resume_via_firmware()) {
> > -           adev->suspend_complete = false;
> > +                   !pm_resume_via_firmware())
> >             return true;
> > -   } else {
> > -           adev->suspend_complete = true;
> > +   else
> >             return false;
> > -   }
> >  }
> >
> >  static int soc15_asic_reset(struct amdgpu_device *adev) diff --git
> > a/drivers/gpu/drm/amd/amdgpu/soc21.c
> > b/drivers/gpu/drm/amd/amdgpu/soc21.c
> > index c4b950e75133..7a47a21ef00f 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/soc21.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/soc21.c
> > @@ -904,7 +904,7 @@ static bool soc21_need_reset_on_resume(struct
> amdgpu_device *adev)
> >      * 2) S3 suspend got aborted and TOS is active.
> >      */
> >     if (!(adev->flags & AMD_IS_APU) && adev->in_s3 &&
> > -       !adev->suspend_complete) {
> > +       !pm_resume_via_firmware()) {
>
> Looks like this will cover only ACPI based systems. Not sure if that assumption is
> valid for dGPU cases.
>
> Thanks,
> Lijo

Yes, the pm_set_resume_via_firmware() function is only called during the ACPI_STATE_S3 suspend process. However, ACPI-enabled systems are popular in the desktop world. If there are concerns about ACPI configuration, one option could be to check if the dGPU needs a reset by directly checking the SOL register. As far as I can see, when the dGPU completes its suspend process, the SOL value will remain zero until the dGPU is resumed. Conversely, in the case of a suspend abort, the SOL value will be non-zero.

Thanks,
Prike
>
> >             sol_reg1 = RREG32_SOC15(MP0, 0, regMP0_SMN_C2PMSG_81);
> >             msleep(100);
> >             sol_reg2 = RREG32_SOC15(MP0, 0, regMP0_SMN_C2PMSG_81);


More information about the amd-gfx mailing list