[PATCH v2 2/2] drm/amdgpu: clean up the suspend_complete

Lazar, Lijo lijo.lazar at amd.com
Thu Oct 24 03:38:43 UTC 2024



On 10/24/2024 8:24 AM, Liang, Prike wrote:
> [Public]
> 
>> From: Lazar, Lijo <Lijo.Lazar at amd.com>
>> Sent: Wednesday, October 23, 2024 6:55 PM
>> To: Liang, Prike <Prike.Liang at amd.com>; amd-gfx at lists.freedesktop.org
>> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>
>> Subject: Re: [PATCH v2 2/2] drm/amdgpu: clean up the suspend_complete
>>
>>
>>
>> On 10/14/2024 1:19 PM, Prike Liang wrote:
>>> To check the status of S3 suspend completion, use the PM core
>>> pm_suspend_global_flags bit(1) to detect S3 abort events. Therefore,
>>> clean up the AMDGPU driver's private flag suspend_complete.
>>>
>>> Signed-off-by: Prike Liang <Prike.Liang at amd.com>
>>> ---
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu.h     | 2 --
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 --
>>>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   | 4 ++--
>>>  drivers/gpu/drm/amd/amdgpu/soc15.c      | 7 ++-----
>>>  drivers/gpu/drm/amd/amdgpu/soc21.c      | 2 +-
>>>  5 files changed, 5 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> index 48c9b9b06905..9b35763ae0a7 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> @@ -1111,8 +1111,6 @@ struct amdgpu_device {
>>>     bool                            in_s3;
>>>     bool                            in_s4;
>>>     bool                            in_s0ix;
>>> -   /* indicate amdgpu suspension status */
>>> -   bool                            suspend_complete;
>>>
>>>     enum pp_mp1_state               mp1_state;
>>>     struct amdgpu_doorbell_index doorbell_index; diff --git
>>> a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> index 680e44fdee6e..78972151b970 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> @@ -2501,7 +2501,6 @@ static int amdgpu_pmops_suspend(struct device *dev)
>>>     struct drm_device *drm_dev = dev_get_drvdata(dev);
>>>     struct amdgpu_device *adev = drm_to_adev(drm_dev);
>>>
>>> -   adev->suspend_complete = false;
>>>     if (amdgpu_acpi_is_s0ix_active(adev))
>>>             adev->in_s0ix = true;
>>>     else if (amdgpu_acpi_is_s3_active(adev)) @@ -2516,7 +2515,6 @@
>>> static int amdgpu_pmops_suspend_noirq(struct device *dev)
>>>     struct drm_device *drm_dev = dev_get_drvdata(dev);
>>>     struct amdgpu_device *adev = drm_to_adev(drm_dev);
>>>
>>> -   adev->suspend_complete = true;
>>>     if (amdgpu_acpi_should_gpu_reset(adev))
>>>             return amdgpu_asic_reset(adev);
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>> b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>> index be320d753507..ba8e66744376 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>> @@ -3276,8 +3276,8 @@ static int gfx_v9_0_cp_gfx_start(struct amdgpu_device
>> *adev)
>>>      * confirmed that the APU gfx10/gfx11 needn't such update.
>>>      */
>>>     if (adev->flags & AMD_IS_APU &&
>>> -                   adev->in_s3 && !adev->suspend_complete) {
>>> -           DRM_INFO(" Will skip the CSB packet resubmit\n");
>>> +                   adev->in_s3 && !pm_resume_via_firmware()) {
>>> +           DRM_INFO("Will skip the CSB packet resubmit\n");
>>>             return 0;
>>>     }
>>>     r = amdgpu_ring_alloc(ring, gfx_v9_0_get_csb_size(adev) + 4 + 3);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c
>>> b/drivers/gpu/drm/amd/amdgpu/soc15.c
>>> index 12ff6cf568dc..d9d11131a744 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
>>> @@ -584,13 +584,10 @@ static bool soc15_need_reset_on_resume(struct
>> amdgpu_device *adev)
>>>      *    performing pm core test.
>>>      */
>>>     if (adev->flags & AMD_IS_APU && adev->in_s3 &&
>>> -                   !pm_resume_via_firmware()) {
>>> -           adev->suspend_complete = false;
>>> +                   !pm_resume_via_firmware())
>>>             return true;
>>> -   } else {
>>> -           adev->suspend_complete = true;
>>> +   else
>>>             return false;
>>> -   }
>>>  }
>>>
>>>  static int soc15_asic_reset(struct amdgpu_device *adev) diff --git
>>> a/drivers/gpu/drm/amd/amdgpu/soc21.c
>>> b/drivers/gpu/drm/amd/amdgpu/soc21.c
>>> index c4b950e75133..7a47a21ef00f 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/soc21.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/soc21.c
>>> @@ -904,7 +904,7 @@ static bool soc21_need_reset_on_resume(struct
>> amdgpu_device *adev)
>>>      * 2) S3 suspend got aborted and TOS is active.
>>>      */
>>>     if (!(adev->flags & AMD_IS_APU) && adev->in_s3 &&
>>> -       !adev->suspend_complete) {
>>> +       !pm_resume_via_firmware()) {
>>
>> Looks like this will cover only ACPI based systems. Not sure if that assumption is
>> valid for dGPU cases.
>>
>> Thanks,
>> Lijo
> 
> Yes, the pm_set_resume_via_firmware() function is only called during the ACPI_STATE_S3 suspend process. However, ACPI-enabled systems are popular in the desktop world. If there are concerns about ACPI configuration, one option could be to check if the dGPU needs a reset by directly checking the SOL register. As far as I can see, when the dGPU completes its suspend process, the SOL value will remain zero until the dGPU is resumed. Conversely, in the case of a suspend abort, the SOL value will be non-zero.
> 

in_s3 is set for dGPU in case of s0ix as well. Probably, that's the only
case where need the flag to avoid unnecessary reset. Otherwise SOL check
could be sufficient.

Thanks,
Lijo

> Thanks,
> Prike
>>
>>>             sol_reg1 = RREG32_SOC15(MP0, 0, regMP0_SMN_C2PMSG_81);
>>>             msleep(100);
>>>             sol_reg2 = RREG32_SOC15(MP0, 0, regMP0_SMN_C2PMSG_81);


More information about the amd-gfx mailing list