[PATCH v3 1/4] drm/amd: Add support for prepare() and complete() callbacks

Mario Limonciello mario.limonciello at amd.com
Wed Oct 4 03:39:26 UTC 2023


On 10/3/2023 16:22, Deucher, Alexander wrote:
> [Public]
> 
>> -----Original Message-----
>> From: Limonciello, Mario <Mario.Limonciello at amd.com>
>> Sent: Tuesday, October 3, 2023 5:17 PM
>> To: Deucher, Alexander <Alexander.Deucher at amd.com>; amd-
>> gfx at lists.freedesktop.org
>> Cc: Wentland, Harry <Harry.Wentland at amd.com>
>> Subject: Re: [PATCH v3 1/4] drm/amd: Add support for prepare() and
>> complete() callbacks
>>
>> On 10/3/2023 16:11, Deucher, Alexander wrote:
>>> [Public]
>>>
>>>> -----Original Message-----
>>>> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of
>>>> Mario Limonciello
>>>> Sent: Tuesday, October 3, 2023 4:55 PM
>>>> To: amd-gfx at lists.freedesktop.org
>>>> Cc: Wentland, Harry <Harry.Wentland at amd.com>; Limonciello, Mario
>>>> <Mario.Limonciello at amd.com>
>>>> Subject: [PATCH v3 1/4] drm/amd: Add support for prepare() and
>>>> complete() callbacks
>>>>
>>>> Linux PM core has a prepare() callback run before suspend and
>>>> complete() callback ran after resume() for devices to use.  Add
>>>> plumbing to bring
>>>> prepare() to amdgpu.
>>>>
>>>> The idea with the new vfuncs for amdgpu is that all IP blocks that
>>>> memory allocations during suspend should do the allocation from this
>>>> call instead of the suspend() callback.
>>>>
>>>> By moving the allocations to prepare() the system suspend will be
>>>> failed before any IP block has done any suspend code.
>>>>
>>>> If the suspend fails, then do any cleanups in the complete() callback.
>>>>
>>>> Signed-off-by: Mario Limonciello <mario.limonciello at amd.com>
>>>> ---
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  2 ++
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 39
>>>> ++++++++++++++++++++--
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    | 11 +++---
>>>>    3 files changed, 46 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> index 73e825d20259..5d651552822c 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> @@ -1415,6 +1415,8 @@ void amdgpu_driver_postclose_kms(struct
>>>> drm_device *dev,  void amdgpu_driver_release_kms(struct drm_device
>>>> *dev);
>>>>
>>>>    int amdgpu_device_ip_suspend(struct amdgpu_device *adev);
>>>> +int amdgpu_device_prepare(struct drm_device *dev); void
>>>> +amdgpu_device_complete(struct drm_device *dev);
>>>>    int amdgpu_device_suspend(struct drm_device *dev, bool fbcon);  int
>>>> amdgpu_device_resume(struct drm_device *dev, bool fbcon);
>>>>    u32 amdgpu_get_vblank_counter_kms(struct drm_crtc *crtc); diff
>>>> --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> index bad2b5577e96..f53cf675c3ce 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> @@ -4259,6 +4259,43 @@ static int
>>>> amdgpu_device_evict_resources(struct
>>>> amdgpu_device *adev)
>>>>    /*
>>>>     * Suspend & resume.
>>>>     */
>>>> +/**
>>>> + * amdgpu_device_prepare - prepare for device suspend
>>>> + *
>>>> + * @dev: drm dev pointer
>>>> + *
>>>> + * Prepare to put the hw in the suspend state (all asics).
>>>> + * Returns 0 for success or an error on failure.
>>>> + * Called at driver suspend.
>>>> + */
>>>> +int amdgpu_device_prepare(struct drm_device *dev) {
>>>> +     struct amdgpu_device *adev = drm_to_adev(dev);
>>>> +     int r;
>>>> +
>>>> +     if (dev->switch_power_state == DRM_SWITCH_POWER_OFF)
>>>> +             return 0;
>>>> +
>>>> +     adev->in_suspend = true;
>>>> +
>>>> +     return 0;
>>>> +}
>>>> +
>>>> +/**
>>>> + * amdgpu_device_complete - complete the device after resume
>>>> + *
>>>> + * @dev: drm dev pointer
>>>> + *
>>>> + * Clean up any actions that the prepare step did.
>>>> + * Called after driver resume.
>>>> + */
>>>> +void amdgpu_device_complete(struct drm_device *dev) {
>>>> +     struct amdgpu_device *adev = drm_to_adev(dev);
>>>> +
>>>> +     adev->in_suspend = false;
>>>> +}
>>>> +
>>>>    /**
>>>>     * amdgpu_device_suspend - initiate device suspend
>>>>     *
>>>> @@ -4277,8 +4314,6 @@ int amdgpu_device_suspend(struct drm_device
>>>> *dev, bool fbcon)
>>>>         if (dev->switch_power_state == DRM_SWITCH_POWER_OFF)
>>>>                 return 0;
>>>>
>>>> -     adev->in_suspend = true;
>>>> -
>>>
>>> We also set this to false in amdgpu_device_resume() so that should be fixed
>> up as well.  But, I'm not sure we want to move this out of
>> amdgpu_device_suspend().  There are places we use
>> amdgpu_device_suspend/resume() outside of pmops that also rely on these
>> being set.  Those places may need to be fixed up if we do.  IIRC, the switcheroo
>> code uses this.
>>
>> The big reason that I moved it from suspend() to prepare() was so that
>> amdgpu_device_evict_resources() was called with the context of it being set.
>>
>> My thought process:
>> 0) prepare() sets all the time
>> 1) If prepare() fails complete() clears it.
>> 2) If prepare() succeeds it remains set for suspend()
>> 3) If suspend() succeeds it gets cleared at resume()
>> 4) If resume() failed for some reason, it's cleared by complete().
>>
>> Does it actually matter that it's set while evicting resources?
> 
> Shouldn't matter for evicting resources.  We even have debugfs nodes you can access to forcibly evict resources at runtime for testing memory pressure.

Then in that case I think what I'll do is put an extra call for 
amdgpu_device_evict_resources() in the prepare callback.

It shouldn't do any harm to call three times in the suspend sequence 
instead of two.

> 
> Alex
> 
>> If so, maybe I'll just have both prepare() and suspend() set it universally and
>> both resume() and complete() clear it universally.
>>
>>
>>>
>>> Alex
>>>
>>>>         /* Evict the majority of BOs before grabbing the full access */
>>>>         r = amdgpu_device_evict_resources(adev);
>>>>         if (r)
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> index e3471293846f..4c6fb852516a 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> @@ -2425,8 +2425,9 @@ static int amdgpu_pmops_prepare(struct device
>>>> *dev)
>>>>         /* Return a positive number here so
>>>>          * DPM_FLAG_SMART_SUSPEND works properly
>>>>          */
>>>> -     if (amdgpu_device_supports_boco(drm_dev))
>>>> -             return pm_runtime_suspended(dev);
>>>> +     if (amdgpu_device_supports_boco(drm_dev) &&
>>>> +         pm_runtime_suspended(dev))
>>>> +             return 1;
>>>>
>>>>         /* if we will not support s3 or s2i for the device
>>>>          *  then skip suspend
>>>> @@ -2435,12 +2436,14 @@ static int amdgpu_pmops_prepare(struct
>> device
>>>> *dev)
>>>>             !amdgpu_acpi_is_s3_active(adev))
>>>>                 return 1;
>>>>
>>>> -     return 0;
>>>> +     return amdgpu_device_prepare(drm_dev);
>>>>    }
>>>>
>>>>    static void amdgpu_pmops_complete(struct device *dev)  {
>>>> -     /* nothing to do */
>>>> +     struct drm_device *drm_dev = dev_get_drvdata(dev);
>>>> +
>>>> +     amdgpu_device_complete(drm_dev);
>>>>    }
>>>>
>>>>    static int amdgpu_pmops_suspend(struct device *dev)
>>>> --
>>>> 2.34.1
>>>
> 



More information about the amd-gfx mailing list