[PATCH v3 1/4] drm/amd: Add support for prepare() and complete() callbacks
Mario Limonciello
mario.limonciello at amd.com
Wed Oct 4 03:39:26 UTC 2023
On 10/3/2023 16:22, Deucher, Alexander wrote:
> [Public]
>
>> -----Original Message-----
>> From: Limonciello, Mario <Mario.Limonciello at amd.com>
>> Sent: Tuesday, October 3, 2023 5:17 PM
>> To: Deucher, Alexander <Alexander.Deucher at amd.com>; amd-
>> gfx at lists.freedesktop.org
>> Cc: Wentland, Harry <Harry.Wentland at amd.com>
>> Subject: Re: [PATCH v3 1/4] drm/amd: Add support for prepare() and
>> complete() callbacks
>>
>> On 10/3/2023 16:11, Deucher, Alexander wrote:
>>> [Public]
>>>
>>>> -----Original Message-----
>>>> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of
>>>> Mario Limonciello
>>>> Sent: Tuesday, October 3, 2023 4:55 PM
>>>> To: amd-gfx at lists.freedesktop.org
>>>> Cc: Wentland, Harry <Harry.Wentland at amd.com>; Limonciello, Mario
>>>> <Mario.Limonciello at amd.com>
>>>> Subject: [PATCH v3 1/4] drm/amd: Add support for prepare() and
>>>> complete() callbacks
>>>>
>>>> Linux PM core has a prepare() callback run before suspend and
>>>> complete() callback ran after resume() for devices to use. Add
>>>> plumbing to bring
>>>> prepare() to amdgpu.
>>>>
>>>> The idea with the new vfuncs for amdgpu is that all IP blocks that
>>>> memory allocations during suspend should do the allocation from this
>>>> call instead of the suspend() callback.
>>>>
>>>> By moving the allocations to prepare() the system suspend will be
>>>> failed before any IP block has done any suspend code.
>>>>
>>>> If the suspend fails, then do any cleanups in the complete() callback.
>>>>
>>>> Signed-off-by: Mario Limonciello <mario.limonciello at amd.com>
>>>> ---
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 39
>>>> ++++++++++++++++++++--
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 11 +++---
>>>> 3 files changed, 46 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> index 73e825d20259..5d651552822c 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> @@ -1415,6 +1415,8 @@ void amdgpu_driver_postclose_kms(struct
>>>> drm_device *dev, void amdgpu_driver_release_kms(struct drm_device
>>>> *dev);
>>>>
>>>> int amdgpu_device_ip_suspend(struct amdgpu_device *adev);
>>>> +int amdgpu_device_prepare(struct drm_device *dev); void
>>>> +amdgpu_device_complete(struct drm_device *dev);
>>>> int amdgpu_device_suspend(struct drm_device *dev, bool fbcon); int
>>>> amdgpu_device_resume(struct drm_device *dev, bool fbcon);
>>>> u32 amdgpu_get_vblank_counter_kms(struct drm_crtc *crtc); diff
>>>> --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> index bad2b5577e96..f53cf675c3ce 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> @@ -4259,6 +4259,43 @@ static int
>>>> amdgpu_device_evict_resources(struct
>>>> amdgpu_device *adev)
>>>> /*
>>>> * Suspend & resume.
>>>> */
>>>> +/**
>>>> + * amdgpu_device_prepare - prepare for device suspend
>>>> + *
>>>> + * @dev: drm dev pointer
>>>> + *
>>>> + * Prepare to put the hw in the suspend state (all asics).
>>>> + * Returns 0 for success or an error on failure.
>>>> + * Called at driver suspend.
>>>> + */
>>>> +int amdgpu_device_prepare(struct drm_device *dev) {
>>>> + struct amdgpu_device *adev = drm_to_adev(dev);
>>>> + int r;
>>>> +
>>>> + if (dev->switch_power_state == DRM_SWITCH_POWER_OFF)
>>>> + return 0;
>>>> +
>>>> + adev->in_suspend = true;
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> +/**
>>>> + * amdgpu_device_complete - complete the device after resume
>>>> + *
>>>> + * @dev: drm dev pointer
>>>> + *
>>>> + * Clean up any actions that the prepare step did.
>>>> + * Called after driver resume.
>>>> + */
>>>> +void amdgpu_device_complete(struct drm_device *dev) {
>>>> + struct amdgpu_device *adev = drm_to_adev(dev);
>>>> +
>>>> + adev->in_suspend = false;
>>>> +}
>>>> +
>>>> /**
>>>> * amdgpu_device_suspend - initiate device suspend
>>>> *
>>>> @@ -4277,8 +4314,6 @@ int amdgpu_device_suspend(struct drm_device
>>>> *dev, bool fbcon)
>>>> if (dev->switch_power_state == DRM_SWITCH_POWER_OFF)
>>>> return 0;
>>>>
>>>> - adev->in_suspend = true;
>>>> -
>>>
>>> We also set this to false in amdgpu_device_resume() so that should be fixed
>> up as well. But, I'm not sure we want to move this out of
>> amdgpu_device_suspend(). There are places we use
>> amdgpu_device_suspend/resume() outside of pmops that also rely on these
>> being set. Those places may need to be fixed up if we do. IIRC, the switcheroo
>> code uses this.
>>
>> The big reason that I moved it from suspend() to prepare() was so that
>> amdgpu_device_evict_resources() was called with the context of it being set.
>>
>> My thought process:
>> 0) prepare() sets all the time
>> 1) If prepare() fails complete() clears it.
>> 2) If prepare() succeeds it remains set for suspend()
>> 3) If suspend() succeeds it gets cleared at resume()
>> 4) If resume() failed for some reason, it's cleared by complete().
>>
>> Does it actually matter that it's set while evicting resources?
>
> Shouldn't matter for evicting resources. We even have debugfs nodes you can access to forcibly evict resources at runtime for testing memory pressure.
Then in that case I think what I'll do is put an extra call for
amdgpu_device_evict_resources() in the prepare callback.
It shouldn't do any harm to call three times in the suspend sequence
instead of two.
>
> Alex
>
>> If so, maybe I'll just have both prepare() and suspend() set it universally and
>> both resume() and complete() clear it universally.
>>
>>
>>>
>>> Alex
>>>
>>>> /* Evict the majority of BOs before grabbing the full access */
>>>> r = amdgpu_device_evict_resources(adev);
>>>> if (r)
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> index e3471293846f..4c6fb852516a 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> @@ -2425,8 +2425,9 @@ static int amdgpu_pmops_prepare(struct device
>>>> *dev)
>>>> /* Return a positive number here so
>>>> * DPM_FLAG_SMART_SUSPEND works properly
>>>> */
>>>> - if (amdgpu_device_supports_boco(drm_dev))
>>>> - return pm_runtime_suspended(dev);
>>>> + if (amdgpu_device_supports_boco(drm_dev) &&
>>>> + pm_runtime_suspended(dev))
>>>> + return 1;
>>>>
>>>> /* if we will not support s3 or s2i for the device
>>>> * then skip suspend
>>>> @@ -2435,12 +2436,14 @@ static int amdgpu_pmops_prepare(struct
>> device
>>>> *dev)
>>>> !amdgpu_acpi_is_s3_active(adev))
>>>> return 1;
>>>>
>>>> - return 0;
>>>> + return amdgpu_device_prepare(drm_dev);
>>>> }
>>>>
>>>> static void amdgpu_pmops_complete(struct device *dev) {
>>>> - /* nothing to do */
>>>> + struct drm_device *drm_dev = dev_get_drvdata(dev);
>>>> +
>>>> + amdgpu_device_complete(drm_dev);
>>>> }
>>>>
>>>> static int amdgpu_pmops_suspend(struct device *dev)
>>>> --
>>>> 2.34.1
>>>
>
More information about the amd-gfx
mailing list