[PATCH v4 1/3] drm/amd: Evict resources during PM ops prepare() callback
Mario Limonciello
mario.limonciello at amd.com
Thu Oct 5 14:59:33 UTC 2023
On 10/5/2023 09:39, Christian König wrote:
> Am 04.10.23 um 19:18 schrieb Mario Limonciello:
>> Linux PM core has a prepare() callback run before suspend.
>>
>> If the system is under high memory pressure, the resources may need
>> to be evicted into swap instead. If the storage backing for swap
>> is offlined during the suspend() step then such a call may fail.
>>
>> So duplicate this step into prepare() to move evict majority of
>> resources while leaving all existing steps that put the GPU into a
>> low power state in suspend().
>>
>> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2362
>> Signed-off-by: Mario Limonciello <mario.limonciello at amd.com>
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 26 +++++++++++++++++++++-
>> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 7 +++---
>> 3 files changed, 30 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index d23fb4b5ad95..6643d0ed6b1b 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -1413,6 +1413,7 @@ void amdgpu_driver_postclose_kms(struct
>> drm_device *dev,
>> void amdgpu_driver_release_kms(struct drm_device *dev);
>> int amdgpu_device_ip_suspend(struct amdgpu_device *adev);
>> +int amdgpu_device_prepare(struct drm_device *dev);
>> int amdgpu_device_suspend(struct drm_device *dev, bool fbcon);
>> int amdgpu_device_resume(struct drm_device *dev, bool fbcon);
>> u32 amdgpu_get_vblank_counter_kms(struct drm_crtc *crtc);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index bad2b5577e96..67acee569c08 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -4259,6 +4259,31 @@ static int amdgpu_device_evict_resources(struct
>> amdgpu_device *adev)
>> /*
>> * Suspend & resume.
>> */
>> +/**
>> + * amdgpu_device_prepare - prepare for device suspend
>> + *
>> + * @dev: drm dev pointer
>> + *
>> + * Prepare to put the hw in the suspend state (all asics).
>> + * Returns 0 for success or an error on failure.
>> + * Called at driver suspend.
>> + */
>> +int amdgpu_device_prepare(struct drm_device *dev)
>> +{
>> + struct amdgpu_device *adev = drm_to_adev(dev);
>> + int r;
>> +
>> + if (dev->switch_power_state == DRM_SWITCH_POWER_OFF)
>> + return 0;
>> +
>> + /* Evict the majority of BOs before starting suspend sequence */
>> + r = amdgpu_device_evict_resources(adev);
>> + if (r)
>> + return r;
>> +
>> + return 0;
>> +}
>> +
>> /**
>> * amdgpu_device_suspend - initiate device suspend
>> *
>> @@ -4279,7 +4304,6 @@ int amdgpu_device_suspend(struct drm_device
>> *dev, bool fbcon)
>> adev->in_suspend = true;
>> - /* Evict the majority of BOs before grabbing the full access */
>> r = amdgpu_device_evict_resources(adev);
>> if (r)
>> return r;
>
> I would just completely drop this extra amdgpu_device_evict_resources()
> call now.
>
> We have a second call which is used to evacuate firmware etc... after
> the hw has been shut down. That one can't move, but also shouldn't
> allocate that much memory.
>
The problem is that amdgpu_device_suspend() is also called from
amdgpu_switcheroo_set_state() as well as a bunch of pmops sequences that
I don't expect call prepare() like poweroff().
I would think we still want to evict resources at the beginning of
amdgpu_device_suspend() for all of those.
So it's an extra call for the prepare() path but it should be harmless.
> Regards,
> Christian.
>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> index e3471293846f..175167582db0 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> @@ -2425,8 +2425,9 @@ static int amdgpu_pmops_prepare(struct device *dev)
>> /* Return a positive number here so
>> * DPM_FLAG_SMART_SUSPEND works properly
>> */
>> - if (amdgpu_device_supports_boco(drm_dev))
>> - return pm_runtime_suspended(dev);
>> + if (amdgpu_device_supports_boco(drm_dev) &&
>> + pm_runtime_suspended(dev))
>> + return 1;
>> /* if we will not support s3 or s2i for the device
>> * then skip suspend
>> @@ -2435,7 +2436,7 @@ static int amdgpu_pmops_prepare(struct device *dev)
>> !amdgpu_acpi_is_s3_active(adev))
>> return 1;
>> - return 0;
>> + return amdgpu_device_prepare(drm_dev);
>> }
>> static void amdgpu_pmops_complete(struct device *dev)
>
More information about the amd-gfx
mailing list