[PATCH 2/2] drm/amdgpu: fix pm notifier handling
Mario Limonciello
mario.limonciello at amd.com
Fri May 2 20:37:06 UTC 2025
On 5/2/2025 3:32 PM, Alex Deucher wrote:
> On Fri, May 2, 2025 at 3:39 PM Mario Limonciello
> <mario.limonciello at amd.com> wrote:
>>
>> On 5/1/2025 3:09 PM, Alex Deucher wrote:
>>> Set the s3/s0ix and s4 flags in the pm notifier so that we can skip
>>> the resource evictions properly in pm prepare based on whether
>>> we are suspending or hibernating. Drop the eviction as processes
>>> are not frozen at this time, we we can end up getting stuck trying
>>> to evict VRAM while applications continue to submit work which
>>> causes the buffers to get pulled back into VRAM.
>>>
>>> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4178
>>> Fixes: 2965e6355dcd ("drm/amd: Add Suspend/Hibernate notification callback support")
>>> Cc: Mario Limonciello <mario.limonciello at amd.com>
>>> Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 25 +++++++++++-----------
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 22 ++-----------------
>>> 2 files changed, 15 insertions(+), 32 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index 71d95f16067a4..d232e4a26d7bf 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -4974,28 +4974,29 @@ static int amdgpu_device_evict_resources(struct amdgpu_device *adev)
>>> * @data: data
>>> *
>>> * This function is called when the system is about to suspend or hibernate.
>>> - * It is used to evict resources from the device before the system goes to
>>> - * sleep while there is still access to swap.
>>> + * It is used to set the appropriate flags so that eviction can be optimized
>>> + * in the pm prepare callback.
>>> */
>>> static int amdgpu_device_pm_notifier(struct notifier_block *nb, unsigned long mode,
>>> void *data)
>>> {
>>> struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, pm_nb);
>>> - int r;
>>>
>>> switch (mode) {
>>> case PM_HIBERNATION_PREPARE:
>>> adev->in_s4 = true;
>>> - fallthrough;
>>> + break;
>>> + case PM_POST_HIBERNATION:
>>> + adev->in_s4 = false;
>>> + break;
>>> case PM_SUSPEND_PREPARE:
>>> - r = amdgpu_device_evict_resources(adev);
>>> - /*
>>> - * This is considered non-fatal at this time because
>>> - * amdgpu_device_prepare() will also fatally evict resources.
>>> - * See https://gitlab.freedesktop.org/drm/amd/-/issues/3781
>>> - */
>>> - if (r)
>>> - drm_warn(adev_to_drm(adev), "Failed to evict resources, freeze active processes if problems occur: %d\n", r);
>>> + if (amdgpu_acpi_is_s0ix_active(adev))
>>
>> I don't believe this is valid "this early".
>>
>> pm_suspend()
>> ->enter_state()
>> ->->suspend_prepare()
>> ->->-> Call notification chains for PM_SUSPEND_PREPARE
>> ->->suspend_devices_and_enter()
>> ->->-> Set pm_suspend_target_state
>
> hmmm. Is there a way to determine whether we are going into hibernate
> vs. suspend in the pm prepare function? I guess we could set
> adev->in_s4 here and then check if in_s4 is set in pm prepare, and if
> not, then call this logic to set the suspend flags in the prepare
> callback.
>
Yeah; I think setting is_s4 here makes a lot of sense and then use that
as a hint later in the sequence.
More information about the amd-gfx
mailing list