[PATCH 2/2] drm/amdgpu: fix pm notifier handling

Mario Limonciello mario.limonciello at amd.com
Fri May 2 20:37:06 UTC 2025


On 5/2/2025 3:32 PM, Alex Deucher wrote:
> On Fri, May 2, 2025 at 3:39 PM Mario Limonciello
> <mario.limonciello at amd.com> wrote:
>>
>> On 5/1/2025 3:09 PM, Alex Deucher wrote:
>>> Set the s3/s0ix and s4 flags in the pm notifier so that we can skip
>>> the resource evictions properly in pm prepare based on whether
>>> we are suspending or hibernating.  Drop the eviction as processes
>>> are not frozen at this time, so we can end up getting stuck trying
>>> to evict VRAM while applications continue to submit work which
>>> causes the buffers to get pulled back into VRAM.
>>>
>>> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4178
>>> Fixes: 2965e6355dcd ("drm/amd: Add Suspend/Hibernate notification callback support")
>>> Cc: Mario Limonciello <mario.limonciello at amd.com>
>>> Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 25 +++++++++++-----------
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    | 22 ++-----------------
>>>    2 files changed, 15 insertions(+), 32 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index 71d95f16067a4..d232e4a26d7bf 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -4974,28 +4974,29 @@ static int amdgpu_device_evict_resources(struct amdgpu_device *adev)
>>>     * @data: data
>>>     *
>>>     * This function is called when the system is about to suspend or hibernate.
>>> - * It is used to evict resources from the device before the system goes to
>>> - * sleep while there is still access to swap.
>>> + * It is used to set the appropriate flags so that eviction can be optimized
>>> + * in the pm prepare callback.
>>>     */
>>>    static int amdgpu_device_pm_notifier(struct notifier_block *nb, unsigned long mode,
>>>                                     void *data)
>>>    {
>>>        struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, pm_nb);
>>> -     int r;
>>>
>>>        switch (mode) {
>>>        case PM_HIBERNATION_PREPARE:
>>>                adev->in_s4 = true;
>>> -             fallthrough;
>>> +             break;
>>> +     case PM_POST_HIBERNATION:
>>> +             adev->in_s4 = false;
>>> +             break;
>>>        case PM_SUSPEND_PREPARE:
>>> -             r = amdgpu_device_evict_resources(adev);
>>> -             /*
>>> -              * This is considered non-fatal at this time because
>>> -              * amdgpu_device_prepare() will also fatally evict resources.
>>> -              * See https://gitlab.freedesktop.org/drm/amd/-/issues/3781
>>> -              */
>>> -             if (r)
>>> -                     drm_warn(adev_to_drm(adev), "Failed to evict resources, freeze active processes if problems occur: %d\n", r);
>>> +             if (amdgpu_acpi_is_s0ix_active(adev))
>>
>> I don't believe this is valid "this early".
>>
>> pm_suspend()
>> ->enter_state()
>> ->->suspend_prepare()
>> ->->-> Call notification chains for PM_SUSPEND_PREPARE
>> ->->suspend_devices_and_enter()
>> ->->-> Set pm_suspend_target_state
> 
> hmmm.  Is there a way to determine whether we are going into hibernate
> vs. suspend in the pm prepare function?  I guess we could set
> adev->in_s4 here and then check if in_s4 is set in pm prepare, and if
> not, then call this logic to set the suspend flags in the prepare
> callback.
> 

Yeah; I think setting in_s4 here makes a lot of sense, and then using that
as a hint later in the sequence.
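
Roughly what I have in mind, as a standalone sketch rather than the actual
patch. Everything prefixed with mock_ (and the EV_*/TARGET_* enums) is a
hypothetical stand-in for the real kernel types, and the s0ix check is
reduced to a comparison against the target state, which is an assumption
made only to show the ordering: the notifier records in_s4, and the
suspend flags are only derived in prepare, once the target state is known.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the real kernel or amdgpu definitions. */
enum mock_pm_event {
	EV_HIBERNATION_PREPARE,
	EV_POST_HIBERNATION,
	EV_SUSPEND_PREPARE,
	EV_POST_SUSPEND,
};

enum mock_target_state {
	TARGET_ON,	/* no suspend target chosen yet */
	TARGET_S2IDLE,	/* suspend-to-idle (s0ix) */
	TARGET_MEM,	/* suspend-to-RAM (S3) */
};

struct mock_adev {
	bool in_s4;
	bool in_s0ix;
	bool in_s3;
};

/* Models the global that is only set after the notifier chain has run. */
static enum mock_target_state mock_target_state = TARGET_ON;

/* Notifier: only record hibernation; the suspend target is unknown here. */
static void mock_pm_notifier(struct mock_adev *adev, enum mock_pm_event ev)
{
	switch (ev) {
	case EV_HIBERNATION_PREPARE:
		adev->in_s4 = true;
		break;
	case EV_POST_HIBERNATION:
		adev->in_s4 = false;
		break;
	default:
		/* Suspend events: nothing to decide this early. */
		break;
	}
}

/* Prepare callback: use in_s4 as the hint, otherwise derive suspend flags. */
static void mock_pm_prepare(struct mock_adev *adev)
{
	if (adev->in_s4)
		return;

	adev->in_s0ix = (mock_target_state == TARGET_S2IDLE);
	adev->in_s3 = (mock_target_state == TARGET_MEM);
}
```

With this ordering, a suspend that runs the notifier while the target is
still unset leaves all flags clear, and the later prepare call picks the
s0ix or S3 flag. A hibernate skips that logic entirely because in_s4 was
already set by the notifier.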



