[PATCH v4 2/2] drm/amd: Add Suspend/Hibernate notification callback support

Mario Limonciello superm1 at kernel.org
Tue Nov 26 16:09:15 UTC 2024



On 11/26/24 09:37, Lazar, Lijo wrote:
> 
> 
> On 11/26/2024 7:50 PM, Mario Limonciello wrote:
>>>>> Based on https://gitlab.freedesktop.org/drm/amd/-/issues/3781
>>>>>
>>>>> What if this callback takes care only of suspend case and leaves the
>>>>> hibernate case to dpm_prepare callback?
>>>>
>>>> Then hibernate would fail under memory pressure.
>>>>
>>>> My take is this failure with hibernate is a userspace problem (whether
>>>> userspace decides to freeze the tasks or not).  I think it's better that
>>>> we /try/ to do the eviction and notify them if userspace should be
>>>> changed.
>>>>
>>>
>>> Hmm, the logic is kind of inconsistent now.
>>>
>>> For dGPUs, evict is required for s0ix, s3, s4 and attempted twice.
>>> For APUs, evict is required for s4, but attempted only once.
>>
>> Well the logic was inconsistent before already.
>>
>> But FWIW it will still be attempted a second time for APUs.
>> The call sequence is:
>>
>> 1) PM notifier
>> amdgpu_device_pm_notifier()
>> ->amdgpu_device_evict_resources()
>>
>> Eviction for dGPU and APU in S4.
>>
>> 2) Prepare callback
>> amdgpu_pmops_prepare()
>> ->amdgpu_device_prepare()
>> ->->amdgpu_device_evict_resources()
>>
>> Eviction for dGPU only.
>>
>> 3) Suspend callback
>> amdgpu_pmops_freeze()
>> ->amdgpu_device_suspend()
>> ->->amdgpu_device_evict_resources()
>>
>> Eviction for dGPU and APU in S4.
>>
>> So yes it's incongruent that prepare() doesn't include APU S4, but the
>> problem is you won't know at prepare() time whether it's S4 AFAICT.
>>
>> Or is there way to tell at prepare() for S4?
>>
> Now this notifier comes in before that and tells that right?
> 

Oh yeah, it does!

In that case we can set adev-in_s4 during the PM notifier and keep it 
set unless there is a failure during prepare().

> BTW, I didn't realize that there is a max of 3 attempts now (didn't
> notice it being called in suspend also).

Yeah, there are other reasons suspend callback gets called so it's 
better to keep that sequence identical.

> 
> Thanks,
> Lijo
> 
>>>
>>> Thanks,
>>> Lijo
>>>
>>>>>
>>>>> Thanks,
>>>>> Lijo
>>>>>> +    case PM_SUSPEND_PREPARE:
>>>>>> +        r = amdgpu_device_evict_resources(adev);
>>>>>> +        adev->in_s4 = false;
>>>>>> +        /*
>>>>>> +         * This is considered non-fatal at thie time because
>>>>>> +         * amdgpu_device_prepare() will also evict resources.
>>>>>> +         * See https://gitlab.freedesktop.org/drm/amd/-/issues/3781
>>>>>> +         */
>>>>>> +        if (r)
>>>>>> +            drm_warn(adev_to_drm(adev), "Failed to evict resources,
>>>>>> freeze active processes if problems occur\n");
>>>>>> +        break;
>>>>>> +    }
>>>>>> +
>>>>>> +    return NOTIFY_DONE;
>>>>>> +}
>>>>>> +
>>>>>>     /**
>>>>>>      * amdgpu_device_prepare - prepare for device suspend
>>>>>>      *
>>>>>
>>>>
>>>
>>
> 



More information about the amd-gfx mailing list