[PATCH v4 2/2] drm/amd: Add Suspend/Hibernate notification callback support
Mario Limonciello
superm1 at kernel.org
Tue Nov 26 16:09:15 UTC 2024
On 11/26/24 09:37, Lazar, Lijo wrote:
>
>
> On 11/26/2024 7:50 PM, Mario Limonciello wrote:
>>>>> Based on https://gitlab.freedesktop.org/drm/amd/-/issues/3781
>>>>>
>>>>> What if this callback takes care only of suspend case and leaves the
>>>>> hibernate case to dpm_prepare callback?
>>>>
>>>> Then hibernate would fail under memory pressure.
>>>>
>>>> My take is this failure with hibernate is a userspace problem (whether
>>>> userspace decides to freeze the tasks or not). I think it's better that
>>>> we /try/ to do the eviction and notify them if userspace should be
>>>> changed.
>>>>
>>>
>>> Hmm, the logic is kind of inconsistent now.
>>>
>>> For dGPUs, evict is required for s0ix, s3, s4 and attempted twice.
>>> For APUs, evict is required for s4, but attempted only once.
>>
>> Well the logic was inconsistent before already.
>>
>> But FWIW it will still be attempted a second time for APUs.
>> The call sequence is:
>>
>> 1) PM notifier
>> amdgpu_device_pm_notifier()
>> ->amdgpu_device_evict_resources()
>>
>> Eviction for dGPU and APU in S4.
>>
>> 2) Prepare callback
>> amdgpu_pmops_prepare()
>> ->amdgpu_device_prepare()
>> ->->amdgpu_device_evict_resources()
>>
>> Eviction for dGPU only.
>>
>> 3) Suspend callback
>> amdgpu_pmops_freeze()
>> ->amdgpu_device_suspend()
>> ->->amdgpu_device_evict_resources()
>>
>> Eviction for dGPU and APU in S4.
>>
>> So yes it's incongruent that prepare() doesn't include APU S4, but the
>> problem is you won't know at prepare() time whether it's S4 AFAICT.
>>
>> Or is there way to tell at prepare() for S4?
>>
> Now this notifier comes in before that and tells that right?
>
Oh yeah, it does!
In that case we can set adev-in_s4 during the PM notifier and keep it
set unless there is a failure during prepare().
> BTW, I didn't realize that there is a max of 3 attempts now (didn't
> notice it being called in suspend also).
Yeah, there are other reasons suspend callback gets called so it's
better to keep that sequence identical.
>
> Thanks,
> Lijo
>
>>>
>>> Thanks,
>>> Lijo
>>>
>>>>>
>>>>> Thanks,
>>>>> Lijo
>>>>>> + case PM_SUSPEND_PREPARE:
>>>>>> + r = amdgpu_device_evict_resources(adev);
>>>>>> + adev->in_s4 = false;
>>>>>> + /*
>>>>>> + * This is considered non-fatal at thie time because
>>>>>> + * amdgpu_device_prepare() will also evict resources.
>>>>>> + * See https://gitlab.freedesktop.org/drm/amd/-/issues/3781
>>>>>> + */
>>>>>> + if (r)
>>>>>> + drm_warn(adev_to_drm(adev), "Failed to evict resources,
>>>>>> freeze active processes if problems occur\n");
>>>>>> + break;
>>>>>> + }
>>>>>> +
>>>>>> + return NOTIFY_DONE;
>>>>>> +}
>>>>>> +
>>>>>> /**
>>>>>> * amdgpu_device_prepare - prepare for device suspend
>>>>>> *
>>>>>
>>>>
>>>
>>
>
More information about the amd-gfx
mailing list