[PATCH v4 2/2] drm/amd: Add Suspend/Hibernate notification callback support

Tue Nov 26 12:26:32 UTC 2024

On 11/25/2024 9:46 PM, Mario Limonciello wrote:
> 
> 
> On 11/25/24 08:59, Lazar, Lijo wrote:
>>
>>
>> On 11/25/2024 12:30 AM, Mario Limonciello wrote:
>>> From: Mario Limonciello <mario.limonciello at amd.com>
>>>
>>> As part of the suspend sequence VRAM needs to be evicted on dGPUs.
>>> In order to make suspend/resume more reliable we moved this into
>>> the pmops prepare() callback so that the suspend sequence would fail
>>> but the system could remain operational under high memory usage suspend.
>>>
>>> Another class of issues exist though where due to memory fragementation
>>> there isn't a large enough contiguous space and swap isn't accessible.
>>>
>>> Add support for a suspend/hibernate notification callback that could
>>> evict VRAM before tasks are frozen. This should allow paging out to swap
>>> if necessary.
>>>
>>> Link: https://github.com/ROCm/ROCK-Kernel-Driver/issues/174
>>> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3476
>>> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2362
>>> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3781
>>> Signed-off-by: Mario Limonciello <mario.limonciello at amd.com>
>>> ---
>>> v4:
>>>   * Make non fatal, drop patch 3
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 45 ++++++++++++++++++++++
>>>   2 files changed, 46 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/
>>> amd/amdgpu/amdgpu.h
>>> index d8bc6da500161..79ec4ab8ecfc5 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> @@ -879,6 +879,7 @@ struct amdgpu_device {
>>>       bool                need_swiotlb;
>>>       bool                accel_working;
>>>       struct notifier_block        acpi_nb;
>>> +    struct notifier_block        pm_nb;
>>>       struct amdgpu_i2c_chan        *i2c_bus[AMDGPU_MAX_I2C_BUS];
>>>       struct debugfs_blob_wrapper     debugfs_vbios_blob;
>>>       struct debugfs_blob_wrapper     debugfs_discovery_blob;
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/
>>> gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index 996e9c78384dd..56510ab4b6650 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -191,6 +191,8 @@ void amdgpu_set_init_level(struct amdgpu_device
>>> *adev,
>>>   }
>>>     static inline void amdgpu_device_stop_pending_resets(struct
>>> amdgpu_device *adev);
>>> +static int amdgpu_device_pm_notifier(struct notifier_block *nb,
>>> unsigned long mode,
>>> +                     void *data);
>>>     /**
>>>    * DOC: pcie_replay_count
>>> @@ -4553,6 +4555,11 @@ int amdgpu_device_init(struct amdgpu_device
>>> *adev,
>>>         amdgpu_device_check_iommu_direct_map(adev);
>>>   +    adev->pm_nb.notifier_call = amdgpu_device_pm_notifier;
>>> +    r = register_pm_notifier(&adev->pm_nb);
>>> +    if (r)
>>> +        goto failed;
>>> +
>>>       return 0;
>>>     release_ras_con:
>>> @@ -4617,6 +4624,8 @@ void amdgpu_device_fini_hw(struct amdgpu_device
>>> *adev)
>>>           drain_workqueue(adev->mman.bdev.wq);
>>>       adev->shutdown = true;
>>>   +    unregister_pm_notifier(&adev->pm_nb);
>>> +
>>>       /* make sure IB test finished before entering exclusive mode
>>>        * to avoid preemption on IB test
>>>        */
>>> @@ -4748,6 +4757,42 @@ static int
>>> amdgpu_device_evict_resources(struct amdgpu_device *adev)
>>>   /*
>>>    * Suspend & resume.
>>>    */
>>> +/**
>>> + * amdgpu_device_pm_notifier - Notification block for Suspend/
>>> Hibernate events
>>> + * @nb: notifier block
>>> + * @mode: suspend mode
>>> + * @data: data
>>> + *
>>> + * This function is called when the system is about to suspend or
>>> hibernate.
>>> + * It is used to evict resources from the device before the system
>>> goes to
>>> + * sleep while there is still access to swap.
>>> + */
>>> +static int amdgpu_device_pm_notifier(struct notifier_block *nb,
>>> unsigned long mode,
>>> +                     void *data)
>>> +{
>>> +    struct amdgpu_device *adev = container_of(nb, struct
>>> amdgpu_device, pm_nb);
>>> +    int r;
>>> +
>>> +    switch (mode) {
>>> +    case PM_HIBERNATION_PREPARE:
>>> +        adev->in_s4 = true;
>>> +        fallthrough;
>>
>> Based on https://gitlab.freedesktop.org/drm/amd/-/issues/3781
>>
>> What if this callback takes care only of suspend case and leaves the
>> hibernate case to dpm_prepare callback?
> 
> Then hibernate would fail under memory pressure.
> 
> My take is this failure with hibernate is a userspace problem (whether
> userspace decides to freeze the tasks or not).  I think it's better that
> we /try/ to do the eviction and notify them if userspace should be changed.
> 

Hmm, the logic is kind of inconsistent now.

For dGPUs, evict is required for s0ix, s3, s4 and attempted twice.
For APUs, evict is required for s4, but attempted only once.

Thanks,
Lijo

>>
>> Thanks,
>> Lijo
>>> +    case PM_SUSPEND_PREPARE:
>>> +        r = amdgpu_device_evict_resources(adev);
>>> +        adev->in_s4 = false;
>>> +        /*
>>> +         * This is considered non-fatal at thie time because
>>> +         * amdgpu_device_prepare() will also evict resources.
>>> +         * See https://gitlab.freedesktop.org/drm/amd/-/issues/3781
>>> +         */
>>> +        if (r)
>>> +            drm_warn(adev_to_drm(adev), "Failed to evict resources,
>>> freeze active processes if problems occur\n");
>>> +        break;
>>> +    }
>>> +
>>> +    return NOTIFY_DONE;
>>> +}
>>> +
>>>   /**
>>>    * amdgpu_device_prepare - prepare for device suspend
>>>    *
>>
>