[PATCH 0/3] Handle aborted suspend better

Chris Bainbridge chris.bainbridge at gmail.com
Mon Jun 2 12:22:40 UTC 2025


On Sun, Jun 01, 2025 at 08:44:29PM -0500, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello at amd.com>
> 
> Chris Bainbridge reported some list corruption occurring around the
> suspend sequence when an aborted suspend occurs.
> 
> I couldn't reproduce this specific problem, but when I tried I found
> some other issues where the cached DM state isn't properly destroyed.
> 
> This is because there isn't a complete() callback to match the prepare()
> callback used by amdgpu. Normally the PM core will call complete() after
> every suspend attempt (successful or not).
> 
> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/4280
> 
> Mario Limonciello (3):
>   drm/amd: Add support for a complete pmops action
>   drm/amd/display: Stop storing failures into adev->dm.cached_state
>   drm/amd/display: Destroy cached state in complete() callback
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h           |   1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  22 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c       |   2 +-
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 125 +++++++++++-------
>  drivers/gpu/drm/amd/include/amd_shared.h      |   1 +
>  5 files changed, 103 insertions(+), 48 deletions(-)
> 
> -- 
> 2.43.0
> 

I tested with 30 suspends and the dm_prepare_suspend /
amdgpu_device_prepare error did not appear. The list corruption error
remains, but that bisects to:

aa7a9275ab81 ("PM: sleep: Suspend async parents after suspending children").

I applied your patch series to the parent of that commit, tested, and
there were no errors. So this issue looks fixed, but the other issue
remains. I have reported it separately:
https://lore.kernel.org/all/aD2U3VIhf8vDkl09@debian.local/
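
For anyone following along, here is a rough user-space sketch of the
prepare()/complete() pairing the cover letter describes: state cached in
prepare() is released in complete(), which runs after every suspend
attempt, aborted or not. This is not the amdgpu or PM core code. All
names here (fake_dev, fake_prepare, fake_complete, fake_suspend_attempt)
are hypothetical, and -16 merely stands in for an -EBUSY style abort.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a driver's device; not the amdgpu structure. */
struct fake_dev {
	int *cached_state;	/* saved in prepare(), must be freed later */
};

/* prepare(): cache some state before the suspend attempt. */
static int fake_prepare(struct fake_dev *dev)
{
	dev->cached_state = malloc(sizeof(*dev->cached_state));
	if (!dev->cached_state)
		return -1;
	*dev->cached_state = 42;
	return 0;
}

/* complete(): undo what prepare() did, whatever the suspend outcome. */
static void fake_complete(struct fake_dev *dev)
{
	free(dev->cached_state);
	dev->cached_state = NULL;
}

/*
 * Simulated core: a non-zero abort_suspend models a suspend that is
 * aborted after prepare() succeeded. complete() still runs in that case.
 */
static int fake_suspend_attempt(struct fake_dev *dev, int abort_suspend)
{
	int ret = fake_prepare(dev);

	if (ret)
		return ret;	/* nothing was cached, nothing to undo */

	ret = abort_suspend ? -16 : 0;	/* assumed -EBUSY style error */
	fake_complete(dev);
	return ret;
}
```

Calling fake_suspend_attempt(&dev, 1) on a zero-initialised struct
fake_dev returns the error and leaves dev.cached_state NULL, which is
the property the series is after: an aborted suspend leaves no stale
cached state behind for the next attempt.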

