[PATCH 0/3] Handle aborted suspend better
Chris Bainbridge
chris.bainbridge at gmail.com
Mon Jun 2 12:22:40 UTC 2025
On Sun, Jun 01, 2025 at 08:44:29PM -0500, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello at amd.com>
>
> Chris Bainbridge reported some list corruption occurring around the
> suspend sequence when an aborted suspend occurs.
>
> I couldn't reproduce this specific problem, but when I tried I found
> some other issues where the cached DM state isn't properly destroyed.
>
> This is because there isn't a complete() callback to match the prepare()
> callback used by amdgpu. Normally the PM core will call complete() after
> every suspend attempt (successful or not).
>
> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/4280
>
> Mario Limonciello (3):
>   drm/amd: Add support for a complete pmops action
>   drm/amd/display: Stop storing failures into adev->dm.cached_state
>   drm/amd/display: Destroy cached state in complete() callback
>
> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 22 +++
> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 +-
> .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 125 +++++++++++-------
> drivers/gpu/drm/amd/include/amd_shared.h | 1 +
> 5 files changed, 103 insertions(+), 48 deletions(-)
>
> --
> 2.43.0
>
I tested with 30 suspends and the dm_prepare_suspend /
amdgpu_device_prepare error did not appear. The list corruption error
remains, but that bisects to:
aa7a9275ab81 ("PM: sleep: Suspend async parents after suspending children").
I applied your patch series to the parent of that commit, tested, and
there were no errors. So this issue looks fixed, but the other issue
remains; I have sent a separate email about it:
https://lore.kernel.org/all/aD2U3VIhf8vDkl09@debian.local/
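
For anyone following along, here is a rough user-space sketch of the
prepare()/complete() pairing described in the cover letter. It is only a
toy model, not the amdgpu code: struct fake_dev and the fake_* functions
are hypothetical names made up for illustration, and the "core" here is
simplified to a single function. The point it tries to show is that state
cached in prepare() has to be released in complete(), because complete()
runs whether or not the suspend attempt went through.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a driver that caches state before suspend. */
struct fake_dev {
    int *cached_state; /* allocated in prepare(), released in complete() */
};

/* prepare(): cache some state ahead of the suspend attempt. */
int fake_prepare(struct fake_dev *dev)
{
    dev->cached_state = malloc(sizeof(*dev->cached_state));
    if (!dev->cached_state)
        return -1;
    *dev->cached_state = 42; /* placeholder payload */
    return 0;
}

/* complete(): called after every attempt, so cleanup belongs here. */
void fake_complete(struct fake_dev *dev)
{
    free(dev->cached_state);
    dev->cached_state = NULL;
}

/*
 * Toy model of one suspend attempt.
 * Returns 0 for a full suspend/resume, 1 for an aborted suspend,
 * and -1 if prepare() itself failed.
 */
int fake_suspend_cycle(struct fake_dev *dev, bool abort_suspend)
{
    /* No stale state may survive from a previous attempt. */
    assert(dev->cached_state == NULL);

    if (fake_prepare(dev))
        return -1;

    if (!abort_suspend) {
        /* the suspend and resume callbacks would run here */
    }

    /* complete() runs on both the aborted and the successful path. */
    fake_complete(dev);
    return abort_suspend ? 1 : 0;
}
```

Calling fake_suspend_cycle() twice in a row on a zero-initialized
struct fake_dev, first with abort_suspend set and then without, should
leave cached_state NULL both times. Without the cleanup in
fake_complete(), the second call would trip the assert at the top.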