Regression in amdgpu driver / kernel v5.8.6
Alex Deucher
alexdeucher at gmail.com
Mon Sep 28 21:44:28 UTC 2020
On Mon, Sep 28, 2020 at 9:23 AM Jean Delvare <jdelvare at suse.de> wrote:
>
> Hi all,
>
> I have recently experienced a regression in stable kernel series 5.8.
> The problem is that my display will randomly turn to black after just a
> few seconds of inactivity. I have to switch to text console then back
> to X to recover.
>
> I bisected it down to:
>
> commit b86657e328b601a5b98f8c4c21d108d356dbceee
> Author: Navid Emamdoost <navid.emamdoost at gmail.com>
> Date: Sun Jun 14 02:09:44 2020 -0500
>
> drm/amdgpu: fix ref count leak in amdgpu_display_crtc_set_config
>
> [ Upstream commit e008fa6fb41544b63973a529b704ef342f47cc65 ]
>
> Reverting this commit on top of 5.8.10 makes the problem go away. My
> hardware setup:
>
> 01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Baffin [Radeon RX 550 640SP / RX 560/560X] [1002:67ff] (rev ff)
> 01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Baffin HDMI/DP Audio [Radeon RX 550 640SP / RX 560/560X] [1002:aae0]
>
> This is a Sapphire Pulse Radeon RX550 card, with two Lenovo P27h-10
> displays connected (one over DP, one over HDMI). I'm using option dc=0
> otherwise the multi-screen setup doesn't work.
>
> Looking at the patch, and at the logic of function
> amdgpu_display_crtc_set_config() in general, I suspect that the middle
> chunk of the patch is incorrect. Calling pm_runtime_put_autosuspend()
> if pm_runtime_get_sync() failed is, albeit surprising, correct due to
> how that function is implemented. Calling it right after
> "adev->have_disp_power_ref = true" however looks wrong. The comment
> right before that line pretty much implies that we *want* to keep the
> reference.
>
> So I think we want to apply a partial revert like the following:
Nice analysis. I've applied the patch.
Thanks!
Alex
>
> From: Jean Delvare <jdelvare at suse.de>
> Subject: drm/amdgpu: restore proper ref count in amdgpu_display_crtc_set_config
>
> A recent attempt to fix a ref count leak in
> amdgpu_display_crtc_set_config() turned out to be doing too much and
> "fixed" an intended decrease as if it were a leak. Undo that part to
> restore the proper balance. This is the very nature of this function
> to increase or decrease the power reference count depending on the
> situation.
>
> Consequences of this bug is that the power reference would
> eventually get down to 0 while the display was still in use,
> resulting in that display switching off unexpectedly.
>
> Signed-off-by: Jean Delvare <jdelvare at suse.de>
> Fixes: e008fa6fb415 ("drm/amdgpu: fix ref count leak in amdgpu_display_crtc_set_config")
> Cc: stable at vger.kernel.org
> Cc: Navid Emamdoost <navid.emamdoost at gmail.com>
> Cc: Alex Deucher <alexander.deucher at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- linux.orig/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 2020-09-28 10:54:12.634245251 +0200
> +++ linux/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 2020-09-28 10:55:40.569906840 +0200
> @@ -297,7 +297,7 @@ int amdgpu_display_crtc_set_config(struc
> take the current one */
> if (active && !adev->have_disp_power_ref) {
> adev->have_disp_power_ref = true;
> - goto out;
> + return ret;
> }
> /* if we have no active crtcs, then drop the power ref
> we got before */
>
>
> --
> Jean Delvare
> SUSE L3 Support
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
More information about the amd-gfx
mailing list