Regression in amdgpu driver / kernel v5.8.6

Alex Deucher alexdeucher at gmail.com
Mon Sep 28 21:44:28 UTC 2020


On Mon, Sep 28, 2020 at 9:23 AM Jean Delvare <jdelvare at suse.de> wrote:
>
> Hi all,
>
> I have recently experienced a regression in stable kernel series 5.8.
> The problem is that my display will randomly turn to black after just a
> few seconds of inactivity. I have to switch to text console then back
> to X to recover.
>
> I bisected it down to:
>
> commit b86657e328b601a5b98f8c4c21d108d356dbceee
> Author: Navid Emamdoost <navid.emamdoost at gmail.com>
> Date:   Sun Jun 14 02:09:44 2020 -0500
>
>     drm/amdgpu: fix ref count leak in amdgpu_display_crtc_set_config
>
>     [ Upstream commit e008fa6fb41544b63973a529b704ef342f47cc65 ]
>
> Reverting this commit on top of 5.8.10 makes the problem go away. My
> hardware setup:
>
> 01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Baffin [Radeon RX 550 640SP / RX 560/560X] [1002:67ff] (rev ff)
> 01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Baffin HDMI/DP Audio [Radeon RX 550 640SP / RX 560/560X] [1002:aae0]
>
> This is a Sapphire Pulse Radeon RX550 card, with two Lenovo P27h-10
> displays connected (one over DP, one over HDMI). I'm using option dc=0
> otherwise the multi-screen setup doesn't work.
>
> Looking at the patch, and at the logic of function
> amdgpu_display_crtc_set_config() in general, I suspect that the middle
> chunk of the patch is incorrect. Calling pm_runtime_put_autosuspend()
> if pm_runtime_get_sync() failed is, albeit surprising, correct due to
> how that function is implemented. Calling it right after
> "adev->have_disp_power_ref = true" however looks wrong. The comment
> right before that line pretty much implies that we *want* to keep the
> reference.
>
> So I think we want to apply a partial revert like the following:

Nice analysis.  I've applied the patch.

Thanks!

Alex


>
> From: Jean Delvare <jdelvare at suse.de>
> Subject: drm/amdgpu: restore proper ref count in amdgpu_display_crtc_set_config
>
> A recent attempt to fix a ref count leak in
> amdgpu_display_crtc_set_config() turned out to be doing too much and
> "fixed" an intended decrease as if it were a leak. Undo that part to
> restore the proper balance. This is the very nature of this function
> to increase or decrease the power reference count depending on the
> situation.
>
> Consequences of this bug is that the power reference would
> eventually get down to 0 while the display was still in use,
> resulting in that display switching off unexpectedly.
>
> Signed-off-by: Jean Delvare <jdelvare at suse.de>
> Fixes: e008fa6fb415 ("drm/amdgpu: fix ref count leak in amdgpu_display_crtc_set_config")
> Cc: stable at vger.kernel.org
> Cc: Navid Emamdoost <navid.emamdoost at gmail.com>
> Cc: Alex Deucher <alexander.deucher at amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- linux.orig/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c      2020-09-28 10:54:12.634245251 +0200
> +++ linux/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   2020-09-28 10:55:40.569906840 +0200
> @@ -297,7 +297,7 @@ int amdgpu_display_crtc_set_config(struc
>            take the current one */
>         if (active && !adev->have_disp_power_ref) {
>                 adev->have_disp_power_ref = true;
> -               goto out;
> +               return ret;
>         }
>         /* if we have no active crtcs, then drop the power ref
>            we got before */
>
>
> --
> Jean Delvare
> SUSE L3 Support
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


More information about the amd-gfx mailing list