amdgpu freezes kernel after kernel 5.7.6 changes
roland.rucky at gmail.com
roland.rucky at gmail.com
Sun Jun 28 10:27:35 UTC 2020
I narrowed it down even more. When reverting
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b5232e2ee8df85891514c73472cac09921e5d51d everything is stable. I have an uptime of 4+ hours while actively
trying to reproduce the bug (gaming).
Here is a bug report:
https://gitlab.freedesktop.org/drm/amd/-/issues/1191
Added Nicholas Kazlauskas, author of the mentioned commit.
Am Samstag, den 27.06.2020, 18:03 +0200 schrieb roland.rucky at gmail.com:
> Not sure if I am contacting the correct person,
>
> Since I updated to kernel 5.7.6, my system started to freeze
> randomly.
> After a couple of freezes, I noticed, that they always happen when
> playing games, or during videoplayback in e.g. firefox.
>
> I reverted to the previous kernel 5.7.5, and all issues are gone.
> Next
> I started to revert and test single commits between the two kernel
> versions, which affect amdgpu. If I revert the changes listed below,
> the kernel does not freeze any more.
>
> Sadly I can`t get any crash reports / logs. Even the magic sysrq key
> does not work, when the system is frozen.
>
> I will also attach a patch, which includes all reverted commits.
>
>
> List of changes I reverted:
> -------------------------------------------------------------------
> ----
>
> commit 6674508ba1a2ea6caca5de2bcb25bc00a050fd0a
> Author: Harry Wentland <harry.wentland at amd.com>
> Date: Thu May 28 09:44:44 2020 -0400
>
> Revert "drm/amd/display: disable dcn20 abm feature for bring up"
>
> commit 14ed1c908a7a623cc0cbf0203f8201d1b7d31d16 upstream.
>
> This reverts commit 96cb7cf13d8530099c256c053648ad576588c387.
>
> This change was used for DCN2 bringup and is no longer desired.
> In fact it breaks backlight on DCN2 systems.
>
> Cc: Alexander Monakov <amonakov at ispras.ru>
> Cc: Hersen Wu <hersenxs.wu at amd.com>
> Cc: Anthony Koo <Anthony.Koo at amd.com>
> Cc: Michael Chiu <Michael.Chiu at amd.com>
> Signed-off-by: Harry Wentland <harry.wentland at amd.com>
> Acked-by: Alex Deucher <alexander.deucher at amd.com>
> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas at amd.com>
> Reported-and-tested-by: Alexander Monakov <amonakov at ispras.ru>
> Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
> Cc: stable at vger.kernel.org
> Signed-off-by: Greg Kroah-Hartman <gregkh at linuxfoundation.org>
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 7fc15b82fe48..f9f02e08054b 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -1334,7 +1334,7 @@ static int dm_late_init(void *handle)
> unsigned int linear_lut[16];
> int i;
> struct dmcu *dmcu = adev->dm.dc->res_pool->dmcu;
> - bool ret = false;
> + bool ret;
>
> for (i = 0; i < 16; i++)
> linear_lut[i] = 0xFFFF * i / 15;
> @@ -1350,13 +1350,10 @@ static int dm_late_init(void *handle)
> */
> params.min_abm_backlight = 0x28F;
>
> - /* todo will enable for navi10 */
> - if (adev->asic_type <= CHIP_RAVEN) {
> - ret = dmcu_load_iram(dmcu, params);
> + ret = dmcu_load_iram(dmcu, params);
>
> - if (!ret)
> - return -EINVAL;
> - }
> + if (!ret)
> + return -EINVAL;
>
> return detect_mst_link_for_all_connectors(adev->ddev);
> }
>
> commit fba8f9ef7e1405ee6f422beb874791e8a5eb489c
> Author: Alex Deucher <alexander.deucher at amd.com>
> Date: Tue Jun 2 17:22:48 2020 -0400
>
> drm/amdgpu/display: use blanked rather than plane state for sync
> groups
>
> commit b7f839d292948142eaab77cedd031aad0bfec872 upstream.
>
> We may end up with no planes set yet, depending on the ordering,
> but we
> should have the proper blanking state which is either handled by
> either
> DPG or TG depending on the hardware generation. Check both to
> determine
> the proper blanked state.
>
> Bug: https://gitlab.freedesktop.org/drm/amd/issues/781
> Fixes: 5fc0cbfad45648 ("drm/amd/display: determine if a pipe is
> synced by plane state")
> Cc: nicholas.kazlauskas at amd.com
> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas at amd.com>
> Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
> Cc: stable at vger.kernel.org
> Signed-off-by: Greg Kroah-Hartman <gregkh at linuxfoundation.org>
>
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c
> b/drivers/gpu/drm/amd/display/dc/core/dc.c
> index 4a619328101c..4acaf4be8a81 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
> @@ -1011,9 +1011,17 @@ static void program_timing_sync(
> }
> }
>
> - /* set first pipe with plane as master */
> + /* set first unblanked pipe as master */
> for (j = 0; j < group_size; j++) {
> - if (pipe_set[j]->plane_state) {
> + bool is_blanked;
> +
> + if (pipe_set[j]->stream_res.opp->funcs-
> > dpg_is_blanked)
> + is_blanked =
> + pipe_set[j]->stream_res.opp-
> > funcs->dpg_is_blanked(pipe_set[j]->stream_res.opp);
> + else
> + is_blanked =
> + pipe_set[j]->stream_res.tg-
> > funcs->is_blanked(pipe_set[j]->stream_res.tg);
> + if (!is_blanked) {
> if (j == 0)
> break;
>
> @@ -1034,9 +1042,17 @@ static void program_timing_sync(
> status->timing_sync_info.master =
> false;
>
> }
> - /* remove any other pipes with plane as they have
> already been synced */
> + /* remove any other unblanked pipes as they have
> already been synced */
> for (j = j + 1; j < group_size; j++) {
> - if (pipe_set[j]->plane_state) {
> + bool is_blanked;
> +
> + if (pipe_set[j]->stream_res.opp->funcs-
> > dpg_is_blanked)
> + is_blanked =
> + pipe_set[j]->stream_res.opp-
> > funcs->dpg_is_blanked(pipe_set[j]->stream_res.opp);
> + else
> + is_blanked =
> + pipe_set[j]->stream_res.tg-
> > funcs->is_blanked(pipe_set[j]->stream_res.tg);
> + if (!is_blanked) {
> group_size--;
> pipe_set[j] = pipe_set[group_size];
> j--;
>
> commit b5232e2ee8df85891514c73472cac09921e5d51d
> Author: Nicholas Kazlauskas <nicholas.kazlauskas at amd.com>
> Date: Tue Jun 2 20:42:33 2020 -0400
>
> drm/amd/display: Revalidate bandwidth before commiting DC updates
>
> [ Upstream commit a24eaa5c51255b344d5a321f1eeb3205f2775498 ]
>
> [Why]
> Whenever we switch between tiled formats without also switching
> pixel
> formats or doing anything else that recreates the DC plane state
> we
> can run into underflow or hangs since we're not updating the
> DML parameters before committing to the hardware.
>
> [How]
> If the update type is FULL then call validate_bandwidth again to
> update
> the DML parmeters before committing the state.
>
> This is basically just a workaround and protective measure
> against
> update types being added DC where we could run into this issue in
> the future.
>
> We can only fully validate the state in advance before applying
> it
> to
> the hardware if we recreate all the plane and stream states since
> we can't modify what's currently in use.
>
> The next step is to update DM to ensure that we're creating the
> plane
> and stream states for whatever could potentially be a full update
> in
> DC to pre-emptively recreate the state for DC global validation.
>
> The workaround can stay until this has been fixed in DM.
>
> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas at amd.com>
> Reviewed-by: Hersen Wu <hersenxs.wu at amd.com>
> Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
> Signed-off-by: Sasha Levin <sashal at kernel.org>
>
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c
> b/drivers/gpu/drm/amd/display/dc/core/dc.c
> index 47431ca6986d..4a619328101c 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
> @@ -2517,6 +2517,12 @@ void dc_commit_updates_for_stream(struct dc
> *dc,
>
> copy_stream_update_to_stream(dc, context, stream,
> stream_update);
>
> + if (!dc->res_pool->funcs->validate_bandwidth(dc, context,
> false)) {
> + DC_ERROR("Mode validation failed for stream
> update!\n");
> + dc_release_state(context);
> + return;
> + }
> +
> commit_planes_for_stream(
> dc,
> srf_updates,
>
> Details:
>
> * Kernel: 5.7.6
> * GPU: radeon 5700XT
> * CPU: ryzen 3800X
> * running on swaywm(wayland)
More information about the dri-devel
mailing list