[PATCH] amdgpu: disable amdgpu_dpm on THTF-SW831-1W-DS25_MB board
Mario Limonciello
mario.limonciello at amd.com
Wed Aug 28 15:19:48 UTC 2024
On 8/28/2024 05:59, WangYuli wrote:
> From: wenlunpeng <wenlunpeng at uniontech.com>
>
> The quirk is for reboot-stability.
>
> A device reboot stress test has been observed to cause
> random system hangs when amdgpu_dpm is enabled.
>
> Disabling amdgpu_dpm can fix this.
>
> However, a boot-param can still overwrite it to enable
> amdgpu_dpm.
>
> Serial log when error occurs:
> ...
> Console: switching to colour frame buffer device 160x45
> amdgpu 0000:01:00.0: fb0: amdgpudrmfb frame buffer device
> [drm:amdgpu_device_ip_late_init] *ERROR* late_init of IP block <si_dpm> failed -22
> amdgpu 0000:01:00.0: amdgpu_device_ip_late_init failed
> amdgpu 0000:01:00.0: Fatal error during GPU init
> [drm] amdgpu: finishing device.
> Console: switching to colour dummy device 80x25
> ...
Is this production hardware?
Have you already checked whether a BIOS upgrade for the device
helps with this issue?
>
> Signed-off-by: wenlunpeng <wenlunpeng at uniontech.com>
> Signed-off-by: WangYuli <wangyuli at uniontech.com>
Just to clarify, did you two co-develop this patch, or are you
submitting it on behalf of wenlunpeng? Right now it shows up as you
submitting on behalf of wenlunpeng. If you co-developed it, you should
also use a Co-developed-by tag.
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 23 +++++++++++++++++++++++
> 1 file changed, 23 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 094498a0964b..81716fcac7cd 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -32,6 +32,7 @@
> #include <drm/drm_vblank.h>
>
> #include <linux/cc_platform.h>
> +#include <linux/dmi.h>
> #include <linux/dynamic_debug.h>
> #include <linux/module.h>
> #include <linux/mmu_notifier.h>
> @@ -3023,10 +3024,32 @@ static struct pci_driver amdgpu_kms_pci_driver = {
> .dev_groups = amdgpu_sysfs_groups,
> };
>
> +static int quirk_set_amdgpu_dpm_0(const struct dmi_system_id *dmi)
> +{
> + amdgpu_dpm = 0;
> + pr_info("Identified '%s', set amdgpu_dpm to 0.\n", dmi->ident);
> + return 1;
> +}
> +
> +static const struct dmi_system_id amdgpu_quirklist[] = {
> + {
> + .ident = "DS25 Desktop",
> + .matches = {
> + DMI_MATCH(DMI_BOARD_NAME, "THTF-SW831-1W-DS25_MB"),
As this is suspected to be a BIOS issue, I would like to better
understand whether a BIOS upgrade fixes it. If it does, but you would
still like a quirk for the system, the match should include the BIOS
version here.
> + },
> + .callback = quirk_set_amdgpu_dpm_0,
> + },
> + {}
> +};
> +
> static int __init amdgpu_init(void)
> {
> int r;
>
> + /* quirks for some hardware, applied only when it's untouched */
> + if (amdgpu_dpm == -1)
> + dmi_check_system(amdgpu_quirklist);
> +
> if (drm_firmware_drivers_only())
> return -EINVAL;
>
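To illustrate the BIOS-version suggestion above: the kernel's DMI_MATCH()
does a substring comparison per field, and an entry matches only when all
of its fields match, so adding a DMI_BIOS_VERSION match would limit the
quirk to the affected firmware. Below is a minimal userspace sketch of
that logic, not kernel code. The struct names, the helper functions and
the "PLACEHOLDER-1.0" version string are all hypothetical; the real BIOS
version is not given in this thread.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-ins for the kernel's DMI fields. */
enum field { FIELD_BOARD_NAME, FIELD_BIOS_VERSION };

struct sysinfo {
	const char *board_name;
	const char *bios_version;
};

struct match {
	enum field slot;
	const char *substr;
};

struct quirk {
	const char *ident;
	struct match matches[2];
	size_t nmatches;
};

static const char *field_value(const struct sysinfo *s, enum field f)
{
	return f == FIELD_BOARD_NAME ? s->board_name : s->bios_version;
}

/* Every listed field must contain its substring, as with DMI_MATCH(). */
static int quirk_applies(const struct quirk *q, const struct sysinfo *s)
{
	for (size_t i = 0; i < q->nmatches; i++) {
		const char *val = field_value(s, q->matches[i].slot);

		if (!val || !strstr(val, q->matches[i].substr))
			return 0;
	}
	return 1;
}

static const struct quirk ds25_quirk = {
	.ident = "DS25 Desktop",
	.matches = {
		{ FIELD_BOARD_NAME, "THTF-SW831-1W-DS25_MB" },
		/* Placeholder: the affected BIOS version is an assumption. */
		{ FIELD_BIOS_VERSION, "PLACEHOLDER-1.0" },
	},
	.nmatches = 2,
};
```

With an entry like this, a board that reports a different BIOS version
string would no longer hit the quirk and would keep the default
amdgpu_dpm setting.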