[PATCH] drm/amd/amdgpu: Revert "drm/amd/amdgpu: shorten the gfx idle worker timeout"
Feng, Kenneth
Kenneth.Feng at amd.com
Thu Mar 20 08:29:03 UTC 2025
Hi Alex,
The call trace below is generated when gdm is launched.
I also tried running on a standalone workqueue, but I still see the workqueue being flushed.
Thanks.
[ 21.558439] ------------[ cut here ]------------
[ 21.558443] workqueue: WQ_MEM_RECLAIM gfx_0.0.0:drm_sched_run_job_work [amd_sched] is flushing !WQ_MEM_RECLAIM events:amdgpu_gfx_profile_idle_work_handler [amdgpu]
[ 21.558716] WARNING: CPU: 0 PID: 115 at kernel/workqueue.c:3706 check_flush_dependency+0x151/0x180
[ 21.558724] Modules linked in: snd_seq_dummy snd_hrtimer qrtr sunrpc amd_atl intel_rapl_msr intel_rapl_common snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg edac_mce_amd snd_intel_sdw_acpi snd_usb_audio snd_hda_codec kvm_amd snd_usbmidi_lib snd_hda_core snd_ump mc snd_hwdep snd_pcm kvm snd_seq_midi snd_seq_midi_event crct10dif_pclmul snd_rawmidi polyval_clmulni polyval_generic ghash_clmulni_intel spd5118 sha256_ssse3 sha1_ssse3 snd_seq aesni_intel crypto_simd cryptd snd_seq_device snd_timer rapl wmi_bmof ccp snd i2c_piix4 k10temp i2c_smbus soundcore input_leds joydev gpio_amdpt mac_hid binfmt_misc sch_fq_codel msr parport_pc ppdev lp parport efi_pstore nfnetlink dmi_sysfs ip_tables x_tables autofs4 hid_generic usbhid hid amdgpu(OE) amddrm_ttm_helper(OE) amdttm(OE) amddrm_buddy(OE) amdxcp(OE) drm_exec drm_suballoc_helper amd_sched(OE) amdkcl(OE) drm_display_helper cec rc_core nvme i2c_algo_bit drm_ttm_helper crc32_pclmul r8169 xhci_pci nvme_core ahci ttm xhci_pci_renesas libahci realtek nvme_auth video wmi
[ 21.558817] CPU: 0 UID: 0 PID: 115 Comm: kworker/u64:1 Tainted: G OE 6.11.0-17-generic #17~24.04.2-Ubuntu
[ 21.558822] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[ 21.558823] Hardware name: Micro-Star International Co., Ltd. MS-7D76/MAG B650M MORTAR WIFI (MS-7D76), BIOS A.J0 12/17/2024
[ 21.558825] Workqueue: gfx_0.0.0 drm_sched_run_job_work [amd_sched]
[ 21.558830] RIP: 0010:check_flush_dependency+0x151/0x180
[ 21.558833] Code: 56 18 4d 89 e0 48 8d 8b c0 00 00 00 48 c7 c7 e8 88 09 a1 c6 05 e8 4d 8d 02 01 48 8b 70 08 48 81 c6 c0 00 00 00 e8 6f 54 fd ff <0f> 0b e9 d2 fe ff ff 44 0f b6 3d ca 4d 8d 02 41 80 ff 01 77 0f 41
[ 21.558836] RSP: 0018:ffffae930051fbe8 EFLAGS: 00010046
[ 21.558838] RAX: 0000000000000000 RBX: ffff9abf80201400 RCX: 0000000000000000
[ 21.558840] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 21.558842] RBP: ffffae930051fc10 R08: 0000000000000000 R09: 0000000000000000
[ 21.558843] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffc0992ad0
[ 21.558844] R13: 0000000000000000 R14: ffff9abf8030d440 R15: ffffae930051fc40
[ 21.558846] FS: 0000000000000000(0000) GS:ffff9ace9d800000(0000) knlGS:0000000000000000
[ 21.558848] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 21.558850] CR2: 0000073bf2b6c000 CR3: 000000004623e000 CR4: 0000000000f50ef0
[ 21.558852] PKRU: 55555554
[ 21.558853] Call Trace:
[ 21.558855] <TASK>
[ 21.558859] ? show_regs+0x6c/0x80
[ 21.558864] ? __warn+0x88/0x140
[ 21.558867] ? check_flush_dependency+0x151/0x180
[ 21.558870] ? report_bug+0x182/0x1b0
[ 21.558875] ? handle_bug+0x6e/0xb0
[ 21.558880] ? exc_invalid_op+0x18/0x80
[ 21.558883] ? asm_exc_invalid_op+0x1b/0x20
[ 21.558888] ? __pfx_amdgpu_gfx_profile_idle_work_handler+0x10/0x10 [amdgpu]
[ 21.559113] ? check_flush_dependency+0x151/0x180
[ 21.559116] ? check_flush_dependency+0x151/0x180
[ 21.559120] __flush_work+0x238/0x310
[ 21.559124] ? __mod_timer+0x122/0x340
[ 21.559129] cancel_delayed_work_sync+0x76/0x80
[ 21.559133] amdgpu_gfx_profile_ring_begin_use+0x34/0xa0 [amdgpu]
[ 21.559341] gfx_v12_0_ring_begin_use+0x12/0x30 [amdgpu]
[ 21.559531] amdgpu_ring_alloc+0x40/0x70 [amdgpu]
[ 21.559675] amdgpu_ib_schedule+0x172/0x830 [amdgpu]
[ 21.559821] amdgpu_job_run+0x8d/0x200 [amdgpu]
[ 21.559994] drm_sched_run_job_work+0x2bb/0x450 [amd_sched]
[ 21.559997] process_one_work+0x178/0x3d0
[ 21.560000] worker_thread+0x2de/0x410
[ 21.560002] ? __pfx_worker_thread+0x10/0x10
[ 21.560004] kthread+0xe1/0x110
[ 21.560006] ? __pfx_kthread+0x10/0x10
[ 21.560008] ret_from_fork+0x44/0x70
[ 21.560010] ? __pfx_kthread+0x10/0x10
[ 21.560012] ret_from_fork_asm+0x1a/0x30
[ 21.560017] </TASK>
[ 21.560017] ---[ end trace 0000000000000000 ]---
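For readers unfamiliar with this warning, the following is a minimal userspace sketch of the condition that check_flush_dependency() appears to test, based on the message text in the trace above. It is a simplified model, not the kernel code: the struct, the flag values, and the function name are hypothetical, and the separate PF_MEMALLOC check is omitted. It shows why a worker on gfx_0.0.0 (WQ_MEM_RECLAIM) that waits on work queued on "events" (no WQ_MEM_RECLAIM) is flagged.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bits only; the real values are defined in
 * include/linux/workqueue.h and differ from these. */
#define MODEL_WQ_MEM_RECLAIM (1u << 0)
#define MODEL_WQ_LEGACY      (1u << 1)

/* Hypothetical stand-in for the parts of a workqueue the check looks at. */
struct model_wq {
    const char *name;
    unsigned int flags;
};

/*
 * Returns true when the modelled check would warn.
 *
 * current_wq: workqueue whose worker is doing the flush, or NULL if the
 *             caller is not a workqueue worker.
 * target_wq:  workqueue that owns the work being flushed or cancelled.
 */
static bool model_flush_would_warn(const struct model_wq *current_wq,
                                   const struct model_wq *target_wq)
{
    /* A target with WQ_MEM_RECLAIM has a rescuer, so waiting on it is fine. */
    if (target_wq->flags & MODEL_WQ_MEM_RECLAIM)
        return false;

    /* Callers outside workqueue context are not covered by this model. */
    if (current_wq == NULL)
        return false;

    /* Warn only if the flusher is reclaim-capable and not a legacy queue. */
    return (current_wq->flags & (MODEL_WQ_MEM_RECLAIM | MODEL_WQ_LEGACY))
           == MODEL_WQ_MEM_RECLAIM;
}
```

Under this model, the pair from the trace (flusher gfx_0.0.0 with the reclaim flag, target "events" without it) is reported, while flushing in the opposite direction or between two reclaim-capable queues is not.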
-----Original Message-----
From: Alex Deucher <alexdeucher at gmail.com>
Sent: Wednesday, March 19, 2025 8:54 PM
To: Feng, Kenneth <Kenneth.Feng at amd.com>
Cc: amd-gfx at lists.freedesktop.org; Wang, Yang(Kevin) <KevinYang.Wang at amd.com>
Subject: Re: [PATCH] drm/amd/amdgpu: Revert "drm/amd/amdgpu: shorten the gfx idle worker timeout"
On Wed, Mar 19, 2025 at 2:38 AM Kenneth Feng <kenneth.feng at amd.com> wrote:
>
> This reverts commit b00fb9765ea4b05198d67256118445c6f13f9ddf.
>
> Reason for revert: this causes some tests to fail with a call trace.
Do you have a copy of the call trace? I can't see how this would be an issue.
Alex
>
> Signed-off-by: Kenneth Feng <kenneth.feng at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> index a6d3a4554caa..75af4f25a133 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> @@ -57,8 +57,8 @@ enum amdgpu_gfx_pipe_priority {
> #define AMDGPU_GFX_QUEUE_PRIORITY_MINIMUM 0
> #define AMDGPU_GFX_QUEUE_PRIORITY_MAXIMUM 15
>
> -/* 10 millisecond timeout */
> -#define GFX_PROFILE_IDLE_TIMEOUT msecs_to_jiffies(10)
> +/* 1 second timeout */
> +#define GFX_PROFILE_IDLE_TIMEOUT msecs_to_jiffies(1000)
>
> enum amdgpu_gfx_partition {
> AMDGPU_SPX_PARTITION_MODE = 0,
> --
> 2.34.1
>