[Bug 217141] New: [amdgpu] ring gfx_0.0.0 timeout steam deck AMD APU
bugzilla-daemon at kernel.org
Sun Mar 5 15:32:25 UTC 2023
https://bugzilla.kernel.org/show_bug.cgi?id=217141
Bug ID: 217141
Summary: [amdgpu] ring gfx_0.0.0 timeout steam deck AMD APU
Product: Drivers
Version: 2.5
Kernel Version: 6.1.12
Hardware: AMD
OS: Linux
Tree: Mainline
Status: NEW
Severity: high
Priority: P1
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri at kernel-bugs.osdl.org
Reporter: serg at podtynnyi.com
Regression: No
[ 257.182206] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0
timeout, signaled seq=26043, emitted[64/36172]
[ 257.182668] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information:
process NMS.exe pid 2571 thread NMS.exe
pid 2571
[ 257.183084] amdgpu 0000:04:00.0: amdgpu: GPU reset begin!
[ 257.183094] ------------[ cut here ]------------
[ 257.183095] Evicting all processes
[ 257.183151] WARNING: CPU: 6 PID: 745 at
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c:1935
kfd_suspend_all_processes+0x100/0x110 [amdgpu]
[ 257.183562] Modules linked in: uinput snd_seq_dummy snd_hrtimer snd_seq
snd_seq_device ccm algif_aead cbc des_generic libdes ecb md4 cmac algif_hash
algif_skcipher af_alg bnep ramoops reed_solomon snd_acp5x_pcm_dma
snd_soc_acp5x_mach snd_acp5x_i2s snd_sof_amd_rembrandt rtw88_8822ce
snd_sof_amd_renoir rtw88_8822c snd_sof_amd_acp rtw88_pci intel_rapl_msr
snd_sof_pci intel_rapl_common rtw88_core snd_sof edac_mce_amd snd_sof_utils
btusb kvm_amd btrtl snd_pci_ps mac80211 snd_hda_codec_hdmi btbcm
snd_soc_cs35l41_spi btintel kvm snd_soc_cs35l41 snd_rpl_pci_acp6x
snd_hda_intel btmtk snd_soc_wm_adsp snd_intel_dspcfg cs_dsp snd_acp_pci
libarc4 leds_steamdeck extcon_steamdeck snd_pci_acp6x snd_intel_sdw_acpi
snd_soc_nau8821 snd_soc_cs35l41_lib steamdeck_hwmon irqbypass bluetooth
snd_hda_codec snd_pci_acp5x snd_soc_core rapl snd_rn_pci_acp3x cfg80211
pcspkr snd_hda_core snd_compress i2c_piix4 mousedev cdc_acm ac97_bus
snd_acp_config joydev ecdh_generic snd_pcm_dmaengine snd_hwdep snd_soc_acpi
[ 257.183627] snd_pci_acp3x snd_pcm dwc3_pci rfkill ina2xx_adc kfifo_buf
snd_timer opt3001 ltrf216a steamdeck spi_amd ina2xx industrialio snd
acpi_cpufreq mac_hid soundcore fuse ip_tables x_tables overlay ext4 crc16
mbcache jbd2 hid_steam usbhid amdgpu vfat fat gpu_sched drm_buddy serio_raw
sdhci_pci nvme_tcp drm_display_helper atkbd cqhci libps2 nvme_fabrics
crct10dif_pclmul vivaldi_fmap crc32_pclmul polyval_clmulni sdhci
polyval_generic cec i8042 gf128mul nvme hid_multitouch drm_ttm_helper
ghash_clmulni_intel xhci_pci sha512_ssse3 nvme_core aesni_intel crypto_simd
sp5100_tco cryptd wdat_wdt ttm xhci_pci_renesas ccp mmc_core nvme_common
serio video i2c_hid_acpi wmi 8250_dw i2c_hid btrfs blake2b_generic xor
raid6_pq libcrc32c crc32c_generic crc32c_intel dm_mirror dm_region_hash
dm_log dm_mod pkcs8_key_parser crypto_user
[ 257.183700] CPU: 6 PID: 745 Comm: kworker/u32:7 Not tainted
6.1.12-valve2-1-neptune-61 #1 4091faa51bd1be3bbac5fd4c3ce3432202f24d92
[ 257.183704] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
[ 257.183708] Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
[ 257.183718] RIP: 0010:kfd_suspend_all_processes+0x100/0x110 [amdgpu]
[ 257.184119] Code: c7 c7 00 b3 3f c1 41 5c 41 5d e9 cb 4f 5f f1 be 03 00 00
00 e8 d1 89 a3 f1 e9 59 ff ff ff 48 c7 c7
14 a2 24 c1 e8 12 d6 06 f2 <0f> 0b e9 24 ff ff ff 0f 0b eb c5 0f 1f 44 00 00
66 0f 1f 00 0f 1f
[ 257.184122] RSP: 0018:ffffad1140f67cf8 EFLAGS: 00010286
[ 257.184125] RAX: 0000000000000000 RBX: ffff993b46b68400 RCX:
0000000000000027
[ 257.184127] RDX: ffff993e6eda0728 RSI: 0000000000000001 RDI:
ffff993e6eda0720
[ 257.184128] RBP: ffff993b44620000 R08: 0000000000000000 R09:
ffffad1140f67b78
[ 257.184130] R10: 0000000000000003 R11: ffff993e7ef7ffe8 R12:
ffffad1140f67dd0
[ 257.184131] R13: 0000000000000000 R14: ffff993b89dbe400 R15:
0000000000000000
[ 257.184133] FS: 0000000000000000(0000) GS:ffff993e6ed80000(0000)
knlGS:0000000000000000
[ 257.184135] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 257.184137] CR2: 000055d62521f000 CR3: 0000000108b04000 CR4:
0000000000350ee0
[ 257.184139] Call Trace:
[ 257.184143] <TASK>
[ 257.184147] kgd2kfd_suspend.part.0+0x3d/0x40 [amdgpu
ad613437896db6c29581f2be9152cc5a6dd35ad7]
[ 257.184571] kgd2kfd_pre_reset+0x47/0x60 [amdgpu
ad613437896db6c29581f2be9152cc5a6dd35ad7]
[ 257.184965] amdgpu_device_gpu_recover.cold+0x119/0xb40 [amdgpu
ad613437896db6c29581f2be9152cc5a6dd35ad7]
[ 257.185430] amdgpu_job_timedout+0x1dc/0x220 [amdgpu
ad613437896db6c29581f2be9152cc5a6dd35ad7]
[ 257.185866] ? try_to_wake_up+0xd9/0x560
[ 257.185874] drm_sched_job_timedout+0x7a/0x110 [gpu_sched
32db77b2b4e1fdeaf45e32d64ce206e5c0ca90ae]
[ 257.185885] process_one_work+0x1c7/0x380
[ 257.185892] worker_thread+0x51/0x390
[ 257.185897] ? rescuer_thread+0x3b0/0x3b0
[ 257.185901] kthread+0xde/0x110
[ 257.185905] ? kthread_complete_and_exit+0x20/0x20
[ 257.185909] ret_from_fork+0x22/0x30
[ 257.185917] </TASK>
[ 257.185918] ---[ end trace 0000000000000000 ]---
[ 257.284610] amdgpu 0000:04:00.0: amdgpu: MODE2 reset
[ 257.294783] amdgpu 0000:04:00.0: amdgpu: GPU reset succeeded, trying to
resume
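For anyone comparing this hang against other reports, the following is a minimal Python sketch that pulls the ring name and sequence value out of the amdgpu_job_timedout message. The regular expression is an assumption derived only from the single sample in this log, not from the driver source, and the function name parse_ring_timeout is hypothetical. It expects the message on one line, so the mail-wrapped text above has to be joined first.

```python
import re

# Pattern inferred from the one timeout message quoted in this report.
# It is an assumption, not the driver's documented output format.
TIMEOUT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+"
    r"\[drm:amdgpu_job_timedout \[amdgpu\]\] \*ERROR\* "
    r"ring (?P<ring>\S+)\s+timeout, "
    r"signaled seq=(?P<signaled>\d+), "
    r"emitted\s*(?P<emitted>\S+)"
)


def parse_ring_timeout(line):
    """Return a dict describing a ring timeout, or None if the line does not match."""
    m = TIMEOUT_RE.search(line)
    if m is None:
        return None
    return {
        "timestamp": float(m["ts"]),
        "ring": m["ring"],
        "signaled_seq": int(m["signaled"]),
        # Kept as a raw string because its format in this log is unusual.
        "emitted": m["emitted"],
    }


if __name__ == "__main__":
    sample = (
        "[ 257.182206] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* "
        "ring gfx_0.0.0 timeout, signaled seq=26043, emitted[64/36172]"
    )
    print(parse_ring_timeout(sample))
```

The emitted field is returned unparsed on purpose, since this report shows it in a bracketed form that may not match other kernels.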
cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-linux-neptune-61 console=tty1 rd.luks=0 rd.lvm=0
rd.md=0 rd.dm=0 rd.systemd.gpt_auto=no amdgpu.noretry=0
amdgpu.ppfeaturemask=0xffffbfff amdgpu.lockup_timeout=20000
amdgpu.job_hang_limit=2 drm.debug=0x1ff amdgpu.debug_evictions=true1
tsc=directsync module_blacklist=tpm log_buf_len=4M amd_iommu=off
amdgpu.gttsize=8128 spi_amd.speed_dev=1 audit=0 fbcon=rotate:1 loglevel=3
splash quiet plymouth.ignore-serial-consoles fbcon=vc:4-6
steamos.efi=PARTUUID=8bdf3e52-bf2f-7c45-9f00-45e568aa5af0
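Since several amdgpu options are set on this command line, a small Python sketch like the one below can list them for comparison between machines. The helper name module_params is hypothetical, and splitting on whitespace is a simplifying assumption: it does not handle quoted values that contain spaces.

```python
def module_params(cmdline, module):
    """Collect `module.key=value` tokens from a kernel command line string.

    Assumes tokens are separated by whitespace and that values are unquoted.
    """
    prefix = module + "."
    params = {}
    for token in cmdline.split():
        if token.startswith(prefix) and "=" in token:
            key, value = token[len(prefix):].split("=", 1)
            params[key] = value
    return params


if __name__ == "__main__":
    # On a live system the string could be read from /proc/cmdline instead.
    sample = (
        "console=tty1 amdgpu.noretry=0 amdgpu.ppfeaturemask=0xffffbfff "
        "amdgpu.lockup_timeout=20000 amdgpu.job_hang_limit=2 spi_amd.speed_dev=1"
    )
    for key, value in module_params(sample, "amdgpu").items():
        print(key, "=", value)
```

Values are kept as strings so that entries are reported exactly as they were passed to the kernel.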
Linux Thorax 6.1.12-valve2-1-neptune-61 #1 SMP PREEMPT_DYNAMIC Mon, 27 Feb 2023
21:06:42 +0000 x86_64 GNU/Linux
Devices:
========
GPU0:
apiVersion = 4206830 (1.3.238)
driverVersion = 96469091 (0x5c00063)
vendorID = 0x1002
deviceID = 0x163f
deviceType = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
deviceName = AMD Custom GPU 0405 (RADV VANGOGH)
driverID = DRIVER_ID_MESA_RADV
driverName = radv
driverInfo = Mesa 23.1.0-devel (git-16283f7b97)
conformanceVersion = 1.3.0.0
deviceUUID = 00000000-0400-0000-0000-000000000000
driverUUID = 414d442d-4d45-5341-2d44-525600000000