[Bug 216645] New: Fence fallback timer expired on ring gfx
bugzilla-daemon at kernel.org
Mon Oct 31 13:22:31 UTC 2022
https://bugzilla.kernel.org/show_bug.cgi?id=216645
Bug ID: 216645
Summary: Fence fallback timer expired on ring gfx
Product: Drivers
Version: 2.5
Kernel Version: 5.15.0-43-generic
Hardware: All
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri at kernel-bugs.osdl.org
Reporter: ask4support at email.cz
Regression: No
Created attachment 303109
--> https://bugzilla.kernel.org/attachment.cgi?id=303109&action=edit
Kernel log created by the script in the menu entry
Sometimes when I run KDE System Monitor or Chrome, my laptop freezes and
won't unfreeze until I reboot (well, after a while I can move the mouse
cursor, but that's all I can do).
I'm using a Dell G5 SE 5505 with an AMD Ryzen 7 4800H CPU, Radeon RX Vega 7
as the iGPU and AMD Radeon RX 5600M as the dGPU.
I've searched through existing bugs and found that it might be related to
interrupts. With that in mind, I've compiled a list of kernel parameters that
might be relevant and tested each of them:
PW = Probably Working, NW = Not Working, NB = Not Booting
PW pcie_port_pm=off
PW amdgpu.msi=0
NW amd_iommu=fullflush
NW amd_iommu=force_isolation
NW amd_iommu=off
NW amd_iommu_intr=legacy
NW amd_iommu_intr=vapic kvm-amd.avic=1
NW iommu=off
NW iommu=force
NW iommu=noforce
NW iommu=biomerge
NW iommu=merge
NW iommu=nomerge
NW iommu=forcesac
NW iommu=soft
NW iommu=pt
NW irqfixup
NW irqpoll
NW nointremap
NW pcie_port_pm=force
NW amdgpu.pcie_gen2=1
NW amdgpu.pcie_gen2=0
NW amdgpu.msi=1
NW amdgpu.lockup_timeout=1000
NW amdgpu.lockup_timeout=100
NW amdgpu.aspm=1
NW amdgpu.aspm=0
NW amdgpu.bapm=1
NW amdgpu.bapm=0
NW amdgpu.ppfeaturemask=0xfff7bff7
NW amdgpu.ppfeaturemask=0xfff7bdff
NW amdgpu.ppfeaturemask=0xfff7bbff
NW amdgpu.ppfeaturemask=0xfff73fff
NW amdgpu.ppfeaturemask=0xfff3bfff
NW amdgpu.exp_hw_support=1
NW amdgpu.exp_hw_support=0
NW amdgpu.forcelongtraining=0
NW amdgpu.forcelongtraining=1
NW amdgpu.cg_mask=0x00000000
NW amdgpu.cg_mask=0xffffffff
NW amdgpu.pg_mask=0xffffffff
NW amdgpu.ngg=1
NW amdgpu.ngg=0
NW amdgpu.job_hang_limit=1000
NW amdgpu.job_hang_limit=100
NW amdgpu.lbpw=1
NW amdgpu.lbpw=0
NW amdgpu.gpu_recovery=1
NW amdgpu.gpu_recovery=0
NW amdgpu.sched_policy=2
NW amdgpu.sched_policy=1
NW amdgpu.sched_policy=0
NW amdgpu.ignore_crat=0
NW amdgpu.ignore_crat=1
NW amdgpu.ras_enable=0
NW amdgpu.ras_enable=1
NW amdgpu.async_gfx_ring=0
NW amdgpu.async_gfx_ring=1
NW amdgpu.mcbp=1
NW amdgpu.mcbp=0
NW amdgpu.mes=0
NW amdgpu.mes_kiq=1
NW amdgpu.mes_kiq=0
NW amdgpu.reset_method=0
NW amdgpu.reset_method=1
NW amdgpu.reset_method=2
NW amdgpu.reset_method=3
NW amdgpu.reset_method=4
NW amdgpu.reset_method=-1
NW idle=nomwait
NB amdgpu.pg_mask=0x00000000
NB amdgpu.mes=1
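When repeating tests like the ones listed above, it helps to confirm that the parameter under test actually reached the kernel for the current boot. The following is a minimal sketch, not part of the original report: `has_kernel_param` is a hypothetical helper name, and it assumes the running kernel exposes its command line in /proc/cmdline.

```shell
#!/bin/sh
# has_kernel_param PARAM [CMDLINE]
# Hypothetical helper: succeeds if PARAM appears as a whole word on the
# kernel command line. CMDLINE defaults to the contents of /proc/cmdline.
has_kernel_param() {
    param=$1
    cmdline=${2:-$(cat /proc/cmdline)}
    # Pad with spaces so that only complete words match
    # (amdgpu.msi=0 must not match amdgpu.msi=01).
    case " $cmdline " in
        *" $param "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `has_kernel_param amdgpu.msi=0 && echo "workaround active"` prints the message only when that exact parameter was passed at boot.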
I've developed a script and a GRUB2 menu entry for live Kubuntu that triggers
the freeze and saves the dmesg output into a file called
Freeze_Dell_G5_SE_5505.sh.log at the root of the drive it is booted from.
Replace the value of the ISO variable with the path to your ISO file if it is
not in the root directory of the drive and/or if it is a different version:
menuentry "Start Kubuntu 22.04.1 (64 bit) without Ubiquity and with a freezing script" {
ISO=/kubuntu-22.04.1-desktop-amd64.iso
set gfxpayload=keep
loopback loop "$ISO"
probe -u $root --set=rootid
linux (loop)/casper/vmlinuz iso-scan/filename="$ISO" file=/cdrom/preseed/kubuntu.seed maybe-ubiquity quiet splash init=/bin/sh -- -c 'for script in /home/kubuntu/Desktop/Freeze_Dell_G5_SE_5505.sh ; do for autorun in /home/kubuntu/.config/autostart/${script##*/} ; do ln -fs /dev/null /etc/systemd/system/graphical.target.wants/ubiquity.service ; mkdir -p ${script%/*} ${autorun%/*} ; printf \043!_/bin/sh++print\050\051_{+\tprintf_"@1"_,_seq_-s"_"_@\050\050_@\050stty_size_\074_@t_?_sed_"s/^/\050/,_s/_/_-_1_\051_*_/"\051_-_@{\0431}_\051\051_?_sed_s/[0-9]//g+}+t\075"@\050readlink_/proc/self/fd/0\051"++d\075"@\050env_LANG\075C_udisksctl_mount_-b_/dev/disk/by-uuid/$0_-o_sync_2\076_/dev/null_?_sed_"s/^Mounted_.*_at_//g,_s/\\.@//g"\051"+[_-d_"@d"_]_\046\046_f\075oflag\075direct_??_d\075"@{0%%/*}"+sudo_dmesg_-w_?_sudo_dd_of\075"@d/@{0\043\043*/}.log"_@f_\046+i\0750+seq_28_150000_?_while_read_N_,_do+\tprint_@N+\ttimeout_3_env_DISPLAY\075:0_plasma-systemmonitor_\076_/dev/null_2\076\0461+\tn\075@N_,_while_[_0_-lt_@n_]_,_do+\t\tsleep_1+\t\tn\075@\050\050_@n_-_1_\051\051+\t\ti\075@\050\050_@i_^_1_\051\051+\t\t[_"@i"_\075_1_]_\046\046_printf_"\\33[30m\\33[47m"_??_printf_"\\33[37m\\33[40m"+\t\tprint_@n+\tdone+done++echo_END!+exit+ | tr _,?@+ \40\73\174\044\n > $script ; printf [Desktop_Entry]\nType=Application\nExec=kstart_--maximize_--_konsole_-e_ | tr _ \40 > ${autorun%.sh}.desktop ; printf $script\n >> ${autorun%.sh}.desktop ; chmod +x $script ${autorun%.sh}.desktop ; chown -R kubuntu:kubuntu /home/kubuntu ; exec /sbin/init maybe-ubiquity splash --- ; done ; done' $rootid
initrd (loop)/casper/initrd
}
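The menu entry above carries the test script in encoded form: judging from its `tr _,?@+ \40\73\174\044\n` call, `_` stands for a space, `,` for `;`, `?` for `|`, `@` for `$` and `+` for a newline, while octal escapes such as `\043` (`#`) and `\050` (`(`) are expanded by printf. The sketch below reproduces that decoding step on its own so the encoded text can be read more easily; `decode` is a hypothetical name that does not appear in the original.

```shell
#!/bin/sh
# decode ENCODED
# Hypothetical helper mirroring the menu entry's "printf ... | tr ..." step.
# The argument is deliberately used as the printf format string, as in the
# original, so octal escapes (\043, \050, ...) and \t are expanded and a
# literal percent sign has to be written as %%.
decode() {
    printf "$1" | tr '_,?@+' '\40\73\174\044\n'
}
```

Calling `decode 'seq_28_150000_?_while_read_N_,_do'` yields `seq 28 150000 | while read N ; do`, which is the start of the loop that drives the test.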
The script generated on the live Kubuntu desktop runs KDE's System Monitor
for three seconds and then waits before running it again. With each
iteration, it waits one second longer than before. A parameter passed the
test if, for five boots in a row, the system did not freeze before the script
reached a 50-second wait (I'd now recommend 60, as with 50 it sometimes froze
after the second boot).
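For readers who do not want to decode the menu entry, the core of the trigger loop can be sketched in plain sh as below. This is a simplified reading of the generated script, not a copy of it: the dmesg logging and the terminal progress bar are left out, and `trigger_loop`, `MONITOR_CMD` and `SLEEP_CMD` are hypothetical names introduced here. The starting wait of 28 seconds and the upper bound of 150000 come from the `seq 28 150000` call in the encoded script.

```shell
#!/bin/sh
# Command that starts System Monitor and kills it after 3 seconds.
# Assumes a Plasma session on display :0, as in the original script.
MONITOR_CMD=${MONITOR_CMD:-"timeout 3 env DISPLAY=:0 plasma-systemmonitor"}
# Overridable so the loop can be dry-run without real delays.
SLEEP_CMD=${SLEEP_CMD:-sleep}

# trigger_loop FIRST_WAIT LAST_WAIT
# Runs the monitor, waits FIRST_WAIT seconds, runs it again, waits one
# second longer, and so on until the wait would exceed LAST_WAIT.
trigger_loop() {
    wait_s=$1
    last=$2
    while [ "$wait_s" -le "$last" ]; do
        echo "waiting $wait_s s after this run"
        $MONITOR_CMD >/dev/null 2>&1
        n=$wait_s
        while [ "$n" -gt 0 ]; do
            $SLEEP_CMD 1
            n=$((n - 1))
        done
        wait_s=$((wait_s + 1))
    done
    echo "END!"
}
```

Running `trigger_loop 28 150000` matches the range used in the report. The last "waiting" line printed before a freeze shows how long a wait the system survived.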
Could someone also tell us which workaround should be used under which
performance/latency requirements? ("Maybe wrong but still an" EXAMPLE: Users
who need the best performance or lowest latency should use pcie_port_pm=off,
users who need the best battery life should use amdgpu.msi=0.)
If you fix the issue, could you please tell the users (not just developers)
what the problem was? ("Maybe wrong but still an" EXAMPLE: The driver was
waiting for an interrupt, but the bus was down, therefore the
message-signalled interrupt could not arrive and the operation timed out.)
Thanks.