[Bug 105733] Amdgpu randomly hangs and only ssh works. Mouse cursor moves sometimes but does nothing. Keyboard stops working.

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Sat Feb 23 12:14:25 UTC 2019


https://bugs.freedesktop.org/show_bug.cgi?id=105733

--- Comment #73 from Mauro Gaspari <ilvipero at gmx.com> ---
This problem affects me as well, and has for quite some time.
My setup:
CPU AMD Ryzen 7 2700X
RAM 64GB DDR4 3200
GPU AMD Radeon RX Vega 64

Since this issue has plagued me for quite a while, I even tried installing
Windows 10, and I can confirm there are no issues at all there. Having said
that, AMD drivers were quite bad at the Vega launch on Windows too.

In my experience the bug comes and goes with the Mesa version in use, or with
the combination of kernel and Mesa. I can reproduce the issue easily by playing
some games. Some extra tests I ran to make sure it was not a hardware issue or
a game issue:
- The same games work fine on Windows on the same hardware, with the same BIOS
settings, etc.
- The same games work fine on my Nvidia+Intel based laptop, running the same
Linux distributions and kernels.

For example, Kubuntu 18.04.1 with the open-source AMDGPU drivers was free of
the bug for a very long time. Then, a couple of weeks ago, a Mesa update came
and I started having the freeze again.
I tried upgrading to 18.10 and still had the freeze. I added the oibaf PPA and
the issue was gone. After a few weeks an update came and the issue started
happening again. I am now using the padoka PPA but still have the freeze.
The same problem also happens for me on openSUSE Tumbleweed and Arch on the
same machine.

I tried disabling the compositor, disabling vsync, changing the compositor in
KDE Plasma, and running the game in windowed mode instead of full screen.
Nothing helped.

Also, please note that before upgrading my CPU and motherboard, I was running
the RX Vega 64 on an Intel CPU, and I had the same issues.

Below is some info I saved a while back when running openSUSE Tumbleweed. If
needed, I can grab more recent logs and system info and post them.
I am also going to try installing Kubuntu 18.04.1 with the proprietary
AMDGPU-PRO drivers to see if there is any difference.


--- First time I noticed the issue:

OS: OpenSUSE tumbleweed x86_64 updated (2018 04 21)
Kernel: 4.16.2-1-default
Desktop Environment: KDE Plasma (x11)
OpenGL version string: 3.0 Mesa 18.0.0
GPU: AMD Radeon RX Vega 64 8GB

Apr 21 17:08:34 STUDIO kernel: [drm:gfx_v9_0_priv_reg_irq [amdgpu]] *ERROR* Illegal register access in command stream
Apr 21 17:08:34 STUDIO kernel: [drm] No hardware hang detected. Did some blocks stall?
Apr 21 17:08:44 STUDIO kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled seq=128859, last emitted seq=128861
Apr 21 17:08:44 STUDIO kernel: [drm] No hardware hang detected. Did some blocks stall?
-- Reboot --


Dmesg lines related to amdgpu:

[    3.407020] [drm] amdgpu kernel modesetting enabled.
[    3.411462] fb: switching to amdgpudrmfb from VESA VGA
[    3.426163] amdgpu 0000:04:00.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff
[    3.426261] amdgpu 0000:04:00.0: VRAM: 8176M 0x000000F400000000 - 0x000000F5FEFFFFFF (8176M used)
[    3.426263] amdgpu 0000:04:00.0: GTT: 256M 0x000000F600000000 - 0x000000F60FFFFFFF
[    3.426371] [drm] amdgpu: 8176M of VRAM memory ready
[    3.426372] [drm] amdgpu: 8176M of GTT memory ready.
[    4.031665] fbcon: amdgpudrmfb (fb0) is primary device
[    4.083803] amdgpu 0000:04:00.0: fb0: amdgpudrmfb frame buffer device
[    4.096086] amdgpu 0000:04:00.0: ring 0(gfx) uses VM inv eng 4 on hub 0
[    4.096088] amdgpu 0000:04:00.0: ring 1(comp_1.0.0) uses VM inv eng 5 on hub 0
[    4.096089] amdgpu 0000:04:00.0: ring 2(comp_1.1.0) uses VM inv eng 6 on hub 0
[    4.096090] amdgpu 0000:04:00.0: ring 3(comp_1.2.0) uses VM inv eng 7 on hub 0
[    4.096091] amdgpu 0000:04:00.0: ring 4(comp_1.3.0) uses VM inv eng 8 on hub 0
[    4.096093] amdgpu 0000:04:00.0: ring 5(comp_1.0.1) uses VM inv eng 9 on hub 0
[    4.096094] amdgpu 0000:04:00.0: ring 6(comp_1.1.1) uses VM inv eng 10 on hub 0
[    4.096095] amdgpu 0000:04:00.0: ring 7(comp_1.2.1) uses VM inv eng 11 on hub 0
[    4.096096] amdgpu 0000:04:00.0: ring 8(comp_1.3.1) uses VM inv eng 12 on hub 0
[    4.096098] amdgpu 0000:04:00.0: ring 9(kiq_2.1.0) uses VM inv eng 13 on hub 0
[    4.096099] amdgpu 0000:04:00.0: ring 10(sdma0) uses VM inv eng 4 on hub 1
[    4.096100] amdgpu 0000:04:00.0: ring 11(sdma1) uses VM inv eng 5 on hub 1
[    4.096101] amdgpu 0000:04:00.0: ring 12(uvd) uses VM inv eng 6 on hub 1
[    4.096103] amdgpu 0000:04:00.0: ring 13(uvd_enc0) uses VM inv eng 7 on hub 1
[    4.096104] amdgpu 0000:04:00.0: ring 14(uvd_enc1) uses VM inv eng 8 on hub 1
[    4.096105] amdgpu 0000:04:00.0: ring 15(vce0) uses VM inv eng 9 on hub 1
[    4.096107] amdgpu 0000:04:00.0: ring 16(vce1) uses VM inv eng 10 on hub 1
[    4.096108] amdgpu 0000:04:00.0: ring 17(vce2) uses VM inv eng 11 on hub 1
[    4.096662] [drm] Initialized amdgpu 3.23.0 20150101 for 0000:04:00.0 on minor 0


--- It was identified as this bug:
https://bugs.freedesktop.org/show_bug.cgi?id=105317
After I upgraded Tumbleweed to Mesa 18.0.1 the issue was gone.


--- Later on I had the same bug again.
OS: OpenSUSE tumbleweed x86_64 updated (2018 08 10)
Kernel: 4.17.2-1-default
Desktop Environment: KDE Plasma (x11)
OpenGL version string: 3.1 Mesa 18.1.5
GPU: AMD Radeon RX Vega 64 8GB


Relevant log lines I found during the freeze:

2018-08-09T23:16:53.103775+08:00 MGDT-Tumbleweed kernel: [ 6305.852703] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled seq=1745163, last emitted seq=1745165
2018-08-09T23:16:53.103795+08:00 MGDT-Tumbleweed kernel: [ 6305.852704] [drm] No hardware hang detected. Did some blocks stall?
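
For anyone comparing logs across kernel and Mesa versions, here is a rough
sketch (not part of the original report) of how the "ring ... timeout" lines
could be pulled out of a saved log. The function name find_ring_timeouts and
the "pending" field are made up for illustration. I am assuming "pending" can
be read as emitted minus signaled, i.e. the number of fences submitted but not
yet completed. The pattern tolerates the line wrapping that mail clients add.

```python
import re

# Matches the amdgpu_job_timedout message as it appears in the logs quoted in
# this comment. Whitespace is matched loosely so wrapped lines still match.
TIMEOUT_RE = re.compile(
    r"ring\s+(?P<ring>\S+)\s+timeout,"
    r"\s+last\s+signaled\s+seq=\s*(?P<signaled>\d+),"
    r"\s+last\s+emitted\s+seq=\s*(?P<emitted>\d+)"
)

def find_ring_timeouts(log_text):
    """Return one dict per ring timeout message found in log_text."""
    results = []
    for match in TIMEOUT_RE.finditer(log_text):
        signaled = int(match.group("signaled"))
        emitted = int(match.group("emitted"))
        results.append({
            "ring": match.group("ring"),
            "signaled": signaled,
            "emitted": emitted,
            # Assumption: fences emitted but not yet signaled.
            "pending": emitted - signaled,
        })
    return results

# The two timeout messages quoted in this comment, wrapped as in the mail.
SAMPLE = """\
Apr 21 17:08:44 STUDIO kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring
gfx timeout, last signaled seq=128859, last emitted seq=128861
2018-08-09T23:16:53.103775+08:00 MGDT-Tumbleweed kernel: [ 6305.852703]
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled
seq=1745163, last emitted seq=
1745165
"""

if __name__ == "__main__":
    for hit in find_ring_timeouts(SAMPLE):
        print(hit)
```

In both freezes quoted here the gfx ring is two sequence numbers behind when
the timeout fires.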


Dmesg lines related to amdgpu:

[    3.130759] [drm] amdgpu kernel modesetting enabled.
[    3.135770] fb: switching to amdgpudrmfb from EFI VGA
[    3.136106] amdgpu 0000:03:00.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff
[    3.136171] amdgpu 0000:03:00.0: VRAM: 8176M 0x000000F400000000 - 0x000000F5FEFFFFFF (8176M used)
[    3.136173] amdgpu 0000:03:00.0: GTT: 512M 0x000000F600000000 - 0x000000F61FFFFFFF
[    3.136494] [drm] amdgpu: 8176M of VRAM memory ready
[    3.136495] [drm] amdgpu: 8176M of GTT memory ready.
[    4.114469] fbcon: amdgpudrmfb (fb0) is primary device
[    4.141179] amdgpu 0000:03:00.0: fb0: amdgpudrmfb frame buffer device
[    4.164072] amdgpu 0000:03:00.0: ring 0(gfx) uses VM inv eng 4 on hub 0
[    4.164074] amdgpu 0000:03:00.0: ring 1(comp_1.0.0) uses VM inv eng 5 on hub 0
[    4.164075] amdgpu 0000:03:00.0: ring 2(comp_1.1.0) uses VM inv eng 6 on hub 0
[    4.164075] amdgpu 0000:03:00.0: ring 3(comp_1.2.0) uses VM inv eng 7 on hub 0
[    4.164076] amdgpu 0000:03:00.0: ring 4(comp_1.3.0) uses VM inv eng 8 on hub 0
[    4.164077] amdgpu 0000:03:00.0: ring 5(comp_1.0.1) uses VM inv eng 9 on hub 0
[    4.164078] amdgpu 0000:03:00.0: ring 6(comp_1.1.1) uses VM inv eng 10 on hub 0
[    4.164079] amdgpu 0000:03:00.0: ring 7(comp_1.2.1) uses VM inv eng 11 on hub 0
[    4.164079] amdgpu 0000:03:00.0: ring 8(comp_1.3.1) uses VM inv eng 12 on hub 0
[    4.164080] amdgpu 0000:03:00.0: ring 9(kiq_2.1.0) uses VM inv eng 13 on hub 0
[    4.164081] amdgpu 0000:03:00.0: ring 10(sdma0) uses VM inv eng 4 on hub 1
[    4.164082] amdgpu 0000:03:00.0: ring 11(sdma1) uses VM inv eng 5 on hub 1
[    4.164083] amdgpu 0000:03:00.0: ring 12(uvd) uses VM inv eng 6 on hub 1
[    4.164084] amdgpu 0000:03:00.0: ring 13(uvd_enc0) uses VM inv eng 7 on hub 1
[    4.164085] amdgpu 0000:03:00.0: ring 14(uvd_enc1) uses VM inv eng 8 on hub 1
[    4.164085] amdgpu 0000:03:00.0: ring 15(vce0) uses VM inv eng 9 on hub 1
[    4.164086] amdgpu 0000:03:00.0: ring 16(vce1) uses VM inv eng 10 on hub 1
[    4.164087] amdgpu 0000:03:00.0: ring 17(vce2) uses VM inv eng 11 on hub 1
[    4.164553] [drm] Initialized amdgpu 3.25.0 20150101 for 0000:03:00.0 on minor 0
