[Bug 110822] booting with kernel version 5.1.0 or higher on RX 580 hangs

bugzilla-daemon at freedesktop.org
Mon Jun 3 09:37:22 UTC 2019


https://bugs.freedesktop.org/show_bug.cgi?id=110822

            Bug ID: 110822
           Summary: booting with kernel version 5.1.0 or higher on RX 580
                    hangs
           Product: DRI
           Version: unspecified
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: blocker
          Priority: medium
         Component: DRM/AMDgpu
          Assignee: dri-devel at lists.freedesktop.org
          Reporter: gobinda.joy at gmail.com

Created attachment 144420
  --> https://bugs.freedesktop.org/attachment.cgi?id=144420&action=edit
Linux version 5.1.6-350.vanilla.knurd.1.fc30.x86_64

My hardware is as follows:
CPU: i7 3770 at stock clock
Motherboard: Gigabyte G1.Sniper 3, latest available BIOS
RAM: 24 GB DDR3 at 1600 MHz
GPU: RX 580 8 GB (Sapphire), latest VBIOS

The problem occurs with kernel 5.1.0 or higher (currently 5.1.6): the display
hangs when the amdgpu driver loads. I'm unable to determine whether booting
continues or hangs as well. Disk activity stops after a couple of seconds and
it is not possible to switch TTYs.
Ctrl+Alt+Del is unresponsive as well.

This problem goes away when amdgpu.dpm=0 is used, but in that case dynamic power
scaling is not available, the GPU is stuck at a low clock, and graphics
performance is abysmal. Also, GPU temperature/fan speed utilities don't work.
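
To confirm that the workaround is actually in effect on a given boot, the
kernel command line can be inspected. The following is a minimal sketch, not
part of the original report: the function name amdgpu_dpm_setting and the
sample command line are hypothetical, and it assumes the last occurrence of
the parameter is the one that counts.

```python
from typing import Optional


def amdgpu_dpm_setting(cmdline: str) -> Optional[str]:
    """Return the value given for amdgpu.dpm on a kernel command line.

    Returns None when the parameter is absent. If it appears more than
    once, the last value is returned (an assumption of this sketch).
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "amdgpu.dpm" and sep:
            value = val
    return value


# Hypothetical command line, for illustration only.
sample_cmdline = "BOOT_IMAGE=/vmlinuz root=/dev/sda3 ro amdgpu.dpm=0"
print(amdgpu_dpm_setting(sample_cmdline))
```

On a running Linux system the string to pass in can be read from
/proc/cmdline.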

Here is the excerpt of the problematic log lines:

Jun 02 09:54:05 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:06 kernel: amdgpu: [powerplay] 
                         failed to send message 15b ret is 65535 
Jun 02 09:54:06 kernel: hrtimer: interrupt took 287743313 ns
Jun 02 09:54:06 kernel: clocksource: timekeeping watchdog on CPU3: Marking clocksource 'tsc' as unstable because the skew is too large:
Jun 02 09:54:06 kernel: clocksource:                       'hpet' wd_now: 628dd7b wd_last: 5fef431 mask: ffffffff
Jun 02 09:54:06 kernel: clocksource:                       'tsc' cs_now: 254aa24747 cs_last: 25104a5bfd mask: ffffffffffffffff
Jun 02 09:54:06 kernel: tsc: Marking TSC unstable due to clocksource watchdog
Jun 02 09:54:07 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:07 kernel: amdgpu: [powerplay] 
                         failed to send message 148 ret is 65535 
Jun 02 09:54:07 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:07 kernel: amdgpu: [powerplay] 
                         failed to send message 145 ret is 65535 
Jun 02 09:54:08 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:08 kernel: TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
Jun 02 09:54:08 kernel: sched_clock: Marking unstable (8791691311, 362291)<-(8817904668, -25851212)
Jun 02 09:54:08 kernel: amdgpu: [powerplay] 
                         failed to send message 146 ret is 65535 
Jun 02 09:54:08 kernel: hid-generic 0003:09DA:FC7C.0003: input,hidraw2: USB HID v1.11 Mouse [COMPANY USB Device] on usb-0000:00:1a.0-1.5.3/input0
Jun 02 09:54:09 kernel: hid-generic 0003:09DA:FC7C.0004: hiddev97,hidraw3: USB HID v1.11 Device [COMPANY USB Device] on usb-0000:00:1a.0-1.5.3/input1
Jun 02 09:54:11 kernel: clocksource: Switched to clocksource hpet
Jun 02 09:54:13 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:13 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:15 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:15 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:15 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:15 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:17 kernel: [drm] Initialized amdgpu 3.30.0 20150101 for 0000:04:00.0 on minor 0
Jun 02 09:54:17 kernel: EXT4-fs (sda3): mounted filesystem with ordered data mode. Opts: (null)
Jun 02 09:54:20 kernel: amdgpu 0000:04:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on gfx (-110).
Jun 02 09:54:21 kernel: [drm:amdgpu_device_ip_late_init_func_handler [amdgpu]] *ERROR* ib ring test failed (-110).
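
Since the excerpt above is dominated by repeated powerplay failures, tallying
them by message ID makes the pattern easier to see. The following is a minimal
sketch, not part of the original report: the function name
count_failed_messages is hypothetical, the sample text is a shortened copy of
lines from the excerpt, and the IDs are assumed to be hexadecimal (as "15b"
suggests).

```python
import re
from collections import Counter

# Matches lines such as "failed to send message 15b ret is 65535".
FAILED_MSG = re.compile(r"failed to send message ([0-9a-f]+) ret is (\d+)")


def count_failed_messages(log_text: str) -> Counter:
    """Count failed powerplay messages in a log, keyed by message ID."""
    return Counter(m.group(1) for m in FAILED_MSG.finditer(log_text))


# Shortened sample taken from the excerpt above.
sample = """\
Jun 02 09:54:06 kernel: amdgpu: [powerplay]
                         failed to send message 15b ret is 65535
Jun 02 09:54:13 kernel: amdgpu: [powerplay]
                         failed to send message 260 ret is 65535
Jun 02 09:54:14 kernel: amdgpu: [powerplay]
                         failed to send message 260 ret is 65535
"""

for msg_id, count in count_failed_messages(sample).most_common():
    print(msg_id, count)
```

Running the same function over the full attached log would show how often each
message ID fails.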

Any help is appreciated. Also let me know if I can help in any way.
