[Bug 203779] New: drm:amdgpu_ib_ring_tests [amdgpu] *ERROR* IB test failed on gfx (-110)

bugzilla-daemon at bugzilla.kernel.org
Sun Jun 2 04:27:03 UTC 2019


https://bugzilla.kernel.org/show_bug.cgi?id=203779

            Bug ID: 203779
           Summary: drm:amdgpu_ib_ring_tests [amdgpu] *ERROR* IB test
                    failed on gfx (-110)
           Product: Drivers
           Version: 2.5
    Kernel Version: 5.1.6
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: blocking
          Priority: P1
         Component: Video(DRI - non Intel)
          Assignee: drivers_video-dri at kernel-bugs.osdl.org
          Reporter: gobinda.joy at gmail.com
        Regression: No

My hardware is as follows:
CPU: i7 3770 at stock clock
Motherboard: Gigabyte G1.Sniper 3 latest BIOS available
RAM: 24 GB DDR3 at 1600 MHz
GPU: RX 580 8GB (Sapphire) latest VBIOS

The problem occurs with kernel 5.1.0 or higher (currently 5.1.6): the display
hangs when the amdgpu driver loads. I'm unable to determine whether booting
continues or hangs as well. Disk activity stops after a couple of seconds and
it is not possible to switch TTYs.
Ctrl+Alt+Del is unresponsive as well.

This problem goes away when amdgpu.dpm=0 is used, but in that case dynamic power
scaling is not available and the GPU is stuck at a low clock, so graphics
performance is abysmal. Also, the GPU temperature/fan speed utilities don't work.
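
A minimal sketch, assuming Python 3 is available on the affected machine, for
confirming that the amdgpu.dpm=0 workaround is really present on a kernel
command line. The helper name amdgpu_dpm_setting and the sample command line
are hypothetical and only for illustration; on a running system the string
would normally come from /proc/cmdline.

```python
def amdgpu_dpm_setting(cmdline):
    """Return the value given for amdgpu.dpm on a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "amdgpu.dpm" and sep:
            value = val  # keep the last occurrence found
    return value


# Hypothetical command line; on a live system use open("/proc/cmdline").read()
sample = "root=/dev/sdX ro quiet amdgpu.dpm=0"
print(amdgpu_dpm_setting(sample))
```

If this reports None on a boot that does not hang, the parameter was probably
not passed the way it was intended.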

Here is the excerpt of the problematic log lines:

Jun 02 09:54:05 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:06 kernel: amdgpu: [powerplay] 
                         failed to send message 15b ret is 65535 
Jun 02 09:54:06 kernel: hrtimer: interrupt took 287743313 ns
Jun 02 09:54:06 kernel: clocksource: timekeeping watchdog on CPU3: Marking
clocksource 'tsc' as unstable because the skew is too large:
Jun 02 09:54:06 kernel: clocksource:                       'hpet' wd_now:
628dd7b wd_last: 5fef431 mask: ffffffff
Jun 02 09:54:06 kernel: clocksource:                       'tsc' cs_now:
254aa24747 cs_last: 25104a5bfd mask: ffffffffffffffff
Jun 02 09:54:06 kernel: tsc: Marking TSC unstable due to clocksource watchdog
Jun 02 09:54:07 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:07 kernel: amdgpu: [powerplay] 
                         failed to send message 148 ret is 65535 
Jun 02 09:54:07 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:07 kernel: amdgpu: [powerplay] 
                         failed to send message 145 ret is 65535 
Jun 02 09:54:08 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:08 kernel: TSC found unstable after boot, most likely due to
broken BIOS. Use 'tsc=unstable'.
Jun 02 09:54:08 kernel: sched_clock: Marking unstable (8791691311,
362291)<-(8817904668, -25851212)
Jun 02 09:54:08 kernel: amdgpu: [powerplay] 
                         failed to send message 146 ret is 65535 
Jun 02 09:54:08 kernel: hid-generic 0003:09DA:FC7C.0003: input,hidraw2: USB HID
v1.11 Mouse [COMPANY USB Device] on usb-0000:00:1a.0-1.5.3/input0
Jun 02 09:54:09 kernel: hid-generic 0003:09DA:FC7C.0004: hiddev97,hidraw3: USB
HID v1.11 Device [COMPANY USB Device] on usb-0000:00:1a.0-1.5.3/input1
Jun 02 09:54:11 kernel: clocksource: Switched to clocksource hpet
Jun 02 09:54:13 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:13 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:14 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:15 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:15 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:15 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:15 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         last message was failed ret is 65535
Jun 02 09:54:16 kernel: amdgpu: [powerplay] 
                         failed to send message 260 ret is 65535 
Jun 02 09:54:17 kernel: [drm] Initialized amdgpu 3.30.0 20150101 for
0000:04:00.0 on minor 0
Jun 02 09:54:17 kernel: EXT4-fs (sda3): mounted filesystem with ordered data
mode. Opts: (null)
Jun 02 09:54:20 kernel: amdgpu 0000:04:00.0: [drm:amdgpu_ib_ring_tests
[amdgpu]] *ERROR* IB test failed on gfx (-110).
Jun 02 09:54:21 kernel: [drm:amdgpu_device_ip_late_init_func_handler [amdgpu]]
*ERROR* ib ring test failed (-110).
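
To make the excerpt easier to compare between boots, here is a rough sketch
(Python 3, standard library only) that tallies the powerplay "failed to send
message" lines per message ID. The function name tally_failed_messages and the
regular expression are my own assumptions about the log format, based only on
the lines shown above; the sample text is a shortened copy of those lines.

```python
import re
from collections import Counter

# Matches e.g. "failed to send message 15b ret is 65535"
FAILED_SEND = re.compile(r"failed to send message (\w+) ret is (\d+)")


def tally_failed_messages(log_text):
    """Count failed powerplay messages, keyed by message ID as logged."""
    return Counter(m.group(1) for m in FAILED_SEND.finditer(log_text))


# Shortened copy of lines from the excerpt above
SAMPLE = """\
Jun 02 09:54:06 kernel: amdgpu: [powerplay]
                         failed to send message 15b ret is 65535
Jun 02 09:54:07 kernel: amdgpu: [powerplay]
                         failed to send message 148 ret is 65535
Jun 02 09:54:13 kernel: amdgpu: [powerplay]
                         failed to send message 260 ret is 65535
Jun 02 09:54:14 kernel: amdgpu: [powerplay]
                         failed to send message 260 ret is 65535
"""

counts = tally_failed_messages(SAMPLE)
for msg_id, n in counts.most_common():
    print(msg_id, n)
```

In the excerpt above, message 260 accounts for most of the failures; 15b, 148,
145 and 146 each appear once.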

Any help is appreciated. Also let me know if I can help in any way.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

