[Bug 99049] Machine freeze when clock are set to defaults

Sat Dec 10 21:41:57 UTC 2016

https://bugs.freedesktop.org/show_bug.cgi?id=99049

            Bug ID: 99049
           Summary: Machine freeze when clock are set to defaults
           Product: DRI
           Version: XOrg git
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: DRM/Radeon
          Assignee: dri-devel at lists.freedesktop.org
          Reporter: ls at maxux.net

My laptop contains theses cards:
----
00:02.0 VGA compatible controller: Intel Corporation HD Graphics 520 (rev 07)
01:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Sun XT
[Radeon HD 8670A/8670M/8690M / R5 M330] (rev 81)
----

Under Gentoo, I enabled radeon and radeonsi as videos cards, everything looks
fine. According to xrandr, I have providers:
----
Provider 0: id: 0x76 cap: 0xb, Source Output, Sink Output, Sink Offload crtcs:
4 outputs: 3 associated providers: 0 name:Intel
Provider 1: id: 0x4f cap: 0xd, Source Output, Source Offload, Sink Offload
crtcs: 0 outputs: 0 associated providers: 0 name:HAINAN @ pci:0000:01:00.0
----

I found on the internet that DPM could cause issue, I tried: radeon.runpm=0
radeon.dpm=0 (see below)

Using theses settings, I don't have any freeze when I try to use the Radeon
Card (using DRI_PRIME=1), but I found that everything was slow. I checked, and
I saw:
---
cat /sys/class/drm/card1/device/power_method
profile

cat /sys/class/drm/card1/device/power_profile
default

cat /sys/kernel/debug/dri/65/radeon_pm_info
default engine clock: 1070000 kHz
current engine clock: 299990 kHz
default memory clock: 900000 kHz
current memory clock: 298990 kHz
voltage: 1150 mV
PCIE lanes: 4
---

The card is running in low profile by default, don't know why. Setting
power_profile to high, mid or low doesn't change anything, but if I set
power_profile back to default again, the clock is set to full speed.

When clock is set to full speed, my system freeze if I try to run any 3D
application (glxgears or a game using wine). Here is the dmesg log:
---
radeon 0000:01:00.0: ring 0 stalled for more than 10436msec
radeon 0000:01:00.0: GPU lockup (current fence id 0x00000000000014f4 last fence
id 0x00000000000014f6 on ring 0)
radeon 0000:01:00.0: Saved 49 dwords of commands on ring 0.
radeon 0000:01:00.0: GPU softreset: 0x00000049
radeon 0000:01:00.0:   GRBM_STATUS               = 0xE5D04028
radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0xEE400000
radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000006
radeon 0000:01:00.0:   SRBM_STATUS               = 0x200000C0
radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00018000
radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00008000
radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x80030243
radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
radeon 0000:01:00.0: GRBM_SOFT_RESET=0x0000DDFF
radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100
radeon 0000:01:00.0:   GRBM_STATUS               = 0x00003028
radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000006
radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000006
radeon 0000:01:00.0:   SRBM_STATUS               = 0x200000C0
radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x00000000
radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
radeon 0000:01:00.0: GPU reset succeeded, trying to resume
[drm] probing gen 2 caps for device 8086:9d10 = 1724843/e
[drm] PCIE gen 3 link speeds already enabled
[drm] PCIE GART of 2048M enabled (table at 0x0000000000040000).
radeon 0000:01:00.0: WB enabled
radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000100000c00 and
cpu addr 0xffff88046c66dc00
radeon 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000100000c04 and
cpu addr 0xffff88046c66dc04
radeon 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000100000c08 and
cpu addr 0xffff88046c66dc08
radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000100000c0c and
cpu addr 0xffff88046c66dc0c
radeon 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000100000c10 and
cpu addr 0xffff88046c66dc10
[drm:r600_ring_test [radeon]] *ERROR* radeon: ring 0 test failed
(scratch(0x850C)=0xCAFEDEAD)
[drm:si_resume [radeon]] *ERROR* si startup failed on resume
---

I found out with theses steps that, if I don't set dpm=0, I hit exactly the
same issue, I guess using DPM the clock is set to high when the card is used
and it crash. When the card is stuck on that loop, I need to reset the machine
(but network still works).

I'm using mesa-13.0.2 with a 4.7.6 kernel, I have the same issue using mesa
12.0.1

-- 
You are receiving this mail because:
You are the assignee for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20161210/ff19c233/attachment.html>