[Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)

bugzilla-daemon at bugzilla.kernel.org
Mon Sep 29 21:03:23 PDT 2014


https://bugzilla.kernel.org/show_bug.cgi?id=78221

Jean-Michel Smith <jean.michel.sm at gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jean.michel.sm at gmail.com

--- Comment #23 from Jean-Michel Smith <jean.michel.sm at gmail.com> ---
I've seen this bug as well, across quite a few 3.15 and 3.16 kernels.
Sometimes it just freezes X; other times it hangs the entire system.  Here is
the output from the most recent hang (this time the system didn't crash
completely, so I was able to log in remotely):

01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Curacao XT [Radeon R9 270X]

(uname -a)
Linux prime 3.16.3-gentoo #1 SMP PREEMPT Thu Sep 18 20:59:58 CDT 2014 x86_64 Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz GenuineIntel GNU/Linux

(lsmod)
cfbfillrect             3634  1 radeon
cfbimgblt               2055  1 radeon
cfbcopyarea             3110  1 radeon
i2c_algo_bit            5055  1 radeon
drm_kms_helper         33715  1 radeon
ttm                    59052  1 radeon
drm                   226864  6 ttm,drm_kms_helper,radeon
firmware_class          8187  1 radeon
radeon               1258462  3 

(relevant dmesg info)

[120499.589293] radeon 0000:01:00.0: ring 0 stalled for more than 10473msec
[120499.589296] radeon 0000:01:00.0: GPU lockup (waiting for 0x00000000000783d0 last fence id 0x00000000000783cf on ring 0)
[120499.589299] radeon 0000:01:00.0: failed to get a new IB (-35)
[120500.099613] radeon 0000:01:00.0: Saved 3600 dwords of commands on ring 0.
[120500.099743] radeon 0000:01:00.0: GPU softreset: 0x0000006C
[120500.099746] radeon 0000:01:00.0:   GRBM_STATUS               = 0xA0003028
[120500.099748] radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000006
[120500.099750] radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000006
[120500.099751] radeon 0000:01:00.0:   SRBM_STATUS               = 0x20000AC0
[120500.099862] radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
[120500.099864] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[120500.099866] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00010000
[120500.099868] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000002
[120500.099870] radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x80010243
[120500.099872] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83146
[120500.099874] radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   = 0x44E84266
[120500.099876] radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
[120500.099879] radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
[120500.592138] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x0000DDFF
[120500.592192] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00100140
[120500.593350] radeon 0000:01:00.0:   GRBM_STATUS               = 0x00003028
[120500.593352] radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000006
[120500.593354] radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000006
[120500.593356] radeon 0000:01:00.0:   SRBM_STATUS               = 0x20000AC0
[120500.593466] radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
[120500.593468] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[120500.593470] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
[120500.593472] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
[120500.593473] radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x00000000
[120500.593475] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[120500.593477] radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
[120500.593718] radeon 0000:01:00.0: GPU reset succeeded, trying to resume
[120500.621478] [drm] probing gen 2 caps for device 8086:3c04 = 7a7103/e
[120500.621482] [drm] PCIE gen 3 link speeds already enabled
[120500.623908] [drm] PCIE GART of 1024M enabled (table at 0x0000000000276000).
[120500.624051] radeon 0000:01:00.0: WB enabled
[120500.624054] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000100000c00 and cpu addr 0xffff8807fb4aac00
[120500.624056] radeon 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000100000c04 and cpu addr 0xffff8807fb4aac04
[120500.624058] radeon 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000100000c08 and cpu addr 0xffff8807fb4aac08
[120500.624059] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000100000c0c and cpu addr 0xffff8807fb4aac0c
[120500.624061] radeon 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000100000c10 and cpu addr 0xffff8807fb4aac10
[120500.624680] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000075a18 and cpu addr 0xffffc900142b5a18
[120500.789277] [drm] ring test on 0 succeeded in 3 usecs
[120500.789283] [drm] ring test on 1 succeeded in 1 usecs
[120500.789287] [drm] ring test on 2 succeeded in 1 usecs
[120500.789351] [drm] ring test on 3 succeeded in 2 usecs
[120500.789361] [drm] ring test on 4 succeeded in 1 usecs
[120500.981448] [drm] ring test on 5 succeeded in 2 usecs
[120500.981456] [drm] UVD initialized successfully.
[120510.981602] radeon 0000:01:00.0: ring 0 stalled for more than 10002msec
[120510.981604] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000078407 last fence id 0x00000000000783cf on ring 0)
[120510.981606] [drm:r600_ib_test] *ERROR* radeon: fence wait failed (-35).
[120510.981608] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on GFX ring (-35).
[120510.981609] radeon 0000:01:00.0: ib ring test failed (-35).
[120511.461309] radeon 0000:01:00.0: GPU softreset: 0x00000048
[120511.461310] radeon 0000:01:00.0:   GRBM_STATUS               = 0xA0003028
[120511.461312] radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000006
[120511.461313] radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000006
[120511.461314] radeon 0000:01:00.0:   SRBM_STATUS               = 0x200000C0
[120511.461428] radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
[120511.461429] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[120511.461431] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00010000
[120511.461432] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000002
[120511.461434] radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x80010243
[120511.461435] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[120511.461437] radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
[120511.461439] radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
[120511.461440] radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
[120511.933287] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x0000DDFF
[120511.933340] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100
[120511.934495] radeon 0000:01:00.0:   GRBM_STATUS               = 0x00003028
[120511.934496] radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000006
[120511.934498] radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000006
[120511.934499] radeon 0000:01:00.0:   SRBM_STATUS               = 0x200000C0
[120511.934609] radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
[120511.934610] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[120511.934612] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
[120511.934613] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
[120511.934614] radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x00000000
[120511.934616] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[120511.934617] radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
[120511.934857] radeon 0000:01:00.0: GPU reset succeeded, trying to resume
[120511.945176] [drm] probing gen 2 caps for device 8086:3c04 = 7a7103/e
[120511.945179] [drm] PCIE gen 3 link speeds already enabled
[120511.947127] [drm] PCIE GART of 1024M enabled (table at 0x0000000000276000).
[120511.947253] radeon 0000:01:00.0: WB enabled
[120511.947255] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000100000c00 and cpu addr 0xffff8807fb4aac00
[120511.947256] radeon 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000100000c04 and cpu addr 0xffff8807fb4aac04
[120511.947257] radeon 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000100000c08 and cpu addr 0xffff8807fb4aac08
[120511.947258] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000100000c0c and cpu addr 0xffff8807fb4aac0c
[120511.947259] radeon 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000100000c10 and cpu addr 0xffff8807fb4aac10
[120511.947868] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000075a18 and cpu addr 0xffffc900142b5a18
[120512.109348] [drm] ring test on 0 succeeded in 4 usecs
[120512.109352] [drm] ring test on 1 succeeded in 1 usecs
[120512.109355] [drm] ring test on 2 succeeded in 1 usecs
[120512.109417] [drm] ring test on 3 succeeded in 2 usecs
[120512.109426] [drm] ring test on 4 succeeded in 1 usecs
[120512.286478] [drm] ring test on 5 succeeded in 2 usecs
[120512.286483] [drm] UVD initialized successfully.
[120512.286534] [drm] ib test on ring 0 succeeded in 0 usecs
[120512.286580] [drm] ib test on ring 1 succeeded in 0 usecs
[120512.286623] [drm] ib test on ring 2 succeeded in 0 usecs
[120512.286648] [drm] ib test on ring 3 succeeded in 0 usecs
[120512.286672] [drm] ib test on ring 4 succeeded in 0 usecs
[120522.435679] radeon 0000:01:00.0: ring 5 stalled for more than 10000msec
[120522.435685] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000004 last fence id 0x0000000000000002 on ring 5)
[120522.435688] [drm:uvd_v1_0_ib_test] *ERROR* radeon: fence wait failed (-35).
[120522.435695] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on ring 5 (-35).
[120522.435730] [drm:radeon_pm_resume_dpm] *ERROR* radeon: dpm resume failed
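
For anyone comparing logs from similar hangs, here is a rough Python sketch
that pulls the "GPU lockup" lines out of a dmesg capture and reports, per
ring, how far the awaited fence is ahead of the last signalled one. It is not
part of the driver or any official tool; the names find_lockups, LOCKUP_RE and
SAMPLE are made up for illustration, and the message format is assumed from
the output above, so it may need adjusting for other kernel versions.

```python
import re

# Assumed message format, taken from the dmesg output in this comment.
LOCKUP_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+radeon\s+\S+:\s+GPU lockup \(waiting for\s+"
    r"(?P<waiting>0x[0-9a-fA-F]+)\s+last fence id\s+(?P<last>0x[0-9a-fA-F]+)\s+"
    r"on ring\s+(?P<ring>\d+)\)"
)


def find_lockups(dmesg_text):
    """Return one dict per 'GPU lockup' message found in dmesg_text."""
    # Mail clients may wrap long lines; any line that does not start with
    # "[" is treated as a continuation of the previous one.
    joined = re.sub(r"\n(?!\[)", " ", dmesg_text)
    events = []
    for m in LOCKUP_RE.finditer(joined):
        waiting = int(m.group("waiting"), 16)
        last = int(m.group("last"), 16)
        events.append({
            "timestamp": float(m.group("ts")),
            "ring": int(m.group("ring")),
            "waiting": waiting,
            "last": last,
            # Number of fences between the last signalled and the awaited one.
            "pending": waiting - last,
        })
    return events


# Two lockup messages copied from the log above, including the mail wrapping.
SAMPLE = """\
[120499.589296] radeon 0000:01:00.0: GPU lockup (waiting for 0x00000000000783d0
last fence id 0x00000000000783cf on ring 0)
[120522.435685] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000004
last fence id 0x0000000000000002 on ring 5)
"""

for event in find_lockups(SAMPLE):
    print(event["timestamp"], "ring", event["ring"], "pending", event["pending"])
```

Feeding it the saved output of dmesg (for example a file read with
open(path).read()) instead of SAMPLE should work the same way, provided the
lockup messages have the format assumed above.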


