<html>
<head>
<base href="https://bugs.freedesktop.org/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Priority</th>
<td>medium
</td>
</tr>
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - [r600][RV635] GPU lockup CP stall / GPU resets over and over - Kernel 3.7, 3.8-rcX"
href="https://bugs.freedesktop.org/show_bug.cgi?id=59649">59649</a>
</td>
</tr>
<tr>
<th>Assignee</th>
<td>dri-devel@lists.freedesktop.org
</td>
</tr>
<tr>
<th>Summary</th>
<td>[r600][RV635] GPU lockup CP stall / GPU resets over and over - Kernel 3.7, 3.8-rcX
</td>
</tr>
<tr>
<th>Severity</th>
<td>major
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux (All)
</td>
</tr>
<tr>
<th>Reporter</th>
<td>shawn.starr@rogers.com
</td>
</tr>
<tr>
<th>Hardware</th>
<td>x86-64 (AMD64)
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Version</th>
<td>9.0
</td>
</tr>
<tr>
<th>Component</th>
<td>Drivers/Gallium/r600
</td>
</tr>
<tr>
<th>Product</th>
<td>Mesa
</td>
</tr></table>
<p>
<div>
<pre>With Linux kernel 3.7 and up through 3.8-rc3, I am unable to keep a stable session
with my RV635 GPU:
Jan 19 03:45:26 segfault kernel: [15008.313696] radeon 0000:01:00.0: Saved 185
dwords of commands on ring 0.
Jan 19 03:45:26 segfault kernel: [15008.313704] radeon 0000:01:00.0: GPU
softreset
Jan 19 03:45:26 segfault kernel: [15008.313711] radeon 0000:01:00.0:
R_008010_GRBM_STATUS=0xA0003030
Jan 19 03:45:26 segfault kernel: [15008.313717] radeon 0000:01:00.0:
R_008014_GRBM_STATUS2=0x00000003
Jan 19 03:45:26 segfault kernel: [15008.313723] radeon 0000:01:00.0:
R_000E50_SRBM_STATUS=0x200000C0
Jan 19 03:45:26 segfault kernel: [15008.313730] radeon 0000:01:00.0:
R_008674_CP_STALLED_STAT1 = 0x00000000
Jan 19 03:45:26 segfault kernel: [15008.313736] radeon 0000:01:00.0:
R_008678_CP_STALLED_STAT2 = 0x00000000
Jan 19 03:45:26 segfault kernel: [15008.313742] radeon 0000:01:00.0:
R_00867C_CP_BUSY_STAT = 0x00000006
Jan 19 03:45:26 segfault kernel: [15008.313748] radeon 0000:01:00.0:
R_008680_CP_STAT = 0x80000645
Jan 19 03:45:26 segfault kernel: [15008.313761] radeon 0000:01:00.0:
R_008020_GRBM_SOFT_RESET=0x00007FEE
Jan 19 03:45:26 segfault kernel: [15008.328772] radeon 0000:01:00.0:
R_008020_GRBM_SOFT_RESET=0x00000001
Jan 19 03:45:26 segfault kernel: [15008.344782] radeon 0000:01:00.0:
R_008010_GRBM_STATUS=0xA0003030
Jan 19 03:45:26 segfault kernel: [15008.344785] radeon 0000:01:00.0:
R_008014_GRBM_STATUS2=0x00000003
Jan 19 03:45:26 segfault kernel: [15008.344787] radeon 0000:01:00.0:
R_000E50_SRBM_STATUS=0x200080C0
Jan 19 03:45:26 segfault kernel: [15008.344789] radeon 0000:01:00.0:
R_008674_CP_STALLED_STAT1 = 0x00000000
Jan 19 03:45:26 segfault kernel: [15008.344792] radeon 0000:01:00.0:
R_008678_CP_STALLED_STAT2 = 0x00000000
Jan 19 03:45:26 segfault kernel: [15008.344794] radeon 0000:01:00.0:
R_00867C_CP_BUSY_STAT = 0x00000000
Jan 19 03:45:26 segfault kernel: [15008.344797] radeon 0000:01:00.0:
R_008680_CP_STAT = 0x80100000
Jan 19 03:45:26 segfault kernel: [15008.345799] radeon 0000:01:00.0: GPU reset
succeeded, trying to resume
Jan 19 03:45:26 segfault kernel: [15008.348414] [drm] probing gen 2 caps for
device 8086:2a41 = 1/0
Jan 19 03:45:26 segfault kernel: [15008.350360] [drm] PCIE GART of 512M enabled
(table at 0x0000000000040000).
Jan 19 03:45:26 segfault kernel: [15008.350399] radeon 0000:01:00.0: WB enabled
Jan 19 03:45:26 segfault kernel: [15008.350403] radeon 0000:01:00.0: fence
driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr
0xffff880229236c00
Jan 19 03:45:26 segfault kernel: [15008.381778] [drm] ring test on 0 succeeded
in 1 usecs
Jan 19 03:45:26 segfault kernel: [15008.384549] [drm] ib test on ring 0
succeeded in 0 usecs
Jan 19 03:46:12 segfault kernel: [15053.625108] radeon 0000:01:00.0: GPU lockup
CP stall for more than 10000msec
...
Jan 19 03:46:12 segfault kernel: [15053.975428] radeon 0000:01:00.0: Wait for
MC idle timedout !
Jan 19 03:46:12 segfault kernel: [15054.123890] radeon 0000:01:00.0: Wait for
MC idle timedout !
Jan 19 03:46:12 segfault kernel: [15054.125748] [drm] PCIE GART of 512M enabled
(table at 0x0000000000040000).
Jan 19 03:46:12 segfault kernel: [15054.125785] radeon 0000:01:00.0: WB enabled
Jan 19 03:46:12 segfault kernel: [15054.125789] radeon 0000:01:00.0: fence
driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr
0xffff880229236c00
Jan 19 03:46:12 segfault kernel: [15054.157608] [drm] ring test on 0 succeeded
in 0 usecs
Jan 19 03:46:23 segfault kernel: [15064.657103] radeon 0000:01:00.0: GPU lockup
CP stall for more than 10000msec
Jan 19 03:46:23 segfault kernel: [15064.657114] radeon 0000:01:00.0: GPU lockup
(waiting for 0x00000000000441b6 last fence id 0x00000000000441a8)
Jan 19 03:46:23 segfault kernel: [15064.657121] [drm:r600_ib_test] *ERROR*
radeon: fence wait failed (-35).
Jan 19 03:46:23 segfault kernel: [15064.657134] [drm:radeon_ib_ring_tests]
*ERROR* radeon: failed testing IB on GFX ring (-35).
Jan 19 03:46:23 segfault kernel: [15064.657140] radeon 0000:01:00.0: ib ring
test failed (-35).
Jan 19 03:46:23 segfault kernel: [15064.658211] radeon 0000:01:00.0: GPU
softreset
Jan 19 03:46:23 segfault kernel: [15064.658218] radeon 0000:01:00.0:
R_008010_GRBM_STATUS=0xE57C24E0
Jan 19 03:46:23 segfault kernel: [15064.658224] radeon 0000:01:00.0:
R_008014_GRBM_STATUS2=0x00113303
Jan 19 03:46:23 segfault kernel: [15064.658230] radeon 0000:01:00.0:
R_000E50_SRBM_STATUS=0x200030C0
Jan 19 03:46:23 segfault kernel: [15064.658236] radeon 0000:01:00.0:
R_008674_CP_STALLED_STAT1 = 0x01000000
Jan 19 03:46:23 segfault kernel: [15064.658242] radeon 0000:01:00.0:
R_008678_CP_STALLED_STAT2 = 0x00001002
Jan 19 03:46:23 segfault kernel: [15064.658248] radeon 0000:01:00.0:
R_00867C_CP_BUSY_STAT = 0x00028482
Jan 19 03:46:23 segfault kernel: [15064.658254] radeon 0000:01:00.0:
R_008680_CP_STAT = 0x80838645
Jan 19 03:46:23 segfault kernel: [15064.829116] radeon 0000:01:00.0: Wait for
MC idle timedout !
Jan 19 03:46:23 segfault kernel: [15064.829123] radeon 0000:01:00.0:
R_008020_GRBM_SOFT_RESET=0x00007FEE
Jan 19 03:46:23 segfault kernel: [15064.844133] radeon 0000:01:00.0:
R_008020_GRBM_SOFT_RESET=0x00000001
Jan 19 03:46:23 segfault kernel: [15064.860144] radeon 0000:01:00.0:
R_008010_GRBM_STATUS=0xA0003030
Jan 19 03:46:23 segfault kernel: [15064.860150] radeon 0000:01:00.0:
R_008014_GRBM_STATUS2=0x00000003
Jan 19 03:46:23 segfault kernel: [15064.860163] radeon 0000:01:00.0:
R_000E50_SRBM_STATUS=0x2000B0C0
Jan 19 03:46:23 segfault kernel: [15064.860169] radeon 0000:01:00.0:
R_008674_CP_STALLED_STAT1 = 0x00000000
Jan 19 03:46:23 segfault kernel: [15064.860175] radeon 0000:01:00.0:
R_008678_CP_STALLED_STAT2 = 0x00000000
Jan 19 03:46:23 segfault kernel: [15064.860181] radeon 0000:01:00.0:
R_00867C_CP_BUSY_STAT = 0x00000000
Jan 19 03:46:23 segfault kernel: [15064.860191] radeon 0000:01:00.0:
R_008680_CP_STAT = 0x80100000
Jan 19 03:46:23 segfault kernel: [15064.861197] radeon 0000:01:00.0: GPU reset
succeeded, trying to resume
Jan 19 04:39:23 segfault kernel: [ 2791.671107] [drm:r600_ib_test] *ERROR*
radeon: fence wait failed (-35).
Jan 19 04:39:23 segfault kernel: [ 2791.671115] [drm:radeon_ib_ring_tests]
*ERROR* radeon: failed testing IB on GFX ring (-35).
The kernel then floods the console with the following messages, over and over:
[drm:radeon_cs_ib_chunk] *ERROR* Failed to schedule IB !
radeon 0000:01:00.0: couldn't schedule ib
mesa-dri-drivers-9.0.1-3.fc18.x86_64
libdrm-2.4.40-1.fc18.x86_64
kernels: kernel-3.7.3-201.fc18.x86_64,
kernel-devel-3.8.0-0.rc3.git1.2.fc19.x86_64
I have not tried 3.8-rc4 yet.
Laptop: Lenovo ThinkPad W500</pre>
</div>
</p>
</body>
</html>