[Bug 42678] [3.3-rc1] radeon stuck in kernel after lockup
bugzilla-daemon at bugzilla.kernel.org
Sat Feb 4 00:39:45 PST 2012
https://bugzilla.kernel.org/show_bug.cgi?id=42678
--- Comment #3 from Torsten Kaiser <just.for.lkml at googlemail.com> 2012-02-04 08:39:42 ---
The fix for the lockup itself is now in mainline and should be released in
3.3-rc3.
But I can confirm that the regression (X no longer recovers from the
GPU lockup / GPU reset) is still there in 3.3-rc2.
From my log, first the lockup:
Feb 4 08:55:25 thoregon kernel: [15457.570126] radeon 0000:07:00.0: GPU lockup CP stall for more than 10000msec
Feb 4 08:55:25 thoregon kernel: [15457.570134] GPU lockup (waiting for 0x00070CAA last fence id 0x00070CA9)
Feb 4 08:55:25 thoregon kernel: [15457.586330] radeon 0000:07:00.0: GPU softreset
Feb 4 08:55:25 thoregon kernel: [15457.586337] radeon 0000:07:00.0: R_008010_GRBM_STATUS=0xA0003028
Feb 4 08:55:25 thoregon kernel: [15457.586343] radeon 0000:07:00.0: R_008014_GRBM_STATUS2=0x00000002
Feb 4 08:55:25 thoregon kernel: [15457.586349] radeon 0000:07:00.0: R_000E50_SRBM_STATUS=0x200000C0
Feb 4 08:55:25 thoregon kernel: [15457.586362] radeon 0000:07:00.0: R_008020_GRBM_SOFT_RESET=0x00007FEE
Feb 4 08:55:25 thoregon kernel: [15457.601387] radeon 0000:07:00.0: R_008020_GRBM_SOFT_RESET=0x00000001
Feb 4 08:55:25 thoregon kernel: [15457.617378] radeon 0000:07:00.0: R_008010_GRBM_STATUS=0x00003028
Feb 4 08:55:25 thoregon kernel: [15457.617384] radeon 0000:07:00.0: R_008014_GRBM_STATUS2=0x00000002
Feb 4 08:55:25 thoregon kernel: [15457.617390] radeon 0000:07:00.0: R_000E50_SRBM_STATUS=0x200000C0
Feb 4 08:55:25 thoregon kernel: [15457.618393] radeon 0000:07:00.0: GPU reset succeed
Feb 4 08:55:25 thoregon kernel: [15457.623326] [drm] PCIE GART of 512M enabled (table at 0x0000000000040000).
Feb 4 08:55:25 thoregon kernel: [15457.623361] radeon 0000:07:00.0: WB enabled
Feb 4 08:55:25 thoregon kernel: [15457.623367] [drm] fence driver on ring 0 use gpu addr 0x20000c00 and cpu addr 0xffff880328696c00
Feb 4 08:55:25 thoregon kernel: [15457.669623] [drm] ring test on 0 succeeded in 1 usecs
Feb 4 08:55:25 thoregon kernel: [15457.669648] [drm] ib test on ring 0 succeeded in 1 usecs
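The two fence IDs in the "GPU lockup (waiting for ...)" message can be compared mechanically. The following is only a minimal sketch, assuming the message format shown in the log above; the function name fence_gap and the regular expression are illustrative and not part of the radeon driver or any tool. For this log the gap is 1, i.e. the wait was for the fence immediately after the last one that had signaled.

```python
import re

# Matches the two fence IDs in the radeon lockup message quoted above.
# The pattern is an assumption based on that single log line.
FENCE_RE = re.compile(
    r"GPU lockup \(waiting for\s+0x([0-9A-Fa-f]+)\s+last fence id\s+0x([0-9A-Fa-f]+)\)"
)

def fence_gap(log_text):
    """Return (waited_for, last_signaled, gap), or None if no lockup line is found."""
    # Collapse whitespace so a message wrapped across lines still matches.
    flat = " ".join(log_text.split())
    m = FENCE_RE.search(flat)
    if m is None:
        return None
    waited = int(m.group(1), 16)
    last = int(m.group(2), 16)
    return waited, last, waited - last

sample = """Feb 4 08:55:25 thoregon kernel: [15457.570134] GPU lockup (waiting for
0x00070CAA last fence id 0x00070CA9)"""
print(fence_gap(sample))
```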
Then, when the X server tries to unblank the screens, it gets stuck. There is no
longer a mutex deadlock for the hung task detector to log, but SysRq+W shows
X in D state:
Feb 4 09:28:30 thoregon kernel: [17441.917129] SysRq : Changing Loglevel
Feb 4 09:28:30 thoregon kernel: [17441.917140] Loglevel set to 6
Feb 4 09:28:31 thoregon kernel: [17443.659030] SysRq : Show Blocked State
Feb 4 09:28:31 thoregon kernel: [17443.659040] task PC stack pid father
Feb 4 09:28:31 thoregon kernel: [17443.659122] X D ffff880337d50a00 0 3048 3027 0x00400004
Feb 4 09:28:31 thoregon kernel: [17443.659133] ffff880328709700 0000000000000082 ffff8802f2dc5c00 0000000000010a00
Feb 4 09:28:31 thoregon kernel: [17443.659143] ffff88031bf2bfd8 0000000000010a00 ffff88031bf2a000 ffff88031bf2bfd8
Feb 4 09:28:31 thoregon kernel: [17443.659152] 0000000000010a00 ffff880328709700 0000000000010a00 0000000000010a00
Feb 4 09:28:31 thoregon kernel: [17443.659161] Call Trace:
Feb 4 09:28:31 thoregon kernel: [17443.659177] [<ffffffff815ee9d7>] ? schedule_timeout+0x157/0x220
Feb 4 09:28:31 thoregon kernel: [17443.659188] [<ffffffff8103fcb0>] ? run_timer_softirq+0x240/0x240
Feb 4 09:28:31 thoregon kernel: [17443.659197] [<ffffffff8133ee39>] ? radeon_fence_wait+0x239/0x3b0
Feb 4 09:28:31 thoregon kernel: [17443.659207] [<ffffffff8104f420>] ? wake_up_bit+0x40/0x40
Feb 4 09:28:31 thoregon kernel: [17443.659215] [<ffffffff81352f77>] ? radeon_ib_get+0x257/0x2e0
Feb 4 09:28:31 thoregon kernel: [17443.659224] [<ffffffff81354f4a>] ? radeon_cs_ioctl+0x27a/0x4d0
Feb 4 09:28:31 thoregon kernel: [17443.659232] [<ffffffff812f4184>] ? drm_ioctl+0x3e4/0x490
Feb 4 09:28:31 thoregon kernel: [17443.659240] [<ffffffff81354cd0>] ? radeon_cs_finish_pages+0xa0/0xa0
Feb 4 09:28:31 thoregon kernel: [17443.659249] [<ffffffff810247e9>] ? do_page_fault+0x199/0x420
Feb 4 09:28:31 thoregon kernel: [17443.659257] [<ffffffff810af4dc>] ? mmap_region+0x1dc/0x570
Feb 4 09:28:31 thoregon kernel: [17443.659265] [<ffffffff810de636>] ? do_vfs_ioctl+0x96/0x4e0
Feb 4 09:28:31 thoregon kernel: [17443.659273] [<ffffffff810deac9>] ? sys_ioctl+0x49/0x90
Feb 4 09:28:31 thoregon kernel: [17443.659281] [<ffffffff815f18e2>] ? system_call_fastpath+0x16/0x1b
Feb 4 09:28:41 thoregon kernel: [17453.327296] SysRq : Emergency Sync
Feb 4 09:28:41 thoregon kernel: [17453.327912] Emergency Sync complete
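When the keyboard is not usable, the same blocked-task dump as SysRq+W can be requested over ssh by writing "w" to /proc/sysrq-trigger as root. As a lighter check, D-state processes can also be listed from /proc directly. The following is only a sketch under stated assumptions: the helper names parse_stat and blocked_tasks are made up for illustration, it scans processes only (not individual threads), and it does not produce a kernel call trace like the one above.

```python
import os

def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of one /proc/<pid>/stat file."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lparen = stat_line.index("(")
    rparen = stat_line.rindex(")")
    pid = int(stat_line[:lparen])
    comm = stat_line[lparen + 1:rparen]
    state = stat_line[rparen + 1:].split()[0]
    return pid, comm, state

def blocked_tasks(proc_root="/proc"):
    """Return (pid, comm) for every process currently in D (uninterruptible) state."""
    result = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or entry unreadable while scanning
        if state == "D":
            result.append((pid, comm))
    return result
```

Calling blocked_tasks() on the affected machine would be expected to include the stuck X server (pid 3048 in the dump above) for as long as it stays blocked.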
Apart from the X server the system was still working. I was able to ssh into it
and do a normal shutdown.