GPU lockup CP stall for more than 10000msec on latest vanilla git

Markus Trippelsdorf markus at trippelsdorf.de
Mon Dec 17 14:55:34 PST 2012


On 2012.12.17 at 23:25 +0100, Markus Trippelsdorf wrote:
> On 2012.12.17 at 17:00 -0500, Alex Deucher wrote:
> > On Mon, Dec 17, 2012 at 4:48 PM, Markus Trippelsdorf
> > <markus at trippelsdorf.de> wrote:
> > > On 2012.12.17 at 16:32 -0500, Alex Deucher wrote:
> > >> On Mon, Dec 17, 2012 at 1:27 PM, Markus Trippelsdorf
> > >> <markus at trippelsdorf.de> wrote:
> > >> > As soon as I open the following website:
> > >> > http://www.boston.com/bigpicture/2012/12/2012_year_in_pictures_part_i.html
> > >> >
> > >> > my Radeon RS780 stalls (GPU lockup) leaving the machine unusable:
> > >>
> > >> Is this a regression?  Most likely a 3D driver bug unless you are only
> > >> seeing it with specific kernels.  What browser are you using and do
> > >> you have hw accelerated webgl, etc. enabled?  If so, what version of
> > >> mesa are you using?
> > >
> > > This is a regression, because it is caused by yesterday's merge of
> > > drm-next by Linus. IOW I only see this bug when running a
> > > v3.7-9432-g9360b53 kernel.
> > 
> > Can you bisect?  I'm guessing it may be related to the new DMA rings.  Possibly:
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=2d6cc7296d4ee128ab0fa3b715f0afde511f49c2
> 
> Yes, the commit above causes the issue. 
> 
>  2d6cc72  GPU lockups

With 2d6cc72 reverted I get:

Dec 17 23:09:35 x4 kernel: ------------[ cut here ]------------
Dec 17 23:09:35 x4 kernel: WARNING: at include/linux/kref.h:42 radeon_fence_ref+0x2c/0x40()
Dec 17 23:09:35 x4 kernel: Hardware name: System Product Name
Dec 17 23:09:35 x4 kernel: Pid: 182, comm: X Not tainted 3.7.0-09433-ge033059 #155
Dec 17 23:09:35 x4 kernel: Call Trace:
Dec 17 23:09:35 x4 kernel: [<ffffffff81059c94>] ? warn_slowpath_common+0x74/0xb0
Dec 17 23:09:35 x4 kernel: [<ffffffff8129de0c>] ? radeon_fence_ref+0x2c/0x40
Dec 17 23:09:35 x4 kernel: [<ffffffff8126a02c>] ? ttm_bo_cleanup_refs_and_unlock+0x17c/0x2c0
Dec 17 23:09:35 x4 kernel: [<ffffffff8126a6f4>] ? ttm_mem_evict_first+0x94/0x1d0
Dec 17 23:09:35 x4 kernel: [<ffffffff8126f9c2>] ? ttm_bo_man_get_node+0x62/0xb0
Dec 17 23:09:35 x4 kernel: [<ffffffff8126aaa1>] ? ttm_bo_mem_space+0x271/0x320
Dec 17 23:09:35 x4 kernel: [<ffffffff8126b0bd>] ? ttm_bo_move_buffer+0xdd/0x150
Dec 17 23:09:35 x4 kernel: [<ffffffff8126b1b9>] ? ttm_bo_validate+0x89/0xf0
Dec 17 23:09:35 x4 kernel: [<ffffffff8126b509>] ? ttm_bo_init+0x2e9/0x3a0
Dec 17 23:09:35 x4 kernel: [<ffffffff8129f84a>] ? radeon_bo_create+0x18a/0x200
Dec 17 23:09:35 x4 kernel: [<ffffffff8129f510>] ? radeon_bo_clear_va+0x40/0x40
Dec 17 23:09:35 x4 kernel: [<ffffffff812b0d42>] ? radeon_gem_object_create+0x92/0x160
Dec 17 23:09:35 x4 kernel: [<ffffffff812b113c>] ? radeon_gem_create_ioctl+0x6c/0x150
Dec 17 23:09:35 x4 kernel: [<ffffffff81252250>] ? drm_ioctl+0x420/0x4f0
Dec 17 23:09:35 x4 kernel: [<ffffffff812b10d0>] ? radeon_gem_pwrite_ioctl+0x20/0x20
Dec 17 23:09:35 x4 kernel: [<ffffffff810521a9>] ? __do_page_fault+0x1a9/0x490
Dec 17 23:09:35 x4 kernel: [<ffffffff810d1ac9>] ? mmap_region+0x169/0x560
Dec 17 23:09:35 x4 kernel: [<ffffffff810f7f84>] ? do_vfs_ioctl+0x2e4/0x4e0
Dec 17 23:09:35 x4 kernel: [<ffffffff810c0e19>] ? vm_mmap_pgoff+0x69/0x80
Dec 17 23:09:35 x4 kernel: [<ffffffff810f81cc>] ? sys_ioctl+0x4c/0xa0
Dec 17 23:09:35 x4 kernel: [<ffffffff814c2a12>] ? system_call_fastpath+0x16/0x1b
Dec 17 23:09:35 x4 kernel: ---[ end trace eb6036661a77c177 ]---
Dec 17 23:09:35 x4 kernel: BUG: unable to handle kernel paging request at ffff8803d9ee4bd8
Dec 17 23:09:35 x4 kernel: IP: [<ffffffff8129d395>] radeon_fence_wait_seq+0x85/0x440
Dec 17 23:09:35 x4 kernel: PGD 180c063 PUD 0
Dec 17 23:09:35 x4 kernel: Oops: 0000 [#1] SMP
Dec 17 23:09:35 x4 kernel: CPU 3
Dec 17 23:09:35 x4 kernel: Pid: 182, comm: X Tainted: G        W    3.7.0-09433-ge033059 #155 System manufacturer System Product Name/M4A78T-E
Dec 17 23:09:35 x4 kernel: RIP: 0010:[<ffffffff8129d395>]  [<ffffffff8129d395>] radeon_fence_wait_seq+0x85/0x440
Dec 17 23:09:35 x4 kernel: RSP: 0018:ffff880210cc7a38  EFLAGS: 00010282
Dec 17 23:09:35 x4 kernel: RAX: ffff880210cc7a90 RBX: ffff88020674c970 RCX: 0000000000000001
Dec 17 23:09:35 x4 kernel: RDX: 000000000605b580 RSI: 0000000000000058 RDI: ffff8801c7f7dc80
Dec 17 23:09:35 x4 kernel: RBP: ffff8803d9ee4bd8 R08: 0000000000000001 R09: 00000000000002a9
Dec 17 23:09:35 x4 kernel: R10: 00000000000002a8 R11: 0000000000000006 R12: ffff880210ee6981
Dec 17 23:09:35 x4 kernel: R13: 000000000605b580 R14: ffff8801c7f7dc80 R15: ffff8802161864f8
Dec 17 23:09:35 x4 kernel: FS:  00007f5ee88f4880(0000) GS:ffff88021fd80000(0000) knlGS:0000000000000000
Dec 17 23:09:35 x4 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 17 23:09:35 x4 kernel: CR2: ffff8803d9ee4bd8 CR3: 0000000210c63000 CR4: 00000000000007e0
Dec 17 23:09:35 x4 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 17 23:09:35 x4 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 17 23:09:35 x4 kernel: Process X (pid: 182, threadinfo ffff880210cc6000, task ffff880215f45730)
Dec 17 23:09:35 x4 kernel: Stack:
Dec 17 23:09:35 x4 kernel: ffffffff8129de0c 000000000605b580 ffff8803d9ee4080 0000000000000010
Dec 17 23:09:35 x4 kernel: ffff880210cc7aa8 ffff880201cc7a68 ffff880210cc7a90 000000010177c177
Dec 17 23:09:35 x4 kernel: 00000000000000c7 0000000000000001 ffff88020674c890 0000000000000286
Dec 17 23:09:35 x4 kernel: Call Trace:
Dec 17 23:09:35 x4 kernel: [<ffffffff8129de0c>] ? radeon_fence_ref+0x2c/0x40
Dec 17 23:09:35 x4 kernel: [<ffffffff8129dc32>] ? radeon_fence_wait+0x22/0x60
Dec 17 23:09:35 x4 kernel: [<ffffffff8126a06d>] ? ttm_bo_cleanup_refs_and_unlock+0x1bd/0x2c0
Dec 17 23:09:35 x4 kernel: [<ffffffff8126a6f4>] ? ttm_mem_evict_first+0x94/0x1d0
Dec 17 23:09:35 x4 kernel: [<ffffffff8126f9c2>] ? ttm_bo_man_get_node+0x62/0xb0
Dec 17 23:09:35 x4 kernel: [<ffffffff8126aaa1>] ? ttm_bo_mem_space+0x271/0x320
Dec 17 23:09:35 x4 kernel: [<ffffffff8126b0bd>] ? ttm_bo_move_buffer+0xdd/0x150
Dec 17 23:09:35 x4 kernel: [<ffffffff8126b1b9>] ? ttm_bo_validate+0x89/0xf0
Dec 17 23:09:35 x4 kernel: [<ffffffff8126b509>] ? ttm_bo_init+0x2e9/0x3a0
Dec 17 23:09:35 x4 kernel: [<ffffffff8129f84a>] ? radeon_bo_create+0x18a/0x200
Dec 17 23:09:35 x4 kernel: [<ffffffff8129f510>] ? radeon_bo_clear_va+0x40/0x40
Dec 17 23:09:35 x4 kernel: [<ffffffff812b0d42>] ? radeon_gem_object_create+0x92/0x160
Dec 17 23:09:35 x4 kernel: [<ffffffff812b113c>] ? radeon_gem_create_ioctl+0x6c/0x150
Dec 17 23:09:35 x4 kernel: [<ffffffff81252250>] ? drm_ioctl+0x420/0x4f0
Dec 17 23:09:35 x4 kernel: [<ffffffff812b10d0>] ? radeon_gem_pwrite_ioctl+0x20/0x20
Dec 17 23:09:35 x4 kernel: [<ffffffff810521a9>] ? __do_page_fault+0x1a9/0x490
Dec 17 23:09:35 x4 kernel: [<ffffffff810d1ac9>] ? mmap_region+0x169/0x560
Dec 17 23:09:35 x4 kernel: [<ffffffff810f7f84>] ? do_vfs_ioctl+0x2e4/0x4e0
Dec 17 23:09:35 x4 kernel: [<ffffffff810c0e19>] ? vm_mmap_pgoff+0x69/0x80
Dec 17 23:09:35 x4 kernel: [<ffffffff810f81cc>] ? sys_ioctl+0x4c/0xa0
Dec 17 23:09:35 x4 kernel: [<ffffffff814c2a12>] ? system_call_fastpath+0x16/0x1b
Dec 17 23:09:35 x4 kernel: Code: c4 0f 87 77 01 00 00 41 89 df bb 01 00 00 00 44 89 ee 4c 89 f7 e8 ec 5a 01 00 45 85 ff 0f 88 43 03 00 00 84 db 0f 84 57 02 00 00 <48> 8b 45 00 4c 39 e0 0f 83 19 02 00 00 48 8b 44 24 08 48 c1 e0
Dec 17 23:09:35 x4 kernel: RIP  [<ffffffff8129d395>] radeon_fence_wait_seq+0x85/0x440
Dec 17 23:09:35 x4 kernel: RSP <ffff880210cc7a38>
Dec 17 23:09:35 x4 kernel: CR2: ffff8803d9ee4bd8
Dec 17 23:09:35 x4 kernel: ---[ end trace eb6036661a77c178 ]---
Dec 17 23:09:35 x4 kernel: [drm:drm_release] *ERROR* Device busy: 1
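
A side note on reading the oops above: the faulting address in CR2 can be compared against the register dump to see which register held the bad pointer. The marked bytes in the Code: line, <48> 8b 45 00, decode to a load from 0x0(%rbp), which fits with RBP carrying the same value as CR2. The short Python sketch below does that comparison on an excerpt of the log; the function name and the regular expression are illustrative assumptions, not part of any kernel tooling.

```python
import re

# Excerpt of the oops above: general-purpose registers plus the CR2 line.
OOPS = """\
RAX: ffff880210cc7a90 RBX: ffff88020674c970 RCX: 0000000000000001
RDX: 000000000605b580 RSI: 0000000000000058 RDI: ffff8801c7f7dc80
RBP: ffff8803d9ee4bd8 R08: 0000000000000001 R09: 00000000000002a9
R10: 00000000000002a8 R11: 0000000000000006 R12: ffff880210ee6981
R13: 000000000605b580 R14: ffff8801c7f7dc80 R15: ffff8802161864f8
CR2: ffff8803d9ee4bd8 CR3: 0000000210c63000 CR4: 00000000000007e0
"""


def registers_matching_cr2(text):
    """Return the names of non-control registers whose value equals CR2."""
    # Collect every "NAME: 16-hex-digit value" pair, e.g. "RBP: ffff8803d9ee4bd8".
    regs = dict(re.findall(r"\b([A-Z][A-Z0-9]{1,2}):\s+([0-9a-f]{16})\b", text))
    fault = regs.get("CR2")
    # Skip CR2/CR3/CR4 themselves; only general-purpose registers are of interest.
    return sorted(
        name
        for name, value in regs.items()
        if value == fault and not name.startswith("CR")
    )


print(registers_matching_cr2(OOPS))
```

This only narrows down which pointer was invalid at the faulting instruction; it does not by itself say which structure in radeon_fence_wait_seq that pointer was supposed to reference.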

-- 
Markus

