Hi, <div class="gmail_quote">On 2011/10/17 at 2:34 PM, <<a href="mailto:chenhc@lemote.com">chenhc@lemote.com</a>> wrote: <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"> If I start X but switch to the console, then do suspend & resume, "GPU reset" hardly ever happens, but there is a new problem: the IRQ of the radeon card is disabled. Maybe "GPU reset" has something to do with "IRQ disabled"? I have tried "irqpoll", but it doesn't fix this problem. [ 571.914062] irq 6: nobody cared (try booting with the "irqpoll" option) [ 571.914062] Call Trace: [ 571.914062] [<ffffffff806f3248>] dump_stack+0x8/0x34 [ 571.914062] [<ffffffff8027e1e4>] __report_bad_irq.clone.6+0x44/0x15c [ 571.914062] [<ffffffff8027e584>] note_interrupt+0x204/0x2a0 [ 571.914062] [<ffffffff8027c7cc>] handle_irq_event_percpu+0x19c/0x1f8 [ 571.914062] [<ffffffff8027c890>] handle_irq_event+0x68/0xa8 [ 571.914062] [<ffffffff8027f038>] handle_level_irq+0xd8/0x13c [ 571.914062] [<ffffffff8027bec8>] generic_handle_irq+0x48/0x58 [ 571.914062] [<ffffffff80204574>] do_IRQ+0x18/0x24 [ 571.914062] [<ffffffff8020152c>] mach_irq_dispatch+0xf0/0x194 [ 571.914062] [<ffffffff80202a40>] ret_from_irq+0x0/0x4 [ 571.914062] [ 571.914062] handlers: [ 571.914062] [<ffffffff8053bba8>] radeon_driver_irq_handler_kms P.S.: I am using the latest kernel from git, and irq6 is not shared by other devices. </blockquote><div>Does fence_wait depend on the GPU's interrupt? 
If so, can I say that the "GPU lockup" is caused by the GPU's IRQ being unexpectedly disabled?</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"> > Hi Alex, Michel > > 2011/10/5 Alex Deucher <<a href="mailto:alexdeucher@gmail.com">alexdeucher@gmail.com</a>> > >> 2011/10/5 Michel Dänzer <<a href="mailto:michel@daenzer.net">michel@daenzer.net</a>>: >> > On Thu, 2011-09-29 at 17:17 +0800, Chen Jie wrote: >> >> >> >> We got occasionally "GPU lockup" after resuming from suspend(on >> mipsel >> >> platform with a mips64 compatible CPU and rs780e, the kernel is >> >> 3.1.0-rc8 64bit). Related kernel message: >> > >> > [...] >> > >> >> [ 177.085937] radeon 0000:01:05.0: GPU lockup CP stall for more than >> >> 10019msec >> >> [ 177.089843] ------------[ cut here ]------------ >> >> [ 177.097656] WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:267 >> >> radeon_fence_wait+0x25c/0x33c() >> >> [ 177.105468] GPU lockup (waiting for 0x000013C3 last fence id >> >> 0x000013AD) >> >> [ 177.113281] Modules linked in: psmouse serio_raw >> >> [ 177.117187] Call Trace: >> >> [ 177.121093] [<ffffffff806f3e7c>] dump_stack+0x8/0x34 >> >> [ 177.125000] [<ffffffff8022e4f4>] warn_slowpath_common+0x78/0xa0 >> >> [ 177.132812] [<ffffffff8022e5b8>] warn_slowpath_fmt+0x38/0x44 >> >> [ 177.136718] [<ffffffff80522ed8>] radeon_fence_wait+0x25c/0x33c >> >> [ 177.144531] [<ffffffff804e9e70>] ttm_bo_wait+0x108/0x220 >> >> [ 177.148437] [<ffffffff8053b478>] radeon_gem_wait_idle_ioctl >> >> +0x80/0x114 >> >> [ 177.156250] [<ffffffff804d2fe8>] drm_ioctl+0x2e4/0x3fc >> >> [ 177.160156] [<ffffffff805a1820>] radeon_kms_compat_ioctl+0x28/0x38 >> >> [ 177.167968] [<ffffffff80311a04>] compat_sys_ioctl+0x120/0x35c >> >> [ 177.171875] [<ffffffff80211d18>] handle_sys+0x118/0x138 >> >> [ 177.179687] ---[ end trace 92f63d998efe4c6d ]--- >> >> [ 177.187500] radeon 0000:01:05.0: GPU softreset >> >> [ 177.191406] radeon 0000:01:05.0: 
R_008010_GRBM_STATUS=0xF57C2030 >> >> [ 177.195312] radeon 0000:01:05.0: >> R_008014_GRBM_STATUS2=0x00111103 >> >> [ 177.203125] radeon 0000:01:05.0: R_000E50_SRBM_STATUS=0x20023040 >> >> [ 177.363281] radeon 0000:01:05.0: Wait for MC idle timedout ! >> > >> > [...] >> > >> >> What may cause a "GPU lockup"? >> > >> > Lots of things... The most common cause is an incorrect command stream >> > sent to the GPU by userspace or the kernel. >> > >> >> Why didn't the reset work? >> > >> > Might be related to 'Wait for MC idle timedout !', but I don't know >> > offhand what could be up with that. >> > >> > >> >> BTW, one question: >> >> I got 'RADEON_IS_PCI | RADEON_IS_IGP' in rdev->flags, which causes >> >> need_dma32 to be set. >> >> Is it correct? (drivers/char/agp is not available on mips, could that >> >> be the reason?) >> > >> > Not sure, Alex? >> >> You don't need AGP for newer IGP cards (rs4xx+). It gets set by default if >> the card is not AGP or PCIE. That should be changed as only the >> legacy r1xx PCI GART block has that limitation. I'll send a patch out >> shortly. >> >> Got it, thanks for the reply. > </blockquote></div>