[Intel-gfx] SNB GPU hang

Daniel Vetter daniel at ffwll.ch
Tue Nov 13 20:48:40 CET 2012


On Tue, Nov 06, 2012 at 02:11:41PM +0800, Daniel J Blueman wrote:
> On stock 3.6.5 at boot+403586.244s, I hit this:
> 
> [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
> [drm] capturing error event; look for more information in
> /debug/dri/0/i915_error_state
> 
> At boot+403596.318s, I see in Xorg.0.log:
> 
> [mi] Increasing EQ size to 512 to prevent dropped events.
> [mi] EQ processing has resumed after 269 dropped events.
> [mi] This may be caused my a misbehaving driver monopolizing the
> server's resources.
> (WW) intel(0): I830DRI2GetMSC:1361 get vblank counter failed: Invalid argument
> 
> This was preceded a while before by:
> 
> (EE) [mi] EQ overflowing.  Additional events will be discarded until
> existing events are processed.
> (EE)
> (EE) Backtrace:
> (EE) 0: /usr/bin/X (xorg_backtrace+0x36) [0x7ffa49022ac6]
> (EE) 1: /usr/bin/X (mieqEnqueue+0x26b) [0x7ffa49003eab]
> (EE) 2: /usr/bin/X (0x7ffa48e7a000+0x6a492) [0x7ffa48ee4492]
> (EE) 3: /usr/lib/xorg/modules/input/evdev_drv.so
> (0x7ffa43d7d000+0x5f34) [0x7ffa43d82f34]
> (EE) 4: /usr/bin/X (0x7ffa48e7a000+0x936c7) [0x7ffa48f0d6c7]
> (EE) 5: /usr/bin/X (0x7ffa48e7a000+0xbce38) [0x7ffa48f36e38]
> (EE) 6: /lib/x86_64-linux-gnu/libpthread.so.0 (0x7ffa481a0000+0xfcb0)
> [0x7ffa481afcb0]
> (EE) 7: /lib/x86_64-linux-gnu/libc.so.6 (ioctl+0x7) [0x7ffa46ef0527]
> (EE) 8: /usr/lib/x86_64-linux-gnu/libdrm.so.2 (drmIoctl+0x28) [0x7ffa47f97328]
> (EE) 9: /usr/lib/x86_64-linux-gnu/libdrm_intel.so.1
> (0x7ffa45970000+0x6af0) [0x7ffa45976af0]
> (EE) 10: /usr/lib/x86_64-linux-gnu/libdrm_intel.so.1
> (0x7ffa45970000+0x72be) [0x7ffa459772be]
> (EE) 11: /usr/lib/xorg/modules/drivers/intel_drv.so
> (0x7ffa45b90000+0x18194) [0x7ffa45ba8194]
> (EE) 12: /usr/bin/X (0x7ffa48e7a000+0x5130f) [0x7ffa48ecb30f]
> (EE) 13: /usr/bin/X (0x7ffa48e7a000+0x55a51) [0x7ffa48ecfa51]
> (EE) 14: /usr/bin/X (0x7ffa48e7a000+0x4456a) [0x7ffa48ebe56a]
> (EE) 15: /lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main+0xed)
> [0x7ffa46e2576d]
> (EE) 16: /usr/bin/X (0x7ffa48e7a000+0x448ad) [0x7ffa48ebe8ad]
> (EE)
> (EE) [mi] These backtraces from mieqEnqueue may point to a culprit
> higher up the stack.
> (EE) [mi] mieq is *NOT* the cause.  It is a victim.
> (EE) [mi] EQ overflow continuing.  100 events have been dropped.
> 
> Have you guys seen this before? Hardware is SNB; just in GNOME3 UI
> with terminals, text editors, PDFs, mail and firefox.
> 
> Full logs are at:
> http://quora.org/2012/i915/i915_error_state
> http://quora.org/2012/i915/dmesg
> http://quora.org/2012/i915/Xorg.0.log
> 
> Box is still booted, but X restarted in case further detail in
> /sys/kernel/debug/i915 is useful...

Sorry for the long delay. Your error_state is very peculiar: somehow the
mbox registers failed to update properly. We've just got an update
from the hw team that we should do something slightly different there.

To avoid losing track of this again, can you please file a bug on
bugs.freedesktop.org against DRI -> DRM/Intel and attach the above
error_state? I should get around to writing a test patch in the next few
days.

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


