[Intel-gfx] Question about how to troubleshoot sandybridge kernel oops and subsequent GPU lockup

Daniel Vetter daniel at ffwll.ch
Mon Oct 24 08:46:56 CEST 2011


On Sun, Oct 23, 2011 at 11:12:21PM -0500, James R. Leu wrote:
> I'm running wow in wine on 64 bit fedora rawhide on a dell vostro 3550
> (i5 with integrated GPU).
> 
> I'm reliably able to produce 2 types of crashes:
> - wow freezes, but I can get to text console, in this case I'm able to
>   grab a kernel stack trace  (below) prior to seeing the normal
>   [drm:i915_wait_request] *ERROR* i915_wait_request returns -11 (awaiting 452684 at 452608, next 452686)

I'm pretty sure that below that line there's a gpu hang report. If that's
the case, then please grab everything in /sys/kernel/debug/dri, put it into
a tar.gz and attach it (you need to do this _after_ the machine is hung,
because that is when the kernel writes a gpu crash dump into
i915_error_state).
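
As a rough sketch of that collection step (not an official tool): debugfs
files report a size of 0, so an archiver that trusts the reported size can
end up storing empty files. The Python below reads each file first and
stores the bytes it actually got. The function name archive_debugfs and the
output name dri-debug.tar.gz are made up for illustration; it assumes
debugfs is mounted at /sys/kernel/debug and that it is run as root.

```python
import io
import os
import tarfile
import time


def archive_debugfs(src="/sys/kernel/debug/dri", dest="dri-debug.tar.gz"):
    """Pack every readable file under src into a gzip-compressed tarball.

    Returns the list of paths that could not be read.
    """
    src = os.path.abspath(src)
    parent = os.path.dirname(src)
    skipped = []
    with tarfile.open(dest, "w:gz") as tar:
        for dirpath, _dirnames, filenames in os.walk(src):
            for name in sorted(filenames):
                path = os.path.join(dirpath, name)
                try:
                    with open(path, "rb") as fh:
                        data = fh.read()
                except OSError:
                    # Some debugfs entries are write-only or fail on read.
                    skipped.append(path)
                    continue
                # Use the length of what was read, not the stat() size,
                # because debugfs reports 0 bytes for its files.
                info = tarfile.TarInfo(os.path.relpath(path, parent))
                info.size = len(data)
                info.mtime = int(time.time())
                tar.addfile(info, io.BytesIO(data))
    return skipped
```

Calling archive_debugfs() from a text console after the hang should leave
dri-debug.tar.gz in the current directory; any paths it returns were
unreadable and are simply left out.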

The userspace parts of the i915 driver are very important for gpu hangs,
so please attach the version of mesa, libdrm and xf86-video-intel you've
installed.
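
One possible way to collect those versions on an rpm-based system such as
Fedora is sketched below. The package names in PACKAGES are assumptions
about how Fedora packages mesa, libdrm and xf86-video-intel; adjust them
for your distribution. The helper package_versions is hypothetical and only
wraps "rpm -q".

```python
import shutil
import subprocess

# Assumed Fedora package names; other distributions name these differently.
PACKAGES = ["mesa-dri-drivers", "libdrm", "xorg-x11-drv-intel"]


def package_versions(packages=PACKAGES):
    """Return {package: output of 'rpm -q package'} for each package."""
    if shutil.which("rpm") is None:
        return {pkg: "rpm not available" for pkg in packages}
    versions = {}
    for pkg in packages:
        result = subprocess.run(
            ["rpm", "-q", pkg], capture_output=True, text=True
        )
        # rpm prints either the full name-version-release string or a
        # "not installed" message; keep whichever text it produced.
        versions[pkg] = result.stdout.strip() or result.stderr.strip()
    return versions
```

Printing the returned dictionary gives one line per component that can be
pasted into the report.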

Also please attach all your i915.ko module options as listed in
/sys/module/i915/parameters
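
A small sketch for dumping those options, assuming the usual sysfs layout
of one file per parameter under that directory. The helper names
read_module_params and format_params are invented for this example.

```python
import os


def read_module_params(module="i915", sysfs_root="/sys/module"):
    """Return {parameter: value} for a loaded kernel module."""
    param_dir = os.path.join(sysfs_root, module, "parameters")
    params = {}
    for name in sorted(os.listdir(param_dir)):
        path = os.path.join(param_dir, name)
        try:
            with open(path) as fh:
                params[name] = fh.read().strip()
        except OSError:
            # Some parameters are not readable by every user.
            params[name] = "<unreadable>"
    return params


def format_params(params):
    """Render the parameters as name=value lines for a bug report."""
    return "\n".join(f"{name}={value}" for name, value in params.items())
```

print(format_params(read_module_params())) then gives a name=value listing
of whatever the running i915 module exposes.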

> - the other is a complete freeze of the system, hard reset required, nothing logged to /var/log/messages

It's rather likely that this is the same issue as above. Depending upon
exact circumstances the gpu can take down the entire system.

> Is there any value in me creating a bug report for this, it seems to be a pretty common issue.
> Is there any use in my trying different kernel command line options for
> the i915 driver or config options to the xorg intel driver?

Yes, gpu hangs are one of the more common issues, but until you've
submitted the error_state there's no way to diagnose the issue and tell
whether we have got a report already.

> I have the various git trees pulled out (I was looking for recent changes that might be related
> to this issue).  I'm capable of building and installing from these git trees if there are specific
> bits that I should test.
> 
> [  939.830806] ------------[ cut here ]------------
> [  939.830814] WARNING: at drivers/gpu/drm/i915/i915_drv.c:372 gen6_gt_force_wake_put+0x29/0x51 [i915]()
> [  939.830816] Hardware name: Vostro 3550
> [  939.830818] Modules linked in: snd_seq_dummy fuse ip6table_filter ip6_tables ebtable_nat ebtables xt_state xt_CHECKSUM iptable_mangle ppdev parport_pc lp parport vboxpci vboxnetadp vboxnetflt vboxdrv bridge stp llc tun rfcomm bnep ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 snd_hda_codec_hdmi snd_hda_codec_idt uvcvideo videodev btusb media bluetooth v4l2_compat_ioctl32 arc4 snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm iwlagn microcode mac80211 dell_laptop iTCO_wdt r8169 i2c_i801 snd_timer cfg80211 snd mii iTCO_vendor_support dcdbas dell_wmi sparse_keymap soundcore rfkill snd_page_alloc virtio_net kvm_intel kvm binfmt_misc wmi i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
> [  939.830926] Pid: 0, comm: swapper Tainted: G        WC  3.1.0-0.rc10.git0.1.fc17.x86_64 #1
> [  939.830928] Call Trace:
> [  939.830930]  <IRQ>  [<ffffffff8105c3a0>] warn_slowpath_common+0x83/0x9b
> [  939.830941]  [<ffffffff8105c3d2>] warn_slowpath_null+0x1a/0x1c
> [  939.830952]  [<ffffffffa006b624>] gen6_gt_force_wake_put+0x29/0x51 [i915]
> [  939.830963]  [<ffffffffa006f45f>] i915_read32+0x44/0x6b [i915]
> [  939.830975]  [<ffffffffa00724a9>] i915_hangcheck_elapsed+0xe8/0x1f8 [i915]
> [  939.831027]  [<ffffffff81062ddd>] irq_exit+0x5d/0xcf
> [  939.831032]  [<ffffffff8150de91>] smp_apic_timer_interrupt+0x7c/0x8a
> [  939.831036]  [<ffffffff8150bd73>] apic_timer_interrupt+0x73/0x80
> [  939.831038]  <EOI>  [<ffffffff81014ded>] ? paravirt_read_tsc+0x9/0xd
> [  939.831046]  [<ffffffff81297075>] ? intel_idle+0xe5/0x10c
> [  939.831050]  [<ffffffff81297071>] ? intel_idle+0xe1/0x10c
> [  939.831054]  [<ffffffff813e14fe>] cpuidle_idle_call+0x11c/0x1fe
> [  939.831059]  [<ffffffff8100e2ef>] cpu_idle+0xab/0x101
> [  939.831063]  [<ffffffff814df673>] rest_init+0xd7/0xde
> [  939.831067]  [<ffffffff814df59c>] ? csum_partial_copy_generic+0x16c/0x16c
> [  939.831072]  [<ffffffff81d53bb0>] start_kernel+0x3dd/0x3ea
> [  939.831076]  [<ffffffff81d532c4>] x86_64_start_reservations+0xaf/0xb3
> [  939.831081]  [<ffffffff81d53140>] ? early_idt_handlers+0x140/0x140
> [  939.831085]  [<ffffffff81d533ca>] x86_64_start_kernel+0x102/0x111
> [  939.831088] ---[ end trace f5cba358bac6b7e5 ]---

This WARN here is a possible side effect of a dying gpu. It's an
independent, but rather harmless bug. Unfortunately there's no easy
solution, hence no patch atm.

Yours, Daniel
-- 
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48


