[Intel-gfx] Xorg deadlocking eventd

Bruno Prémont bonbons at linux-vserver.org
Sun Jul 19 22:08:07 CEST 2009


With KMS working as of comment #50 of bug 20115 (KMS fails to completely
configure 855 chip, http://bugs.freedesktop.org/show_bug.cgi?id=20115),
I end up with:
[    0.537796] [drm] Initialized drm 1.1.0 20060810
[    0.537892] i915 0000:00:02.0: power state changed by ACPI to D0
[    0.538218] ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 11
[    0.538275] PCI: setting IRQ 11 as level-triggered
[    0.538281] i915 0000:00:02.0: PCI INT A -> Link[LNKA] -> GSI 11 (level, low) -> IRQ 11
[    0.538358] i915 0000:00:02.0: setting latency timer to 64
[    0.982383] [drm] DAC-6: set mode 640x480 0
[    1.445396] async/0 used greatest stack depth: 2544 bytes left
[    1.572312] fbcon: inteldrmfb (fb0) is primary device
[    1.617325] i915 KMS: p2 is 7, p1 is 2, forcing to p2=7, p1=2...
[    1.655329] render error detected, EIR: 0x00000010
[    1.655331] page table error
[    1.655333]   PGTBL_ER: 0x00000049
[    1.655336] [drm:i915_driver_irq_handler] *ERROR* EIR stuck: 0x00000010, masking
[    1.655340] render error detected, EIR: 0x00000010
[    1.655342] page table error
[    1.655343]   PGTBL_ER: 0x00000049
[    1.657791] [drm] LVDS-8: set mode 1400x1050 10
[    1.712381] Console: switching to colour frame buffer device 175x65
[    1.723642] [drm] fb0: inteldrmfb frame buffer device
[    1.723715] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
..
[11419.582605] Xorg          S b7828000  1708  7844   2532 0x00400000
[11419.582628]  da6d5dcc 00003086 dd91a7c0 b7828000 00004000 dd91a948 dc09a700 dc042bb0
[11419.582685]  dc09a700 00000000 00003246 00000000 dd826000 dd958800 da6d5e14 c1195095
[11419.582755]  00004000 00000004 da6d5e08 c10659f2 00000001 080000fb 0000001e dd8260a0
[11419.582826] Call Trace:
[11419.582833]  [<c1195095>] i915_wait_request+0x175/0x220
[11419.582856]  [<c10659f2>] ? do_mmap_pgoff+0x202/0x340
[11419.582876]  [<c1035370>] ? autoremove_wake_function+0x0/0x50
[11419.582896]  [<c1195186>] i915_gem_throttle_ioctl+0x46/0x70
[11419.582902]  [<c117d9c4>] drm_ioctl+0x164/0x380
[11419.582907]  [<c1195140>] ? i915_gem_throttle_ioctl+0x0/0x70
[11419.582927]  [<c104e28f>] ? find_get_page+0x1f/0x70
[11419.582947]  [<c104fd28>] ? filemap_fault+0x88/0x360
[11419.582952]  [<c117d860>] ? drm_ioctl+0x0/0x380
[11419.582956]  [<c108044e>] vfs_ioctl+0x6e/0x80
[11419.582976]  [<c10809ea>] do_vfs_ioctl+0x6a/0x5c0
[11419.582996]  [<c10619d6>] ? handle_mm_fault+0x1a6/0x4e0
[11419.583001]  [<c1080f79>] sys_ioctl+0x39/0x70
[11419.583006]  [<c1002e08>] sysenter_do_call+0x12/0x26
..
[11419.563311] events/0      D c13c2fa0  2648     4      2 0x00000000
[11419.563349]  dd82bf34 00000046 dd82c9f0 c13c2fa0 c13c2ff8 dd82cb78 dd9ad75c c13c2fdc
[11419.563392]  da76c380 dd82bf34 c101f6ae dd958810 ffffffff dd82c9f0 dd82bf58 c12b59bb
[11419.563449]  dd958814 dd958814 dd958814 dd82c9f0 dd958810 dd826cd4 dd826000 dd82bf68
[11419.563535] Call Trace:
[11419.563541]  [<c101f6ae>] ? set_next_entity+0x2e/0x70
[11419.563562]  [<c12b59bb>] __mutex_lock_slowpath+0x5b/0x90
[11419.563582]  [<c12b5869>] mutex_lock+0x19/0x20
[11419.563604]  [<c1194ed8>] i915_gem_retire_work_handler+0x28/0x70
[11419.563609]  [<c1194eb0>] ? i915_gem_retire_work_handler+0x0/0x70
[11419.563615]  [<c1031dfe>] worker_thread+0xde/0x190
[11419.563634]  [<c12b4e53>] ? schedule+0x203/0x340
[11419.563640]  [<c1035370>] ? autoremove_wake_function+0x0/0x50
[11419.563671]  [<c1031d20>] ? worker_thread+0x0/0x190
[11419.563676]  [<c1034fa4>] kthread+0x74/0x80
[11419.563681]  [<c1034f30>] ? kthread+0x0/0x80
[11419.563685]  [<c1003673>] kernel_thread_helper+0x7/0x14

This happens every time I start Xorg, right during its startup.
In addition, when I kill Xorg (kill -KILL) in this state, KMS does not
restore the kernel console. It does reset the GPU, or at least reprograms
the mode, as the display goes off briefly, but I then have to type blind
(I only see the _ cursor).

I'm running Gentoo userspace with:
 x11-drivers/xf86-video-intel-2.7.1
 x11-base/xorg-server-1.6.2-r1
 x11-libs/libdrm-2.4.11
 media-libs/mesa-7.4.4

Kernel is between 2.6.31-rc2 and 2.6.31 (at commit
7638d5322bd89d49e013a03fe2afaeb6d214fabd, Merge branch 'kmemleak' of)
with my patch from bug 20115, comment #50.

Any idea how to find out what is happening here? Interacting with the
machine via ssh is a pain: existing sessions appear frozen and new
connections freeze before reaching a bash prompt (it looks like eventd
has a say in how /dev/pts/* is handled).

It would be great if i915_gem_retire_work_handler() could time out while
acquiring the mutex and, after printk'ing a warning, retry the wait in a
separate thread, so that eventd is not blocked for too long.

Fixing whatever keeps the mutex held would be better, but this timeout
and retry would at least keep the system more or less working.
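
To make the timeout-and-retry idea concrete, here is a minimal userspace
sketch using POSIX threads. It is only an analogy, not i915 code: gem_mutex,
retire_work_once() and retire_work_with_retry() are hypothetical names made
up for illustration, and the "work" is just a counter. In the kernel the
corresponding pieces would presumably be mutex_trylock() plus re-queueing
the delayed work, but that is an assumption on my side, not a tested patch.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical stand-in for the mutex the retire handler blocks on. */
static pthread_mutex_t gem_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Counters so the behaviour can be observed from outside. */
static int retire_passes;   /* passes that actually did the work */
static int deferred_passes; /* passes that backed off */

enum retire_status { RETIRE_DONE, RETIRE_DEFERRED };

/*
 * One non-blocking pass of a retire-style handler.
 * Instead of sleeping on the mutex (and stalling the shared worker),
 * it tries the lock once, warns if it is busy, and lets the caller
 * decide when to retry.
 */
static enum retire_status retire_work_once(void)
{
	int err = pthread_mutex_trylock(&gem_mutex);

	if (err == EBUSY) {
		deferred_passes++;
		fprintf(stderr, "retire: mutex busy, deferring (%d so far)\n",
			deferred_passes);
		return RETIRE_DEFERRED;
	}
	assert(err == 0);

	retire_passes++; /* placeholder for the real retire work */

	err = pthread_mutex_unlock(&gem_mutex);
	assert(err == 0);
	return RETIRE_DONE;
}

/*
 * Retry the pass up to max_attempts times.
 * Returns the attempt number that succeeded, or -1 if every attempt
 * found the mutex busy. A real handler would wait between attempts
 * rather than retrying immediately.
 */
static int retire_work_with_retry(int max_attempts)
{
	for (int attempt = 1; attempt <= max_attempts; attempt++) {
		if (retire_work_once() == RETIRE_DONE)
			return attempt;
	}
	return -1;
}
```

The point of the sketch is only that the handler returns immediately when
the lock is taken, so whatever thread runs it stays free for other work;
it does nothing about the underlying reason the mutex stays held.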

Bruno


