[Bug 67931] New: [Bisected]xinit cases call trace and system hang

Thu Aug 8 20:07:12 PDT 2013

https://bugs.freedesktop.org/show_bug.cgi?id=67931

          Priority: high
            Bug ID: 67931
          Assignee: intel-gfx-bugs at lists.freedesktop.org
           Summary: [Bisected]xinit cases call trace and system hang
        QA Contact: intel-gfx-bugs at lists.freedesktop.org
          Severity: major
    Classification: Unclassified
                OS: Linux (All)
          Reporter: huax.lu at intel.com
          Hardware: All
            Status: NEW
           Version: unspecified
         Component: DRM/Intel
           Product: DRI

Created attachment 83871
  --> https://bugs.freedesktop.org/attachment.cgi?id=83871&action=edit
dmesg

System Environment:
--------------------------
Platform:  PNV/ILK/SNB/IVB/HSW
Kernel:    (drm-intel-nightly)254329f9a08cc3b0d5e4a877c6ff13cf9ba4fae7

Bug detailed description:
-----------------------------
Running xinit produces a call trace and the system hangs. It happens on the
-nightly and -queued kernels. The -fixes kernel works well.

Bisect shows: 4695ec93e3484243574f68072a27d1781d41a5a5 is the first bad commit.
commit 4695ec93e3484243574f68072a27d1781d41a5a5
Author: Ben Widawsky <ben at bwidawsk.net>
Date:   Wed Jul 31 17:00:17 2013 -0700

    drm/i915: create vmas at execbuf

    In order to transition more of our code over to using a VMA instead of
    an <OBJ, VM> pair - we must have the vma accessible at execbuf time. Up
    until now, we've only had a VMA when actually binding an object.

    The previous patch helped handle the distinction on bound vs. unbound.
    This patch will help us catch leaks, and other issues before we actually
    shuffle a bunch of stuff around.

    The subsequent patch to fix up the rest of execbuf should be mostly just
    moving code around, and this is the major functional change.

    v2: Release table_lock earlier so vma allocation needn't be atomic.
    (Chris)

    Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
    Signed-off-by: Daniel Vetter <daniel.vetter at ffwll.ch>

dmesg:
[   73.571553] BUG: unable to handle kernel NULL pointer dereference at 00000018
[   73.571603] IP: [<f80a38c0>] drm_mm_remove_node+0x47/0x9e [drm]
[   73.571642] *pde = 00000000
[   73.571661] Oops: 0000 [#1] SMP
[   73.571683] Modules linked in: netconsole configfs ipv6 dm_mod snd_hda_codec_hdmi snd_hda_codec_realtek dcdbas pcspkr serio_raw i2c_i801 iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_codec snd_hwdep lpc_ich snd_pcm mfd_core snd_page_alloc snd_timer snd soundcore acpi_cpufreq i915 video button drm_kms_helper drm mperf freq_table [last unloaded: netconsole]
[   73.571916] CPU: 0 PID: 3760 Comm: X Not tainted 3.11.0-rc2_drm-intel-next-queued_4695ec_20130808_+ #6612
[   73.571962] Hardware name: Dell Inc. OptiPlex 990/0DXWW6, BIOS A02 02/26/2011
[   73.571997] task: f5b26ce0 ti: c31ce000 task.ti: c31ce000
[   73.572024] EIP: 0060:[<f80a38c0>] EFLAGS: 00213246 CPU: 0
[   73.572055] EIP is at drm_mm_remove_node+0x47/0x9e [drm]
[   73.572082] EAX: c3280980 EBX: 00000000 ECX: 00000000 EDX: 00000000
[   73.572113] ESI: 00000000 EDI: 00000000 EBP: c357bb80 ESP: c31cfd1c
[   73.572144]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[   73.572170] CR0: 80050033 CR2: 00000018 CR3: 030e4000 CR4: 000407d0
[   73.572201] Stack:
[   73.572213]  c3280980 f4e8d254 c3280980 c357bb80 f8180f7a c21a8500 f818444b ffffffe4
[   73.572272]  00000000 00300000 f52e3c00 f4e8c000 7fe00000 00001000 00000000 00300000
[   73.572332]  00000000 c30a0400 c21a8500 c357b970 f4e8cc2c f8185e87 00000000 00000000
[   73.572391] Call Trace:
[   73.572413]  [<f8180f7a>] ? i915_gem_vma_destroy+0x37/0x3f [i915]
[   73.572449]  [<f818444b>] ? i915_gem_object_pin+0x3c5/0x509 [i915]
[   73.572485]  [<f8185e87>] ? i915_gem_execbuffer_reserve_object.isra.12+0x70/0x192 [i915]
[   73.572529]  [<f8186191>] ? i915_gem_execbuffer_reserve+0x1e8/0x2fb [i915]
[   73.572568]  [<f8186b00>] ? i915_gem_do_execbuffer.isra.18+0x4a0/0xd5f [i915]
[   73.572609]  [<f8181550>] ? i915_gem_obj_bound_any+0x28/0x43 [i915]
[   73.572645]  [<f8187930>] ? i915_gem_execbuffer2+0x12e/0x1c2 [i915]
[   73.572680]  [<f8187802>] ? i915_gem_execbuffer+0x443/0x443 [i915]
[   73.572713]  [<f809cc1e>] ? drm_ioctl+0x23d/0x323 [drm]
[   73.572744]  [<f8187802>] ? i915_gem_execbuffer+0x443/0x443 [i915]
[   73.572777]  [<c02a065a>] ? handle_pte_fault+0x5a6/0x5e3
[   73.572806]  [<f809c9e1>] ? drm_copy_field+0x47/0x47 [drm]
[   73.572835]  [<c02c04fc>] ? vfs_ioctl+0x18/0x21
[   73.572858]  [<c02c0ec8>] ? do_vfs_ioctl+0x3ec/0x42c
[   73.572885]  [<c08778c9>] ? __do_page_fault+0x400/0x43b
[   73.572911]  [<c087787d>] ? __do_page_fault+0x3b4/0x43b
[   73.572938]  [<c0236d5a>] ? __set_current_blocked+0x24/0x35
[   73.572966]  [<c02c0f51>] ? SyS_ioctl+0x49/0x74
[   73.572990]  [<c087945a>] ? sysenter_do_call+0x12/0x22
[   73.573017]  [<c0870000>] ? create_subvol+0x20f/0x59c
[   73.573042] Code: 1c 8b 18 74 24 01 fe 3b 73 18 75 02 0f 0b 8b 70 08 8b 58 0c 89 5e 04 89 33 c7 40 08 00 01 10 00 c7 40 0c 00 02 20 00 eb 09 01 fe <3b> 73 18 74 02 0f 0b 8b 72 10 8d 6a 08 f7 c6 01 00 00 00 75 0a
[   73.573350] EIP: [<f80a38c0>] drm_mm_remove_node+0x47/0x9e [drm] SS:ESP 0068:c31cfd1c
[   73.573398] CR2: 0000000000000018
[   73.578083] ---[ end trace 95ad56d39717da34 ]---
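
The faulting address in CR2 (00000018) can be cross-checked against the Code:
bytes and the register dump above. The following is only a minimal sketch, not
a general disassembler: it handles just the one opcode seen here (3b, which
is CMP r32, r/m32) with a register base plus 8-bit displacement. The helper
names faulting_bytes and decode_cmp_r32_rm32 are hypothetical and exist only
for this illustration; the byte string and register values are copied from the
oops.

```python
# Sketch: decode the instruction marked <3b> in the oops "Code:" line and
# recompute the faulting address from the register dump.
# Assumption: only opcode 0x3b with [reg + disp8] addressing is handled.

REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def faulting_bytes(code_line):
    """Return the bytes starting at the <..>-marked faulting byte."""
    tokens = code_line.split()
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    return [int(t.strip("<>"), 16) for t in tokens[start:]]


def decode_cmp_r32_rm32(insn):
    """Decode `3b /r` and return (register operand, base register, disp8)."""
    opcode, modrm, disp8 = insn[0], insn[1], insn[2]
    if opcode != 0x3B:
        raise ValueError("only opcode 0x3b is handled in this sketch")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1 or rm == 4:
        raise ValueError("only [reg + disp8] without a SIB byte is handled")
    if disp8 >= 0x80:  # disp8 is a signed 8-bit value
        disp8 -= 0x100
    return REG32[reg], REG32[rm], disp8


# Tail of the "Code:" line, starting a few bytes before the marked one.
code = "eb 09 01 fe <3b> 73 18 74 02 0f 0b"
# Values taken from the register dump in the oops.
registers = {"ebx": 0x00000000, "esi": 0x00000000}

reg, base, disp = decode_cmp_r32_rm32(faulting_bytes(code))
fault_address = (registers[base] + disp) & 0xFFFFFFFF
print(f"cmp {disp:#x}(%{base}),%{reg} -> fault address {fault_address:#010x}")
```

With EBX = 00000000 this gives address 0x18, which matches CR2, so the
instruction appears to read offset 0x18 of a NULL pointer inside
drm_mm_remove_node. Which structure field lives at that offset is not
determined here.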

BTW, I can't find the bisected commit on the latest -queued branch. This issue
doesn't happen on the latest commit (6d2b888569d).

Reproduce steps:
----------------------------
1. xinit
