[Bug 90732] New: [BDW/BSW Regression] igt/gem_reloc_vs_gpu/forked-faulting-reloc-thrashing-hang causes GPU reset failure

Thu May 28 19:19:26 PDT 2015

https://bugs.freedesktop.org/show_bug.cgi?id=90732

            Bug ID: 90732
           Summary: [BDW/BSW Regression]
                    igt/gem_reloc_vs_gpu/forked-faulting-reloc-thrashing-hang
                    causes GPU reset failure
           Product: DRI
           Version: unspecified
          Hardware: All
                OS: Linux (All)
            Status: NEW
          Severity: major
          Priority: high
         Component: DRM/Intel
          Assignee: intel-gfx-bugs at lists.freedesktop.org
          Reporter: huax.lu at intel.com
        QA Contact: intel-gfx-bugs at lists.freedesktop.org
                CC: intel-gfx-bugs at lists.freedesktop.org

Created attachment 116131
  --> https://bugs.freedesktop.org/attachment.cgi?id=116131&action=edit
error state

==System Environment==
--------------------------
Regression: yes

good commit: 65de797816eadb227c45b0127d7ff92410fa3814 (dinq)
bad commit:  99c044d7d5cc65661436f271754c011d0f1a02de (dinq)

Non-working platforms: BDW/BSW

==kernel==
--------------------------
drm-intel-nightly/b44f6771cba2cc90525d037445330ed766377aa9
commit b44f6771cba2cc90525d037445330ed766377aa9
Author: Daniel Vetter <daniel.vetter at ffwll.ch>
Date:   Thu May 28 13:39:29 2015 +0200

    drm-intel-nightly: 2015y-05m-28d-11h-38m-51s UTC integration manifest

==Bug detailed description==
-----------------------------
Running ./gem_reloc_vs_gpu --run-subtest forked-faulting-reloc-thrashing-hang
hangs the GPU, and the GPU reset that follows does not complete (it times out;
see dmesg below).
The following subtests also hit this issue:
igt@gem_reloc_vs_gpu@forked-interruptible-thrashing-hang
igt@gem_reloc_vs_gpu@forked-thrashing-hang

dmesg:
[   91.753899] [drm] stuck on blitter ring
[   91.754663] [drm] GPU HANG: ecode 8:2:0xe77ffff2, in gem_reloc_vs_gp [4986], reason: Ring hung, action: reset
[   91.754665] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[   91.754666] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[   91.754668] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[   91.754669] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[   91.754670] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[   91.754705] [drm:i915_reset_and_wakeup] resetting chip
[  101.748383] [drm:i915_gem_wait_for_error.part.25 [i915]] *ERROR* Timed out waiting for the gpu reset to complete
[  101.748413] [drm:i915_gem_wait_for_error.part.25 [i915]] *ERROR* Timed out waiting for the gpu reset to complete
[  101.748442] [drm:i915_gem_wait_for_error.part.25 [i915]] *ERROR* Timed out waiting for the gpu reset to complete
[  101.748477] [drm:i915_gem_wait_for_error.part.25 [i915]] *ERROR* Timed out waiting for the gpu reset to complete
[  101.748500] [drm:i915_gem_wait_for_error.part.25 [i915]] *ERROR* Timed out waiting for the gpu reset to complete
[  101.748525] [drm:i915_gem_wait_for_error.part.25 [i915]] *ERROR* Timed out waiting for the gpu reset to complete
[  101.748547] [drm:i915_gem_wait_for_error.part.25 [i915]] *ERROR* Timed out waiting for the gpu reset to complete
[  101.748570] [drm:i915_gem_wait_for_error.part.25 [i915]] *ERROR* Timed out waiting for the gpu reset to complete
[  101.748617] [drm:i915_gem_wait_for_error.part.25 [i915]] *ERROR* Timed out waiting for the gpu reset to complete
[  101.750656] Setting dangerous option prefault_disable - tainting kernel
[  101.751194] Setting dangerous option prefault_disable - tainting kernel
[  101.751291] Setting dangerous option prefault_disable - tainting kernel

[  240.060726] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  240.060767] kworker/u16:3   D ffff8800a7c77aa8     0  1237      2 0x00000000
[  240.060797] Workqueue: i915-hangcheck i915_hangcheck_elapsed [i915]
[  240.060799]  ffff8800a7c77aa8 ffff880002ae0000 ffff8800a7f82120 ffff8800a7c77ad8
[  240.060802]  0000000000000246 0000000000000000 ffff8800a7c78000 0000000000000246
[  240.060805]  0000000000000000 ffff88000355c068 ffff8800a7f82120 ffff8800a7c77ac8
[  240.060808] Call Trace:
[  240.060814]  [<ffffffff81896db4>] schedule+0x75/0x84
[  240.060816]  [<ffffffff81897011>] schedule_preempt_disabled+0xe/0x10
[  240.060818]  [<ffffffff818986c5>] mutex_lock_nested+0x17c/0x2cb
[  240.060833]  [<ffffffffa0094a13>] ? i915_reset+0x3a/0x13e [i915]
[  240.060847]  [<ffffffffa0094a13>] i915_reset+0x3a/0x13e [i915]
[  240.060866]  [<ffffffffa00c80e2>] i915_reset_and_wakeup+0xd3/0x133 [i915]
[  240.060885]  [<ffffffffa00cbd51>] i915_handle_error+0x5ab/0x5bd [i915]
[  240.060905]  [<ffffffffa00dda30>] ? gen6_read32+0x11a/0x18b [i915]
[  240.060910]  [<ffffffff8109352f>] ? vprintk_default+0x1d/0x1f
[  240.060913]  [<ffffffff8188f3e9>] ? printk+0x46/0x48
[  240.060930]  [<ffffffffa00cc14f>] i915_hangcheck_elapsed+0x3a3/0x3c3 [i915]
[  240.060933]  [<ffffffff8105ab88>] ? process_one_work+0x1ba/0x409
[  240.060935]  [<ffffffff8105abf3>] process_one_work+0x225/0x409
[  240.060937]  [<ffffffff8105ab74>] ? process_one_work+0x1a6/0x409
[  240.060940]  [<ffffffff8105b694>] worker_thread+0x275/0x369
[  240.060942]  [<ffffffff8107c63a>] ? complete+0x42/0x4a
[  240.060944]  [<ffffffff8105b41f>] ? cancel_delayed_work_sync+0x15/0x15
[  240.060947]  [<ffffffff81060039>] kthread+0xf6/0xfe
[  240.060950]  [<ffffffff8105ff43>] ? kthread_create_on_node+0x1ac/0x1ac
[  240.060953]  [<ffffffff8189b892>] ret_from_fork+0x42/0x70
[  240.060955]  [<ffffffff8105ff43>] ? kthread_create_on_node+0x1ac/0x1ac
[  240.060957] INFO: lockdep is turned off.
[  240.060966] INFO: task gem_reloc_vs_gp:4986 blocked for more than 120 seconds.
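
For triage, the log above can be reduced to three facts: whether a hang was reported, whether a reset was started, and how many times the reset wait timed out. The following is a minimal sketch of such a check, not part of the original report. The function names `unwrap_dmesg` and `summarize` are hypothetical; the message strings are copied from the dmesg output above. The three `ecode` fields are kept as raw values because the report does not say how to interpret them.

```python
import re

# Message fragments copied from the dmesg output in this report.
HANG_RE = re.compile(
    r"GPU HANG: ecode (\d+):(\d+):(0x[0-9a-fA-F]+), in (\S+) \[(\d+)\]"
)
RESET_START_MSG = "resetting chip"
RESET_TIMEOUT_MSG = "Timed out waiting for the gpu reset to complete"


def unwrap_dmesg(text):
    """Rejoin mail-wrapped continuation lines onto the timestamped line before them."""
    lines = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Every dmesg record here starts with a "[timestamp]" prefix;
        # anything else is treated as a wrapped continuation.
        if line.startswith("[") or not lines:
            lines.append(line)
        else:
            lines[-1] += " " + line
    return lines


def summarize(text):
    """Return hang details, whether a reset started, and the reset timeout count."""
    summary = {"hang": None, "reset_started": False, "reset_timeouts": 0}
    for line in unwrap_dmesg(text):
        match = HANG_RE.search(line)
        if match:
            summary["hang"] = {
                "ecode": match.group(1, 2, 3),  # raw fields, uninterpreted
                "process": match.group(4),
                "pid": int(match.group(5)),
            }
        elif RESET_START_MSG in line:
            summary["reset_started"] = True
        elif RESET_TIMEOUT_MSG in line:
            summary["reset_timeouts"] += 1
    return summary
```

Feeding it the captured log (for example the text of `dmesg` saved to a file) gives a quick pass/fail signal: a hang followed by `reset_started` and a non-zero `reset_timeouts` matches the failure described here.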

==Steps to reproduce==
----------------------------
1.  ./gem_reloc_vs_gpu --run-subtest forked-faulting-reloc-thrashing-hang
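
Since three subtests are reported as affected, it can be convenient to run them back to back. The sketch below is an illustration only, not part of the original report. It assumes the `gem_reloc_vs_gpu` binary sits in the current directory, as in the step above; `SUBTESTS`, `build_command`, and `run_all` are hypothetical names, and the 600-second timeout is an arbitrary choice.

```python
import subprocess

# Subtest names taken from the bug description.
SUBTESTS = [
    "forked-faulting-reloc-thrashing-hang",
    "forked-interruptible-thrashing-hang",
    "forked-thrashing-hang",
]


def build_command(subtest, binary="./gem_reloc_vs_gpu"):
    """Build the argv list for one subtest, mirroring the reproduce step."""
    return [binary, "--run-subtest", subtest]


def run_all(binary="./gem_reloc_vs_gpu", timeout=600):
    """Run each subtest in turn; map its name to the exit code, or None on timeout."""
    results = {}
    for name in SUBTESTS:
        try:
            proc = subprocess.run(build_command(name, binary), timeout=timeout)
            results[name] = proc.returncode
        except subprocess.TimeoutExpired:
            # Best effort only: a task stuck in uninterruptible sleep
            # may not exit when killed, so this call can still block.
            results[name] = None
    return results
```

Calling `run_all()` and printing the returned dictionary shows which subtests finished. Because this bug leaves the test process blocked in the kernel, the timeout may not be able to reclaim it, and a reboot may be needed between runs.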
