[Bug 100232] New: [BAT] IGT gem_exec_parallel hangs half of the time on BDW+ testhosts

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Thu Mar 16 15:48:19 UTC 2017


https://bugs.freedesktop.org/show_bug.cgi?id=100232

            Bug ID: 100232
           Summary: [BAT] IGT gem_exec_parallel hangs half of the time on
                    BDW+ testhosts
           Product: DRI
           Version: DRI git
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: DRM/Intel
          Assignee: intel-gfx-bugs at lists.freedesktop.org
          Reporter: tomi.p.sarvela at intel.com
        QA Contact: intel-gfx-bugs at lists.freedesktop.org
                CC: intel-gfx-bugs at lists.freedesktop.org
     i915 platform: BDW, BXT, KBL, SKL
     i915 features: GEM/execlists

One of the subtests in gem_exec_parallel often hangs the host. Below is a dump
from an SKL6700K on a Z170 motherboard, which hung hard on
igt@gem_exec_parallel@render-fds after running
tests/intel-ci/fast-feedback.testlist.

CI_DRM_2352 is drm-tip, today's build. For details, see
https://intel-gfx-ci.01.org/CI/

[  947.215802] general protection fault: 0000 [#1] PREEMPT SMP
[  947.221439] Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_hdmi snd_hda_codec_realtek x86_pkg_temp_thermal snd_hda_codec_generic intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm mei_me mei e1000e igb ptp pps_core prime_numbers pinctrl_sunrisepoint pinctrl_intel i2c_hid [last unloaded: i915]
[  947.254918] CPU: 6 PID: 47 Comm: ksoftirqd/6 Tainted: G     U          4.11.0-rc2-CI-CI_DRM_2352+ #1
[  947.264181] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F21 01/06/2017
[  947.273499] task: ffff88042bdaa7c0 task.stack: ffffc900001fc000
[  947.279489] RIP: 0010:notifier_call_chain+0x59/0xa0
[  947.284426] RSP: 0018:ffffc900001ffd38 EFLAGS: 00010286
[  947.289698] RAX: 0000000000000001 RBX: 00000000ffffffff RCX: 00000000ffffffff
[  947.296917] RDX: ffff8803bf65d5c0 RSI: 0000000000000001 RDI: ffff88041d05e4c8
[  947.304249] RBP: ffffc900001ffd70 R08: 0000000000000000 R09: 643e07b800000000
[  947.311544] R10: 0000000000000000 R11: ffff88042bdaa7c0 R12: 0000000000000000
[  947.318833] R13: 0000000000000000 R14: 00000000ffffffff R15: 6b6b6b6b6b6b6b6b
[  947.326070] FS:  0000000000000000(0000) GS:ffff88043ed80000(0000) knlGS:0000000000000000
[  947.334408] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  947.340242] CR2: 00007f83d8000010 CR3: 0000000429021000 CR4: 00000000003406e0
[  947.347512] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  947.354801] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  947.362080] Call Trace:
[  947.364582]  __atomic_notifier_call_chain+0x73/0x110
[  947.369709]  ? unregister_die_notifier+0x20/0x20
[  947.374406]  atomic_notifier_call_chain+0x11/0x20
[  947.379276]  intel_lrc_irq_handler+0x191/0x490 [i915]
[  947.384458]  tasklet_hi_action+0xf0/0x110
[  947.388611]  __do_softirq+0x116/0x4c0
[  947.392321]  run_ksoftirqd+0x22/0x50
[  947.395960]  smpboot_thread_fn+0x180/0x280
[  947.400129]  kthread+0x107/0x140
[  947.403431]  ? sort_range+0x20/0x20
[  947.407002]  ? kthread_create_on_node+0x40/0x40
[  947.411621]  ret_from_fork+0x2e/0x40
[  947.415233] Code: 4c 89 ff 41 ff 17 4d 85 e4 41 89 c5 74 05 41 83 04 24 01 41 f7 c5 00 80 00 00 75 39 83 eb 01 4d 89 f7 4d 85 ff 74 2e 85 db 74 2a <49> 8b 3f 4d 8b 77 08 e8 cb ca ff ff 85 c0 75 bd 48 c7 c2 04 f1 
[  947.434475] RIP: notifier_call_chain+0x59/0xa0 RSP: ffffc900001ffd38
[  947.440931] ---[ end trace e6564010da93ee3e ]---
[  947.608936] Kernel panic - not syncing: Fatal exception in interrupt
[  947.615465] Kernel Offset: disabled
[  947.791838] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
[  947.799101] ------------[ cut here ]------------
[  947.803805] WARNING: CPU: 6 PID: 47 at arch/x86/kernel/smp.c:127 native_smp_send_reschedule+0x3a/0x40
[  947.813181] Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_hdmi snd_hda_codec_realtek x86_pkg_temp_thermal snd_hda_codec_generic intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm mei_me mei e1000e igb ptp pps_core prime_numbers pinctrl_sunrisepoint pinctrl_intel i2c_hid [last unloaded: i915]
[  947.846626] CPU: 6 PID: 47 Comm: ksoftirqd/6 Tainted: G     UD         4.11.0-rc2-CI-CI_DRM_2352+ #1
[  947.855898] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F21 01/06/2017
[  947.865208] Call Trace:
[  947.867711]  <IRQ>
[  947.869768]  dump_stack+0x67/0x92
[  947.873129]  __warn+0xc6/0xe0
[  947.876145]  warn_slowpath_null+0x18/0x20
[  947.880218]  native_smp_send_reschedule+0x3a/0x40
[  947.885019]  trigger_load_balance+0x2cd/0x580
[  947.889448]  ? trigger_load_balance+0x6f/0x580
[  947.893956]  scheduler_tick+0x97/0xc0
[  947.897673]  ? tick_sched_handle.isra.7+0x40/0x40
[  947.902458]  update_process_times+0x42/0x50
[  947.906721]  tick_sched_handle.isra.7+0x1c/0x40
[  947.911356]  tick_sched_timer+0x3d/0x70
[  947.915249]  __hrtimer_run_queues+0xf3/0x530
[  947.919590]  hrtimer_interrupt+0xb9/0x210
[  947.923655]  local_apic_timer_interrupt+0x31/0x50
[  947.928449]  smp_apic_timer_interrupt+0x33/0x50
[  947.933084]  apic_timer_interrupt+0x90/0xa0
[  947.937330] RIP: 0010:panic+0x1c7/0x205
[  947.941231] RSP: 0018:ffffc900001ffb90 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
[  947.948909] RAX: 0000000000000041 RBX: 0000000000000000 RCX: 0000000000000000
[  947.956191] RDX: 0000000000000101 RSI: ffffffff81c6e65d RDI: ffffffff8117ef23
[  947.963410] RBP: ffffc900001ffc00 R08: 0000000000000001 R09: 0000000000000000
[  947.970664] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[  947.977909] R13: 0000000000000000 R14: 0000000000000000 R15: 6b6b6b6b6b6b6b6b
[  947.985170]  </IRQ>
[  947.987320]  ? panic+0x1c4/0x205
[  947.990604]  ? kmsg_dump+0x11f/0x1c0
[  947.994237]  oops_end+0x78/0x90
[  947.997435]  die+0x46/0x60
[  948.000172]  do_general_protection+0xe0/0x1a0
[  948.004610]  general_protection+0x22/0x30
[  948.008683] RIP: 0010:notifier_call_chain+0x59/0xa0
[  948.013650] RSP: 0018:ffffc900001ffd38 EFLAGS: 00010286
[  948.018963] RAX: 0000000000000001 RBX: 00000000ffffffff RCX: 00000000ffffffff
[  948.026224] RDX: ffff8803bf65d5c0 RSI: 0000000000000001 RDI: ffff88041d05e4c8
[  948.033487] RBP: ffffc900001ffd70 R08: 0000000000000000 R09: 643e07b800000000
[  948.040759] R10: 0000000000000000 R11: ffff88042bdaa7c0 R12: 0000000000000000
[  948.048020] R13: 0000000000000000 R14: 00000000ffffffff R15: 6b6b6b6b6b6b6b6b
[  948.055277]  __atomic_notifier_call_chain+0x73/0x110
[  948.060328]  ? unregister_die_notifier+0x20/0x20
[  948.065034]  atomic_notifier_call_chain+0x11/0x20
[  948.069835]  intel_lrc_irq_handler+0x191/0x490 [i915]
[  948.074960]  tasklet_hi_action+0xf0/0x110
[  948.079023]  __do_softirq+0x116/0x4c0
[  948.082766]  run_ksoftirqd+0x22/0x50
[  948.086406]  smpboot_thread_fn+0x180/0x280
[  948.090593]  kthread+0x107/0x140
[  948.093867]  ? sort_range+0x20/0x20
[  948.097421]  ? kthread_create_on_node+0x40/0x40
[  948.102041]  ret_from_fork+0x2e/0x40
[  948.105681] ---[ end trace e6564010da93ee3f ]---
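
A note on reading the dump: R15 holds 6b6b6b6b6b6b6b6b, and 0x6b is the
kernel's slab POISON_FREE byte (include/linux/poison.h). That hints at, but
does not prove, a notifier chain entry being read after it was freed. Below is
a minimal Python sketch for spotting such values in a pasted oops. The helper
name find_poisoned_registers and the regular expression are illustrative
assumptions, not part of IGT or of any kernel tool.

```python
import re

# Slab poison bytes from the kernel's include/linux/poison.h.
POISON_BYTES = {
    0x6B: "POISON_FREE (freed slab memory)",
    0x5A: "POISON_INUSE (allocated, uninitialised slab memory)",
}

# Matches general-purpose register fields such as "R15: 6b6b6b6b6b6b6b6b".
# Hypothetical pattern written for this oops layout only.
REG_RE = re.compile(r"\b(R[A-Z0-9]{1,2}):\s*([0-9a-f]{16})\b")


def find_poisoned_registers(oops_text):
    """Return {register: description} for registers filled with a poison byte."""
    # Mail clients often wrap a register value onto the next line; rejoin it.
    joined = re.sub(r":\s*\n\s*", ": ", oops_text)
    hits = {}
    for name, value in REG_RE.findall(joined):
        raw = bytes.fromhex(value)
        # A poisoned word is the same poison byte repeated eight times.
        if len(set(raw)) == 1 and raw[0] in POISON_BYTES:
            hits[name] = POISON_BYTES[raw[0]]
    return hits
```

Feeding it the register block from the first oops above should flag only R15;
it is a triage aid and says nothing about which object was freed.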
