[Bug 92545] New: [BSW] GPU Hang leads to sporadic kernel crashes
bugzilla-daemon at freedesktop.org
Mon Oct 19 12:27:10 PDT 2015
https://bugs.freedesktop.org/show_bug.cgi?id=92545
Bug ID: 92545
Summary: [BSW] GPU Hang leads to sporadic kernel crashes
Product: DRI
Version: XOrg git
Hardware: x86-64 (AMD64)
OS: other
Status: NEW
Severity: major
Priority: medium
Component: DRM/Intel
Assignee: intel-gfx-bugs at lists.freedesktop.org
Reporter: dhinakaran.pandiyan at intel.com
QA Contact: intel-gfx-bugs at lists.freedesktop.org
CC: intel-gfx-bugs at lists.freedesktop.org
The kernel sporadically crashes after a GPU reset. GPU hangs are frequent and
occur at boot time; only occasionally is a hang followed by a kernel crash. This
has been observed on Chrome OS with a 3.18 kernel that carries i915 backports.
I believe that the NULL pointer dereference happens at
I915_WRITE(DSPSURF(intel_crtc->plane), intel_crtc->unpin_work->gtt_offset);
in intel_display.c:ilk_do_mmio_flip.
If a reset is in progress at that point, the call sequence
intel_finish_reset -> intel_complete_page_flips -> intel_finish_page_flip_plane
-> do_intel_finish_page_flip -> page_flip_completed
might set intel_crtc->unpin_work = NULL before the MMIO flip work reads it.
We need some help to debug this crash.
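
To make the suspected race concrete, here is a minimal userspace sketch of the
hypothesis described above. It is not i915 code: the struct and function names
(crtc_model, model_do_mmio_flip, and so on) are hypothetical stand-ins, and the
NULL guard only illustrates the failure mode. A real fix would also need proper
synchronization against the reset path, which this model does not attempt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the i915 structures named above. */
struct unpin_work_model {
    uint32_t gtt_offset;
};

struct crtc_model {
    struct unpin_work_model *unpin_work; /* cleared when a flip completes */
    uint32_t dspsurf;                    /* stands in for the DSPSURF register */
};

/* Models the end of the reset call sequence: the work item is dropped. */
static void model_reset_completes_flip(struct crtc_model *crtc)
{
    assert(crtc != NULL);
    crtc->unpin_work = NULL;
}

/*
 * Models the register write in ilk_do_mmio_flip, with a NULL guard added.
 * Returns 0 if the surface address was written, -1 if the flip was skipped.
 */
static int model_do_mmio_flip(struct crtc_model *crtc)
{
    struct unpin_work_model *work;

    assert(crtc != NULL);
    work = crtc->unpin_work; /* read the pointer once, then test it */
    if (work == NULL)
        return -1; /* flip was already completed by the reset path */

    crtc->dspsurf = work->gtt_offset;
    return 0;
}
```

Without the guard, calling model_do_mmio_flip() after
model_reset_completes_flip() would dereference NULL, which is the behaviour
the oops below appears to show.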
<6>[ 6.744129] [drm] stuck on render ring
<6>[ 6.766343] [drm] GPU HANG: ecode 8:0:0x2efe5dbc, reason: Ring hung,
action: reset
<6>[ 6.766356] [drm] GPU hangs can indicate a bug anywhere in the entire gfx
stack, including userspace.
<6>[ 6.766367] [drm] Please file a _new_ bug report on bugs.freedesktop.org
against DRI -> DRM/Intel
<6>[ 6.766378] [drm] drm/i915 developers can then reassign to the right
component if it's not a kernel issue.
<6>[ 6.766389] [drm] The gpu crash dump is required to analyze gpu hangs, so
please always attach it.
<6>[ 6.766400] [drm] GPU crash dump saved to /sys/class/drm/card0/error
<5>[ 6.769207] drm/i915: Resetting chip after gpu hang
<6>[ 12.739947] [drm] stuck on render ring
<6>[ 12.765654] [drm] GPU HANG: ecode 8:0:0x86dffffd, in chrome [3652],
reason: Ring hung, action: reset
<4>[ 12.765733] ------------[ cut here ]------------
<4>[ 12.765764] WARNING: CPU: 2 PID: 41 at
/mnt/host/source/src/third_party/kernel/v3.18/drivers/gpu/drm/i915/intel_display.c:11277
intel_mmio_flip_work_func+0x6d/0x315()
<4>[ 12.765787] WARN_ON(__i915_wait_request(mmio_flip->req,
mmio_flip->crtc->reset_counter, false, NULL, &mmio_flip->i915->rps.mmioflips))
<4>[ 12.765805] Modules linked in: nf_conntrack_ipv6 nf_defrag_ipv6
cros_ec_sensors ip6table_filter cros_ec_sensors_core
industrialio_triggered_buffer kfifo_buf ip6_tables iio_trig_sysfs industrialio
iwlmvm iwl7000_mac80211 iwlwifi cfg80211 btusb btbcm btintel bluetooth smsc95xx
usbnet mii uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core joydev
ppp_async ppp_generic slhc tun
<4>[ 12.765930] CPU: 2 PID: 41 Comm: kworker/2:1 Not tainted
3.18.0-06623-g902cb99 #1
<4>[ 12.765944] Hardware name: GOOGLE Cyan, BIOS
Google_Cyan.7287.57.2015_09_30_1147 09/30/2015
<4>[ 12.765962] Workqueue: events intel_mmio_flip_work_func
<4>[ 12.765974] 0000000000000000 000000004607d413 ffff88017a9bbcc8
ffffffff8d5f3d15
<4>[ 12.765996] 0000000000000000 ffff88017a9bbd20 ffff88017a9bbd08
ffffffff8d03dfd9
<4>[ 12.766016] ffff88017a9bbcd8 ffffffff8d345524 ffff88017a97f000
ffff880072f2da00
<4>[ 12.766037] Call Trace:
<4>[ 12.766054] [<ffffffff8d5f3d15>] ? dump_stack+0x46/0x58
<4>[ 12.766070] [<ffffffff8d03dfd9>] ? warn_slowpath_common+0x81/0x9b
<4>[ 12.766085] [<ffffffff8d345524>] ? intel_mmio_flip_work_func+0x6d/0x315
<4>[ 12.766100] [<ffffffff8d03e048>] ? warn_slowpath_fmt+0x55/0x6b
<4>[ 12.766115] [<ffffffff8d345524>] ? intel_mmio_flip_work_func+0x6d/0x315
<4>[ 12.766133] [<ffffffff8d05c849>] ? finish_task_switch+0x5b/0xba
<4>[ 12.766149] [<ffffffff8d051a1b>] ? process_one_work+0x175/0x2ab
<4>[ 12.766163] [<ffffffff8d052c95>] ? worker_thread+0x1fb/0x2ce
<4>[ 12.766178] [<ffffffff8d052a9a>] ? rescuer_thread+0x2d7/0x2d7
<4>[ 12.766192] [<ffffffff8d056863>] ? kthread+0x10e/0x116
<4>[ 12.766207] [<ffffffff8d056755>] ? kthread_stop+0xc0/0xc0
<4>[ 12.766222] [<ffffffff8d5f8bac>] ? ret_from_fork+0x7c/0xb0
<4>[ 12.766237] [<ffffffff8d056755>] ? kthread_stop+0xc0/0xc0
<4>[ 12.766249] ---[ end trace 8d614c29c562a829 ]---
<5>[ 12.767790] drm/i915: Resetting chip after gpu hang
<6>[ 18.740012] [drm] stuck on render ring
<6>[ 18.760304] [drm] GPU HANG: ecode 8:0:0x86dffffd, in chrome [3652],
reason: Ring hung, action: reset
<4>[ 18.760635] ------------[ cut here ]------------
<4>[ 18.760665] WARNING: CPU: 0 PID: 1099 at
/mnt/host/source/src/third_party/kernel/v3.18/drivers/gpu/drm/i915/intel_display.c:11277
intel_mmio_flip_work_func+0x6d/0x315()
<4>[ 18.760688] WARN_ON(__i915_wait_request(mmio_flip->req,
mmio_flip->crtc->reset_counter, false, NULL, &mmio_flip->i915->rps.mmioflips))
<4>[ 18.760706] Modules linked in: nf_conntrack_ipv6 nf_defrag_ipv6
cros_ec_sensors ip6table_filter cros_ec_sensors_core
industrialio_triggered_buffer kfifo_buf ip6_tables iio_trig_sysfs industrialio
iwlmvm iwl7000_mac80211 iwlwifi cfg80211 btusb btbcm btintel bluetooth smsc95xx
usbnet mii uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core joydev
ppp_async ppp_generic slhc tun
<4>[ 18.760823] CPU: 0 PID: 1099 Comm: kworker/0:2 Tainted: G W
3.18.0-06623-g902cb99 #1
<4>[ 18.760837] Hardware name: GOOGLE Cyan, BIOS
Google_Cyan.7287.57.2015_09_30_1147 09/30/2015
<4>[ 18.760854] Workqueue: events intel_mmio_flip_work_func
<4>[ 18.760866] 0000000000000000 00000000e8ee967d ffff8801760b3cc8
ffffffff8d5f3d15
<4>[ 18.760886] 0000000000000000 ffff8801760b3d20 ffff8801760b3d08
ffffffff8d03dfd9
<4>[ 18.760905] ffff8801760b3cd8 ffffffff8d345524 ffff8801799ceb40
ffff8801798a50c0
<4>[ 18.760925] Call Trace:
<4>[ 18.760940] [<ffffffff8d5f3d15>] ? dump_stack+0x46/0x58
<4>[ 18.760955] [<ffffffff8d03dfd9>] ? warn_slowpath_common+0x81/0x9b
<4>[ 18.760969] [<ffffffff8d345524>] ? intel_mmio_flip_work_func+0x6d/0x315
<4>[ 18.760983] [<ffffffff8d03e048>] ? warn_slowpath_fmt+0x55/0x6b
<4>[ 18.760997] [<ffffffff8d345524>] ? intel_mmio_flip_work_func+0x6d/0x315
<4>[ 18.761014] [<ffffffff8d05c849>] ? finish_task_switch+0x5b/0xba
<4>[ 18.761028] [<ffffffff8d051a1b>] ? process_one_work+0x175/0x2ab
<4>[ 18.761042] [<ffffffff8d052c95>] ? worker_thread+0x1fb/0x2ce
<4>[ 18.761055] [<ffffffff8d052a9a>] ? rescuer_thread+0x2d7/0x2d7
<4>[ 18.761069] [<ffffffff8d056863>] ? kthread+0x10e/0x116
<4>[ 18.761083] [<ffffffff8d056755>] ? kthread_stop+0xc0/0xc0
<4>[ 18.761096] [<ffffffff8d5f8bac>] ? ret_from_fork+0x7c/0xb0
<4>[ 18.761110] [<ffffffff8d056755>] ? kthread_stop+0xc0/0xc0
<4>[ 18.761121] ---[ end trace 8d614c29c562a82a ]---
<5>[ 18.763443] drm/i915: Resetting chip after gpu hang
<1>[ 18.769490] BUG: unable to handle kernel NULL pointer dereference at
0000000000000048
<1>[ 18.769515] IP: [<ffffffff8d345716>]
intel_mmio_flip_work_func+0x25f/0x315
<4>[ 18.769536] PGD 0
<4>[ 18.769544] Oops: 0000 [#1] SMP
<0>[ 18.773130] gsmi: Log Shutdown Reason 0x03
<4>[ 18.773140] Modules linked in: nf_conntrack_ipv6 nf_defrag_ipv6
cros_ec_sensors ip6table_filter cros_ec_sensors_core
industrialio_triggered_buffer kfifo_buf ip6_tables iio_trig_sysfs industrialio
iwlmvm iwl7000_mac80211 iwlwifi cfg80211 btusb btbcm btintel bluetooth smsc95xx
usbnet mii uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core joydev
ppp_async ppp_generic slhc tun
<4>[ 18.773241] CPU: 0 PID: 1099 Comm: kworker/0:2 Tainted: G W
3.18.0-06623-g902cb99 #1
<4>[ 18.773255] Hardware name: GOOGLE Cyan, BIOS
Google_Cyan.7287.57.2015_09_30_1147 09/30/2015
<4>[ 18.773272] Workqueue: events intel_mmio_flip_work_func
<4>[ 18.773283] task: ffff880179bfea80 ti: ffff8801760b0000 task.ti:
ffff8801760b0000
<4>[ 18.773296] RIP: 0010:[<ffffffff8d345716>] [<ffffffff8d345716>]
intel_mmio_flip_work_func+0x25f/0x315
<4>[ 18.773314] RSP: 0018:ffff8801760b3d88 EFLAGS: 00010096
<4>[ 18.773324] RAX: 0000000000000000 RBX: ffff88017b2b7000 RCX:
0000000000180000
<4>[ 18.773337] RDX: 00000000001e1180 RSI: 0000000000000046 RDI:
ffff88017a080000
<4>[ 18.773349] RBP: ffff8801760b3de8 R08: 0000000000000001 R09:
ffff88017b2b7000
<4>[ 18.773361] R10: 0000000000000000 R11: 000000000000b910 R12:
ffff88017a080000
<4>[ 18.773373] R13: 00000000001f0180 R14: ffff8801798a50c0 R15:
ffff8801741c5680
<4>[ 18.773386] FS: 0000000000000000(0000) GS:ffff88017fc00000(0000)
knlGS:0000000000000000
<4>[ 18.773399] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
<4>[ 18.773410] CR2: 0000000000000048 CR3: 0000000077e06000 CR4:
00000000001007f0
<4>[ 18.773422] Stack:
<4>[ 18.773427] ffff8801760b3db8 ffffffff8d05c849 ffff88017a99c000
ffff880078ac0b40
<4>[ 18.773446] 000003dc77f04c10 00000000e8ee967d ffff8801760b3e28
ffff8801799ceb40
<4>[ 18.773463] ffff8801798a50c0 ffff88017fc11780 0000000000000000
ffff88017fc15b00
<4>[ 18.773481] Call Trace:
<4>[ 18.773495] [<ffffffff8d05c849>] ? finish_task_switch+0x5b/0xba
<4>[ 18.773510] [<ffffffff8d051a1b>] process_one_work+0x175/0x2ab
<4>[ 18.773523] [<ffffffff8d052c95>] worker_thread+0x1fb/0x2ce
<4>[ 18.773535] [<ffffffff8d052a9a>] ? rescuer_thread+0x2d7/0x2d7
<4>[ 18.773548] [<ffffffff8d056863>] kthread+0x10e/0x116
<4>[ 18.773561] [<ffffffff8d056755>] ? kthread_stop+0xc0/0xc0
<4>[ 18.773575] [<ffffffff8d5f8bac>] ret_from_fork+0x7c/0xb0
<4>[ 18.773587] [<ffffffff8d056755>] ? kthread_stop+0xc0/0xc0
<4>[ 18.773597] Code: 00 c0 74 05 80 cc 04 89 c2 b9 01 00 00 00 4c 89 ee 4c
89 e7 41 ff 94 24 d8 00 00 00 48 8b 83 48 07 00 00 41 8b 4c 24 20 4c 89 e7 <8b>
50 48 8b 83 24 04 00 00 41 8b 44 84 30 41 2b 44 24 30 8d b4
<1>[ 18.773724] RIP [<ffffffff8d345716>]
intel_mmio_flip_work_func+0x25f/0x315
<4>[ 18.773739] RSP <ffff8801760b3d88>
<4>[ 18.773747] CR2: 0000000000000048
<4>[ 18.773756] ---[ end trace 8d614c29c562a82b ]---
<0>[ 18.781967] Kernel panic - not syncing: Fatal exception
<0>[ 18.782089] Kernel Offset: 0xc000000 from 0xffffffff81000000 (relocation
range: 0xffffffff80000000-0xffffffffbfffffff)
<0>[ 18.782293] gsmi: Log Shutdown Reason 0x02
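
As a cross-check of the NULL-pointer reading, the faulting bytes marked in the
Code: line of the oops, <8b> 50 48, decode as a 32-bit load from RAX + 0x48.
With RAX = 0 in the register dump, that gives the reported fault address
CR2 = 0x48. The sketch below reproduces that arithmetic; the helper names are
made up for illustration, and whether offset 0x48 is the gtt_offset member of
the unpin_work structure is an assumption that the log alone does not confirm.

```c
#include <assert.h>
#include <stdint.h>

/* The three fields of an x86 ModRM byte. */
struct modrm {
    unsigned mod; /* addressing mode, bits 7..6 */
    unsigned reg; /* register operand, bits 5..3 */
    unsigned rm;  /* base register or r/m operand, bits 2..0 */
};

static struct modrm decode_modrm(uint8_t byte)
{
    struct modrm m = {
        .mod = (unsigned)(byte >> 6),
        .reg = (unsigned)((byte >> 3) & 7),
        .rm  = (unsigned)(byte & 7),
    };
    return m;
}

/* Effective address for mod == 1: base register plus a signed 8-bit displacement. */
static uint64_t effective_address_disp8(uint64_t base, uint8_t disp8)
{
    return base + (uint64_t)(int64_t)(int8_t)disp8;
}

/* Checks the decode against the values printed in the oops. */
static void verify_against_oops(void)
{
    struct modrm m = decode_modrm(0x50); /* byte following opcode 0x8b */

    assert(m.mod == 1); /* disp8 addressing */
    assert(m.reg == 2); /* destination register number 2 (EDX) */
    assert(m.rm == 0);  /* base register number 0 (RAX) */
    assert(effective_address_disp8(0x0, 0x48) == 0x48); /* RAX = 0, CR2 = 0x48 */
}
```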