[Bug 102269] New: BXT: i915: GPU HANG followed by full crash after suspend

bugzilla-daemon at freedesktop.org
Thu Aug 17 01:36:58 UTC 2017


https://bugs.freedesktop.org/show_bug.cgi?id=102269

            Bug ID: 102269
           Summary: BXT: i915: GPU HANG followed by full crash after
                    suspend
           Product: DRI
           Version: unspecified
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: major
          Priority: medium
         Component: DRM/Intel
          Assignee: intel-gfx-bugs at lists.freedesktop.org
          Reporter: fei.yang at intel.com
        QA Contact: intel-gfx-bugs at lists.freedesktop.org
                CC: intel-gfx-bugs at lists.freedesktop.org

On the Joule compute module, running
# echo freeze > /sys/power/state
causes the device to crash shortly after it wakes from suspend.
Kernel version: 4.9.27-intel-pk-standard
xf86-video-intel version: xf86-video-intel/2_2.99.917
Crash dump:
[ 84.844432] [drm] GPU HANG: ecode 9:0:0xfffffffe, reason: Hang on render ring,
action: reset
[ 84.853874] [drm] GPU hangs can indicate a bug anywhere in the entire gfx
stack, including userspace.
[ 84.864454] [drm] Please file a new bug report on bugs.freedesktop.org against
DRI -> DRM/Intel
[ 84.874444] [drm] drm/i915 developers can then reassign to the right component
if it's not a kernel issue.
[ 84.885309] [drm] The gpu crash dump is required to analyze gpu hangs, so
please always attach it.
[ 84.895397] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 84.902889] drm/i915: Resetting chip after gpu hang
[ 84.910547] BUG: unable to handle kernel NULL pointer dereference at
0000000000000070
[ 84.919385] IP: [<ffffffffa022e4c2>] reset_common_ring+0xa2/0x130 [i915]
[ 84.926941] PGD 0 [ 84.928998] 
[ 84.930670] Oops: 0000 [#1] PREEMPT SMP
[ 84.934971] Modules linked in: intel_ipu4_isys_mod_bxtB0(O) videobuf2_v4l2
videobuf2_core intel_ipu4_psys_mod_bxtB0(O) intel_ipu4_mmu_bxtB0(O)
intel_ipu4_mod_bxtB0(O) iova intel_ipu4_acpi(O)
videobuf2_dma_contig videobuf2_memops videobuf_core dw9714(O)
crlmodule(O) v4l2_common videodev media rfcomm usb_f_mtp usb_f_ecm
u_ether usb_f_acm u_serial libcomposite configfs snd_soc_wm8998
extcon_arizona snd_soc_arizona arizona_micsupp extcon_core
snd_soc_core snd_compress ac97_bus arizona_ldo1 gpio_arizona
iptable_nat nf_nat_ipv4 nf_nat bnep iptable_mangle snd_hda_codec_hdmi
mei_spd gpio_keys intel_rapl x86_pkg_temp_thermal
intel_powerclamp coretemp efivars clk_wcove typec_wcove arc4 gpio_wcove
iwlmvm(O) mac80211(O) pwm_lpss_pci pwm_lpss btusb btrtl
btbcm iwlwifi(O) spi_pxa2xx_platform snd_hda_intel cfg80211(O)
snd_hda_codec snd_hda_core compat(O) snd_pcm i915 fdp_i2c
fdp i2c_designware_platform i2c_designware_core nci mei_me snd_timer
processor_thermal_device dwc3_pci nfc mei intel_soc_dts_iosf
at24 bq25890_charger atmel_mxt_ts nvmem_core hci_uart btintel
int3400_thermal acpi_thermal_rel video int3403_thermal int340x_thermal_zone
soc_button_array nf_conntrack_ipv6 nf_defrag_ipv6
ip6table_filter ip6_tables nf_conntrack_ipv4 nf_defrag_ipv4
xt_tcpudp xt_conntrack nf_conntrack iptable_filter ip_tables x_tables uio
arizona_i2c 5xx_comms_leds(O)
[ 85.068404] CPU: 0 PID: 472 Comm: kworker/0:2 Tainted: G O
4.9.27-intel-pk-standard #1
[ 85.078561] Hardware name: Intel Corp. 570x DVT2/SDS, BIOS
GTPP1H3A.X64.0143.B30.1706022158 06/02/2017
[ 85.089030] Workqueue: events_long i915_hangcheck_elapsed [i915]
[ 85.095779] task: ffff880179cba340 task.stack: ffffc900007f8000
[ 85.102424] RIP: 0010:[<ffffffffa022e4c2>] [<ffffffffa022e4c2>]
reset_common_ring+0xa2/0x130 [i915]
[ 85.112706] RSP: 0018:ffffc900007fbb30 EFLAGS: 00010246
[ 85.118668] RAX: 0000000000000000 RBX: ffff88016aaa8500 RCX: 0000000080000006
[ 85.126674] RDX: 0000000000003fd8 RSI: ffff880178d06000 RDI: ffff88017a3a0200
[ 85.134681] RBP: ffffc900007fbb48 R08: 0000000000000017 R09: ffffc90010001000
[ 85.142692] R10: 0000000000000000 R11: ffff88017b161800 R12: ffff880179c9a000
[ 85.150704] R13: 0000000000000000 R14: ffffffff819130f0 R15: ffff880179c9a000
[ 85.158718] FS: 0000000000000000(0000) GS:ffff88017fc00000(0000)
knlGS:0000000000000000
[ 85.167805] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 85.174274] CR2: 0000000000000070 CR3: 0000000002e07000 CR4: 00000000003406f0
[ 85.182303] Stack:
[ 85.184555] ffff880178932800 ffff880179ac27d8 ffff88016aaa8500
ffffc900007fbbc0
[ 85.192887] ffffffffa021a9b8 ffff880179ac06e8 0000000000000286
0000000000000286
[ 85.201219] ffffffffa0238a78 ffff880179ac0000 0000000000000000
ffff880179ac0000
[ 85.209551] Call Trace:
[ 85.212311] [<ffffffffa021a9b8>] i915_gem_reset+0x148/0x3b0 [i915]
[ 85.219364] [<ffffffffa0238a78>] ? intel_uncore_forcewake_put+0x48/0x60 [i915]
[ 85.227575] [<ffffffff819130f0>] ? bit_wait_io_timeout+0x70/0x70
[ 85.234425] [<ffffffffa01dd29c>] i915_reset+0xdc/0x170 [i915]
[ 85.240978] [<ffffffffa01e274d>] i915_reset_and_wakeup+0x13d/0x150 [i915]
[ 85.248711] [<ffffffffa01e63b6>] i915_handle_error+0x206/0x220 [i915]
[ 85.256042] [<ffffffff8140878d>] ? scnprintf+0x3d/0x70
[ 85.261923] [<ffffffffa022ca0c>] hangcheck_declare_hang+0xcc/0xe0 [i915]
[ 85.269562] [<ffffffffa022bf64>] ? intel_engine_get_active_head+0xb4/0xe0
[i915]
[ 85.277982] [<ffffffffa022cca9>] i915_hangcheck_elapsed+0x289/0x2b0 [i915]
[ 85.285800] [<ffffffff81094dce>] process_one_work+0x1de/0x4c0
[ 85.292348] [<ffffffff810950f8>] worker_thread+0x48/0x4e0
[ 85.298506] [<ffffffff810950b0>] ? process_one_work+0x4c0/0x4c0
[ 85.305270] [<ffffffff8109a257>] kthread+0xd7/0xf0
[ 85.310745] [<ffffffff8109a180>] ? kthread_park+0x60/0x60
[ 85.316905] [<ffffffff81917652>] ret_from_fork+0x22/0x30
[ 85.322965] Code: 8b 83 80 00 00 00 c7 40 3c ff ff ff ff 48 8b bb 80 00 00 00
e8 80 36 00 00 8b 05 0a 55 0b 00 85 c0 75 71
4d 8b ac 24 58 02 00 00 <49> 8b 45 70 48 39 43 70 74 50 4d 85 ed 74 13 48 c7 c0
a0 81 58 
[ 85.344442] RIP [<ffffffffa022e4c2>] reset_common_ring+0xa2/0x130 [i915]
[ 85.352081] RSP <ffffc900007fbb30>
[ 85.355984] CR2: 0000000000000070
[ 85.373716] --[ end trace d4d4d62e81cbe6bc ]--
[ 85.382625] BUG: unable to handle kernel paging request at ffffffffffffffd8
[ 85.390467] IP: [<ffffffff8109acb1>] kthread_data+0x11/0x20
[ 85.396736] PGD 2e08067 [ 85.399378] PUD 2e0a067 
PMD 0 [ 85.402813] 
[ 85.404484] Oops: 0000 [#2] PREEMPT SMP
[ 85.408786] Modules linked in: intel_ipu4_isys_mod_bxtB0(O) videobuf2_v4l2
videobuf2_core intel_ipu4_psys_mod_bxtB0(O) intel_ipu4_mmu_bxtB0(O)
intel_ipu4_mod_bxtB0(O) iova intel_ipu4_acpi(O)
videobuf2_dma_contig videobuf2_memops videobuf_core dw9714(O)
crlmodule(O) v4l2_common videodev media rfcomm usb_f_mtp usb_f_ecm
u_ether usb_f_acm u_serial libcomposite configfs snd_soc_wm8998
extcon_arizona snd_soc_arizona arizona_micsupp extcon_core
snd_soc_core snd_compress ac97_bus arizona_ldo1 gpio_arizona
iptable_nat nf_nat_ipv4 nf_nat bnep iptable_mangle snd_hda_codec_hdmi
mei_spd gpio_keys intel_rapl x86_pkg_temp_thermal
intel_powerclamp coretemp efivars clk_wcove typec_wcove arc4 gpio_wcove
iwlmvm(O) mac80211(O) pwm_lpss_pci pwm_lpss btusb btrtl
btbcm iwlwifi(O) spi_pxa2xx_platform snd_hda_intel cfg80211(O)
snd_hda_codec snd_hda_core compat(O) snd_pcm i915 fdp_i2c
fdp i2c_designware_platform i2c_designware_core nci mei_me snd_timer
processor_thermal_device dwc3_pci nfc mei intel_soc_dts_iosf
at24 bq25890_charger atmel_mxt_ts nvmem_core hci_uart btintel
int3400_thermal acpi_thermal_rel video int3403_thermal int340x_thermal_zone
soc_button_array nf_conntrack_ipv6 nf_defrag_ipv6
ip6table_filter ip6_tables nf_conntrack_ipv4 nf_defrag_ipv4
xt_tcpudp xt_conntrack nf_conntrack iptable_filter ip_tables x_tables uio
arizona_i2c 5xx_comms_leds(O)
[ 85.542261] CPU: 0 PID: 472 Comm: kworker/0:2 Tainted: G D O
4.9.27-intel-pk-standard #1
[ 85.552419] Hardware name: Intel Corp. 570x DVT2/SDS, BIOS
GTPP1H3A.X64.0143.B30.1706022158 06/02/2017
[ 85.562877] task: ffff880179cba340 task.stack: ffffc900007f8000
[ 85.569521] RIP: 0010:[<ffffffff8109acb1>] [<ffffffff8109acb1>]
kthread_data+0x11/0x20
[ 85.578514] RSP: 0018:ffffc900007fbe68 EFLAGS: 00010002
[ 85.584474] RAX: 0000000000000000 RBX: ffff88017fc17500 RCX: 0000000000000000
[ 85.592483] RDX: ffff88017b005000 RSI: ffff880179cba3c0 RDI: ffff880179cba340
[ 85.600492] RBP: ffffc900007fbe70 R08: 0000000000000000 R09: 0000000000000000
[ 85.608501] R10: 0000000000000000 R11: ffff880179cba3c0 R12: ffff880179cba340
[ 85.616501] R13: 0000000000000000 R14: ffff880179cba808 R15: 0000000000017500
[ 85.624511] FS: 0000000000000000(0000) GS:ffff88017fc00000(0000)
knlGS:0000000000000000
[ 85.633595] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 85.640044] CR2: 0000000000000028 CR3: 000000017abd3000 CR4: 00000000003406f0
[ 85.648054] Stack:
[ 85.650306] ffffffff8109602e ffffc900007fbec8 ffffffff819125be
ffffc900007fbee0
[ 85.658651] ffffffff8107ead6 0000000000000000 0000000000000000
ffff880179cba340
[ 85.666993] ffffc900007fbf10 ffffc900007fbb20 0000000000000000
0000000000000009
[ 85.675334] Call Trace:
[ 85.678075] [<ffffffff8109602e>] ? wq_worker_sleeping+0xe/0x80
[ 85.684721] [<ffffffff819125be>] __schedule+0x35e/0x5a0
[ 85.690683] [<ffffffff8107ead6>] ? release_task+0x2d6/0x3c0
[ 85.697035] [<ffffffff810a59e8>] do_task_dead+0x38/0x40
[ 85.702996] [<ffffffff8108033f>] do_exit+0x79f/0xb00
[ 85.708657] [<ffffffff819187d7>] rewind_stack_do_exit+0x17/0x20
[ 85.715400] Code: 80 04 00 00 48 c7 c7 a8 7d bc 81 e8 7a 0f fe ff eb ca 0f 1f
84 00 00 00 00 00 0f 1f 44 00 00 48 8b 87 68
04 00 00 55 48 89 e5 5d <48> 8b 40 d8 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44
00 00 55 
[ 85.737001] RIP [<ffffffff8109acb1>] kthread_data+0x11/0x20
[ 85.743361] RSP <ffffc900007fbe68>
[ 85.747292] CR2: ffffffffffffffd8
[ 85.751010] --[ end trace d4d4d62e81cbe6bd ]--
[ 85.756160] Fixing recursive fault but reboot is needed!
[ 85.762094] BUG: scheduling while atomic: kworker/0:2/472/0x00000003
[ 85.769194] Modules linked in: intel_ipu4_isys_mod_bxtB0(O) videobuf2_v4l2
videobuf2_core intel_ipu4_psys_mod_bxtB0(O) intel_ipu4_mmu_bxtB0(O)
intel_ipu4_mod_bxtB0(O) iova intel_ipu4_acpi(O)
videobuf2_dma_contig videobuf2_memops videobuf_core dw9714(O)
crlmodule(O) v4l2_common videodev media rfcomm usb_f_mtp usb_f_ecm
u_ether usb_f_acm u_serial libcomposite configfs snd_soc_wm8998
extcon_arizona snd_soc_arizona arizona_micsupp extcon_core
snd_soc_core snd_compress ac97_bus arizona_ldo1 gpio_arizona
iptable_nat nf_nat_ipv4 nf_nat bnep iptable_mangle snd_hda_codec_hdmi
mei_spd gpio_keys intel_rapl x86_pkg_temp_thermal
intel_powerclamp coretemp efivars clk_wcove typec_wcove arc4 gpio_wcove
iwlmvm(O) mac80211(O) pwm_lpss_pci pwm_lpss btusb btrtl
btbcm iwlwifi(O) spi_pxa2xx_platform snd_hda_intel cfg80211(O)
snd_hda_codec snd_hda_core compat(O) snd_pcm i915 fdp_i2c
fdp i2c_designware_platform i2c_designware_core nci mei_me snd_timer
processor_thermal_device dwc3_pci nfc mei intel_soc_dts_iosf
at24 bq25890_charger atmel_mxt_ts nvmem_core hci_uart btintel
int3400_thermal acpi_thermal_rel video int3403_thermal int340x_thermal_zone
soc_button_array nf_conntrack_ipv6 nf_defrag_ipv6
ip6table_filter ip6_tables nf_conntrack_ipv4 nf_defrag_ipv4
xt_tcpudp xt_conntrack nf_conntrack iptable_filter ip_tables x_tables uio
arizona_i2c 5xx_comms_leds(O)
[ 85.901747] CPU: 0 PID: 472 Comm: kworker/0:2 Tainted: G D O
4.9.27-intel-pk-standard #1
[ 85.911860] Hardware name: Intel Corp. 570x DVT2/SDS, BIOS
GTPP1H3A.X64.0143.B30.1706022158 06/02/2017
[ 85.922271] ffffc900007fbe60 ffffffff813fc6da ffff88017fc17500
ffff880179cba340
[ 85.930552] ffffc900007fbe70 ffffffff810a0cbf ffffc900007fbec8
ffffffff81912654
[ 85.938830] ffffc900007fbee0 ffffffff81149e01 0000000000000008
ffffc900007fbef0
[ 85.947120] Call Trace:
[ 85.949851] [<ffffffff813fc6da>] dump_stack+0x4d/0x63
[ 85.955592] [<ffffffff810a0cbf>] __schedule_bug+0x4f/0x70
[ 85.961721] [<ffffffff81912654>] __schedule+0x3f4/0x5a0
[ 85.967657] [<ffffffff81149e01>] ? printk+0x48/0x50
[ 85.973203] [<ffffffff8191283d>] schedule+0x3d/0x90
[ 85.978747] [<ffffffff810804da>] do_exit+0x93a/0xb00
[ 85.984388] [<ffffffff819187d7>] rewind_stack_do_exit+0x17/0x20
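
One possible reading of the first oops, offered as a sketch rather than a confirmed diagnosis: the bytes marked <49> in the first Code: line are 49 8b 45 70, which appear to decode as a 64-bit MOV that loads RAX from [R13 + 0x70]. The register dump shows R13 = 0, so the effective address would be 0x70, which matches CR2. The Python below decodes only that one instruction form and checks the arithmetic. The helper name decode_mov_disp8 and the constant names are illustrative and not part of any kernel tool.

```python
# Values copied from the first oops in the log above.
CR2 = 0x0000000000000070  # faulting address reported by the kernel
R13 = 0x0000000000000000  # R13 from the register dump
FAULTING_BYTES = bytes.fromhex("498b4570")  # bytes starting at <49> in "Code:"

REG_NAMES = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
             "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def decode_mov_disp8(insn):
    """Decode the narrow form 'REX.W 8B /r' with mod=01 (base + 8-bit disp).

    Hypothetical helper for illustration; it rejects every other encoding.
    Returns (destination register, base register, displacement).
    """
    rex, opcode, modrm, disp8 = insn
    if rex & 0xF8 != 0x48:
        raise ValueError("expected a REX prefix with W set")
    if opcode != 0x8B:
        raise ValueError("expected opcode 8B (MOV r64, r/m64)")
    if modrm >> 6 != 0b01:
        raise ValueError("expected mod=01 (8-bit displacement)")
    rm = modrm & 0b111
    if rm == 0b100:
        raise ValueError("SIB addressing is not handled in this sketch")
    reg = ((modrm >> 3) & 0b111) | ((rex & 0x04) << 1)  # REX.R extends reg
    base = rm | ((rex & 0x01) << 3)                     # REX.B extends r/m
    disp = disp8 - 256 if disp8 >= 128 else disp8       # sign-extend disp8
    return REG_NAMES[reg], REG_NAMES[base], disp


dest, base, disp = decode_mov_disp8(FAULTING_BYTES)
effective_address = R13 + disp  # base register is r13 for these bytes
print(f"{dest} <- [{base} + {disp:#x}], effective address {effective_address:#x}")
```

If this reading is right, it shows only that the pointer held in R13 was NULL when reset_common_ring read a field at offset 0x70. It does not identify which structure or field that is.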
