[Bug 102221] [SKL] GPU HANG on rcs0 on Intel i915 on drm-tip

bugzilla-daemon at freedesktop.org
Wed Aug 16 12:27:07 UTC 2017


https://bugs.freedesktop.org/show_bug.cgi?id=102221

--- Comment #4 from Imre Deak <imre.deak at intel.com> ---
(In reply to Chris Wilson from comment #2)
> No, extra drm.debug is not required. Looks like it didn't send the final
> context-switch interrupt.
> 
> Reasonable suspicion laid on

This looks like a known DMC firmware bug, where toggling the DC6 enabled
state can corrupt registers backed by the DC6 power context. An internal bug
ticket is open for this; I'm planning to give the firmware team more debug
info and convince them to fix it.

One register that can get corrupted is GEN8_MASTER_IRQ, which leaves all
i915 interrupts disabled. That would also explain the missing context-switch
interrupt.
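
To check for this symptom on an affected machine, one option is to dump
GEN8_MASTER_IRQ while the GPU appears hung and look at the master enable bit.
The snippet below is only a minimal sketch: the register offset (0x44200) and
the enable bit (bit 31, GEN8_MASTER_IRQ_CONTROL) are taken from i915_reg.h as
I recall it and should be verified against your kernel tree. The helper names
are hypothetical, and how the raw value is read from the hardware is left out.

```python
# Assumed from i915_reg.h; verify against your kernel tree.
GEN8_MASTER_IRQ = 0x44200
GEN8_MASTER_IRQ_CONTROL = 1 << 31  # master interrupt enable bit


def master_irq_enabled(reg_value: int) -> bool:
    """Return True if the master interrupt enable bit is set."""
    return bool(reg_value & GEN8_MASTER_IRQ_CONTROL)


def describe(reg_value: int) -> str:
    """Format a raw GEN8_MASTER_IRQ value for a bug report."""
    state = "enabled" if master_irq_enabled(reg_value) else "DISABLED"
    return (f"GEN8_MASTER_IRQ (0x{GEN8_MASTER_IRQ:05x}) = "
            f"0x{reg_value:08x}: master interrupt {state}")
```

Seeing the enable bit clear while the driver is otherwise idle does not prove
corruption on its own, since the interrupt handler may clear this bit briefly
while it runs. A value that stays clear across repeated reads would be
consistent with the corruption described above.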

> 
> Aug 14 18:51:41 precision kernel: [ 7196.525920] DC6 already programmed to
> be enabled.
> Aug 14 18:51:41 precision kernel: [ 7196.525947] ------------[ cut here
> ]------------
> Aug 14 18:51:41 precision kernel: [ 7196.525981] WARNING: CPU: 6 PID: 13635
> at drivers/gpu/drm/i915/intel_runtime_pm.c:606 skl_enable_dc6+0x9f/0xb0
> [i915]
> Aug 14 18:51:41 precision kernel: [ 7196.525981] Modules linked in:
> snd_usb_toneport snd_usb_line6 rfcomm ccm cmac bnep hid_multitouch
> snd_hda_codec_hdmi nls_iso8859_1 dell_rbtn dell_laptop snd_hda_codec_realtek
> snd_hda_codec_generic intel_rapl x86_pkg_temp_thermal intel_powerclamp
> coretemp kvm_intel kvm irqbypass joydev dell_wmi dell_smbios serio_raw
> dcdbas wmi_bmof snd_hda_intel snd_hda_codec snd_hda_core snd_usb_audio
> snd_usbmidi_lib snd_hwdep snd_pcm iwlmvm snd_seq_midi snd_seq_midi_event
> thunderbolt nvmem_core snd_rawmidi mac80211 rtsx_pci_ms memstick snd_seq
> uvcvideo videobuf2_vmalloc snd_seq_device videobuf2_memops snd_timer
> videobuf2_v4l2 input_leds videobuf2_core videodev snd media usblp soundcore
> iwlwifi mei_me btusb mei idma64 btrtl intel_pch_thermal intel_lpss_pci
> processor_thermal_device intel_soc_dts_iosf shpchp ie31200_edac
> Aug 14 18:51:41 precision kernel: [ 7196.526000]  hci_uart btbcm serdev
> btqca int3403_thermal btintel bluetooth ecdh_generic intel_lpss_acpi
> dell_smo8800 intel_lpss int3402_thermal int340x_thermal_zone int3400_thermal
> mac_hid acpi_pad acpi_thermal_rel intel_hid parport_pc ppdev lp parport
> efivarfs autofs4 btrfs xor raid6_pq algif_skcipher af_alg dm_crypt dm_mirror
> dm_region_hash dm_log rtsx_pci_sdmmc mmc_core crct10dif_pclmul crc32_pclmul
> crc32c_intel ghash_clmulni_intel pcbc nouveau i915 aesni_intel aes_x86_64
> crypto_simd glue_helper cryptd firewire_ohci psmouse mxm_wmi ttm
> firewire_core prime_numbers i2c_algo_bit crc_itu_t drm_kms_helper
> syscopyarea sysfillrect nvme sysimgblt fb_sys_fops nvme_core rtsx_pci drm
> i2c_hid wmi pinctrl_sunrisepoint pinctrl_intel [last unloaded: snd_usb_line6]
> Aug 14 18:51:41 precision kernel: [ 7196.526060] CPU: 6 PID: 13635 Comm:
> kworker/u16:1 Tainted: G        W       4.13.0-rc4+ #4
> Aug 14 18:51:41 precision kernel: [ 7196.526061] Hardware name: Dell Inc.
> Precision 5510/08R8KJ, BIOS 1.2.29 07/24/2017
> Aug 14 18:51:41 precision kernel: [ 7196.526078] Workqueue: i915-dp
> i915_digport_work_func [i915]
> Aug 14 18:51:41 precision kernel: [ 7196.526079] task: ffff8a6a6db55d00
> task.stack: ffffb595c72ec000
> Aug 14 18:51:41 precision kernel: [ 7196.526089] RIP:
> 0010:skl_enable_dc6+0x9f/0xb0 [i915]
> Aug 14 18:51:41 precision kernel: [ 7196.526090] RSP: 0018:ffffb595c72efd48
> EFLAGS: 00010286
> Aug 14 18:51:41 precision kernel: [ 7196.526091] RAX: 0000000000000025 RBX:
> ffff8a6b938c8000 RCX: 0000000000000000
> Aug 14 18:51:41 precision kernel: [ 7196.526091] RDX: 0000000000000000 RSI:
> ffff8a6bbdd8cc38 RDI: ffff8a6bbdd8cc38
> Aug 14 18:51:41 precision kernel: [ 7196.526092] RBP: ffffb595c72efd50 R08:
> 00000000000029f0 R09: 0000000000000004
> Aug 14 18:51:41 precision kernel: [ 7196.526092] R10: 0000000000000040 R11:
> 0000000000000001 R12: ffff8a6b938c8000
> Aug 14 18:51:41 precision kernel: [ 7196.526093] R13: ffff8a6b938ccbc0 R14:
> ffffffffc04613f8 R15: 0000000020000000
> Aug 14 18:51:41 precision kernel: [ 7196.526093] FS:  0000000000000000(0000)
> GS:ffff8a6bbdd80000(0000) knlGS:0000000000000000
> Aug 14 18:51:41 precision kernel: [ 7196.526094] CS:  0010 DS: 0000 ES: 0000
> CR0: 0000000080050033
> Aug 14 18:51:41 precision kernel: [ 7196.526094] CR2: 000000000337b010 CR3:
> 000000073a60a000 CR4: 00000000003406e0
> Aug 14 18:51:41 precision kernel: [ 7196.526095] DR0: 0000000000000000 DR1:
> 0000000000000000 DR2: 0000000000000000
> Aug 14 18:51:41 precision kernel: [ 7196.526095] DR3: 0000000000000000 DR6:
> 00000000fffe0ff0 DR7: 0000000000000400
> Aug 14 18:51:41 precision kernel: [ 7196.526096] Call Trace:
> Aug 14 18:51:41 precision kernel: [ 7196.526106] 
> gen9_dc_off_power_well_disable+0x24/0x30 [i915]
> Aug 14 18:51:41 precision kernel: [ 7196.526135] 
> intel_power_well_disable+0x39/0x40 [i915]
> Aug 14 18:51:41 precision kernel: [ 7196.526144] 
> intel_display_power_put+0xad/0x110 [i915]
> Aug 14 18:51:41 precision kernel: [ 7196.526159] 
> intel_dp_hpd_pulse+0x15e/0x300 [i915]
> Aug 14 18:51:41 precision kernel: [ 7196.526172] 
> i915_digport_work_func+0x85/0xf0 [i915]
> Aug 14 18:51:41 precision kernel: [ 7196.526174] 
> process_one_work+0x1d6/0x3d0
> Aug 14 18:51:41 precision kernel: [ 7196.526175]  worker_thread+0x42/0x3e0
> Aug 14 18:51:41 precision kernel: [ 7196.526177]  kthread+0x11f/0x140
> Aug 14 18:51:41 precision kernel: [ 7196.526178]  ?
> trace_event_raw_event_workqueue_execute_start+0xb0/0xb0
> Aug 14 18:51:41 precision kernel: [ 7196.526179]  ?
> kthread_create_on_node+0x60/0x60
> Aug 14 18:51:41 precision kernel: [ 7196.526181]  ret_from_fork+0x22/0x30
> Aug 14 18:51:41 precision kernel: [ 7196.526182] Code: 05 35 1b 13 00 01 e8
> 3d 05 56 f8 0f ff eb 99 80 3d 24 1b 13 00 00 75 a7 48 c7 c7 00 11 46 c0 c6
> 05 14 1b 13 00 01 e8 1d 05 56 f8 <0f> ff eb 90 0f 1f 00 66 2e 0f 1f 84 00 00
> 00 00 00 48 83 bf 40
> Aug 14 18:51:41 precision kernel: [ 7196.526196] ---[ end trace
> db26e1435af3d97b ]---
> Aug 14 18:51:41 precision kernel: [ 7196.526476] [drm:gen9_set_dc_state
> [i915]] *ERROR* DC state mismatch (0x0 -> 0x2)
> Aug 14 18:52:41 precision kernel: [ 7255.928570] [drm:gen9_set_dc_state
> [i915]] *ERROR* DC state mismatch (0x0 -> 0x2)
> Aug 14 18:53:37 precision kernel: [ 7312.530599] [drm:gen9_set_dc_state
> [i915]] *ERROR* DC state mismatch (0x0 -> 0x2)
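
For triaging similar reports, the "DC state mismatch" lines can be pulled out
of a quoted log and decoded. This is a rough sketch under two assumptions:
that the first value is the state the driver last requested and the second is
what it read back from DC_STATE_EN (my reading of gen9_set_dc_state()), and
that 0x1 and 0x2 mean "up to DC5" and "up to DC6" as in i915_reg.h. The
function names are hypothetical.

```python
import re

# Assumed DC_STATE_EN encodings; verify against i915_reg.h.
DC_STATES = {0x0: "DC off", 0x1: "up to DC5", 0x2: "up to DC6"}

MISMATCH_RE = re.compile(
    r"DC state mismatch \(0x([0-9a-fA-F]+) -> 0x([0-9a-fA-F]+)\)")


def parse_mismatches(log_text):
    """Return (expected, actual) pairs from a possibly mail-quoted log."""
    # Drop the "> " quote prefix and undo the mail client's line wrapping.
    lines = (line.lstrip("> ").strip() for line in log_text.splitlines())
    unwrapped = " ".join(lines)
    return [(int(a, 16), int(b, 16))
            for a, b in MISMATCH_RE.findall(unwrapped)]


def describe_mismatch(expected, actual):
    """Render one mismatch pair in words, falling back to hex."""
    exp = DC_STATES.get(expected, hex(expected))
    act = DC_STATES.get(actual, hex(actual))
    return f"driver expected '{exp}', hardware reported '{act}'"
```

Under those assumptions, the (0x0 -> 0x2) pairs in the log above would mean
the driver believed DC states were off while the hardware reported DC6 as
enabled, which matches the "DC6 already programmed to be enabled" warning.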
