[Bug 101237] New: [SKL] GPU HANG: ecode 9:0:0x85dffffb, reason: Hang on rcs, action: reset

bugzilla-daemon at freedesktop.org
Tue May 30 18:27:29 UTC 2017


https://bugs.freedesktop.org/show_bug.cgi?id=101237

            Bug ID: 101237
           Summary: [SKL] GPU HANG: ecode 9:0:0x85dffffb, reason: Hang on
                    rcs, action: reset
           Product: DRI
           Version: XOrg git
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: major
          Priority: medium
         Component: DRM/Intel
          Assignee: intel-gfx-bugs at lists.freedesktop.org
          Reporter: mbroemme at libmpq.org
        QA Contact: intel-gfx-bugs at lists.freedesktop.org
                CC: intel-gfx-bugs at lists.freedesktop.org

Created attachment 131576
  --> https://bugs.freedesktop.org/attachment.cgi?id=131576&action=edit
i915 crash dump

On Linux 4.12-rc3 with GVT-g enabled and an MDEV device attached to a VM, I
got the following GPU hang while running an application that uses VA-API to
play H.264 video:

[71153.016783] [drm] GPU HANG: ecode 9:0:0x85dffffb, reason: Hang on rcs,
action: reset
[71153.016784] [drm] GPU hangs can indicate a bug anywhere in the entire gfx
stack, including userspace.
[71153.016784] [drm] Please file a _new_ bug report on bugs.freedesktop.org
against DRI -> DRM/Intel
[71153.016785] [drm] drm/i915 developers can then reassign to the right
component if it's not a kernel issue.
[71153.016785] [drm] The gpu crash dump is required to analyze gpu hangs, so
please always attach it.
[71153.016786] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[71154.896616] asynchronous wait on fence i915:systemd-logind[482]/0:de4af
timed out
[71154.896625] asynchronous wait on fence i915:systemd-logind[482]/0:de4af
timed out
[71154.903792] drm/i915: Resetting chip after gpu hang
[71154.904718] [drm] RC6 on
[71156.927851] WARN_ON_ONCE(!(offset >= 0 && offset <
gvt->device_info.mmio_size))
[71156.927880] ------------[ cut here ]------------
[71156.927897] WARNING: CPU: 3 PID: 1199 at drivers/gpu/drm/i915/gvt/mmio.c:293
intel_vgpu_emulate_mmio_write+0x606/0x630 [i915]
[71156.927897] Modules linked in: vhost_net vhost tap tun ebtable_filter
ebtables ip6table_filter ip6_tables md4 nls_utf8 cifs dns_resolver fscache ctr
ccm bonding bridge stp llc ipheth iptable_filter ipt_MASQUERADE
nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
nf_nat nf_conntrack joydev iTCO_wdt iTCO_vendor_support arc4 snd_hda_codec_hdmi
snd_hda_codec_realtek snd_hda_codec_generic iwlmvm mac80211 ext4 jbd2 fscrypto
mbcache snd_soc_skl snd_soc_skl_ipc iwlwifi snd_soc_sst_ipc snd_soc_sst_dsp
snd_hda_ext_core snd_soc_sst_match snd_soc_core snd_compress rtsx_pci_ms
intel_rapl snd_pcm_dmaengine cfg80211 x86_pkg_temp_thermal memstick
intel_powerclamp ac97_bus coretemp kvm_intel intel_cstate intel_rapl_perf
pcspkr psmouse mousedev evdev input_leds mac_hid e1000e i2c_i801
[71156.927925]  ptp pps_core snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep
snd_pcm shpchp snd_timer mei_me mei intel_pch_thermal uvcvideo btusb btrtl
videobuf2_vmalloc btbcm videobuf2_memops videobuf2_v4l2 btintel videobuf2_core
bluetooth videodev media ecdh_generic crc16 kvmgt vfio_mdev mdev
vfio_iommu_type1 vfio kvm irqbypass i915 cdc_ether option usbnet usb_wwan
usbserial mii thinkpad_acpi nvram wmi snd drm_kms_helper soundcore rfkill
led_class battery drm ac intel_gtt syscopyarea sysfillrect sysimgblt
fb_sys_fops i2c_algo_bit video button thermal tpm_tis tpm_tis_core tpm
sch_fq_codel ip_tables x_tables xfs libcrc32c crc32c_generic algif_skcipher
af_alg hid_generic usbhid hid dm_crypt dm_mod dax sd_mod rtsx_pci_sdmmc
mmc_core serio_raw atkbd libps2 crct10dif_pclmul crc32_pclmul crc32c_intel
[71156.927958]  ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd
glue_helper cryptd ahci libahci xhci_pci libata xhci_hcd rtsx_pci scsi_mod
usbcore usb_common i8042 serio
[71156.927967] CPU: 3 PID: 1199 Comm: CPU 1/KVM Tainted: G        W      
4.12.0-rc3-mainline #1
[71156.927967] Hardware name: LENOVO 20F6007RGE/20F6007RGE, BIOS R02ET48W (1.21
) 06/01/2016
[71156.927968] task: ffff8801102e5880 task.stack: ffffc90003d98000
[71156.927979] RIP: 0010:intel_vgpu_emulate_mmio_write+0x606/0x630 [i915]
[71156.927980] RSP: 0018:ffffc90003d9b940 EFLAGS: 00010286
[71156.927981] RAX: 0000000000000043 RBX: 0000000000000008 RCX:
ffffffff81a55a08
[71156.927982] RDX: 0000000000000000 RSI: 0000000000000082 RDI:
0000000000000247
[71156.927983] RBP: ffffc90003d9b998 R08: 0000000000000043 R09:
0000000000008165
[71156.927983] R10: ffffc90003d9b920 R11: 0000000000000000 R12:
ffff880235eec000
[71156.927984] R13: ffff8801102d2810 R14: ffffc900012cd000 R15:
ffffc9000a8fb010
[71156.927985] FS:  00007fe0781ff700(0000) GS:ffff880242580000(0000)
knlGS:00000024879ab000
[71156.927986] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[71156.927986] CR2: ffffd10de9b8e000 CR3: 00000001100f2000 CR4:
00000000003426e0
[71156.927987] Call Trace:
[71156.927992]  ? __check_object_size+0xae/0x18b
[71156.927994]  kvmgt_page_track_write+0x64/0x70 [kvmgt]
[71156.928001]  kvm_page_track_write+0x7e/0xb0 [kvm]
[71156.928008]  emulator_write_phys+0x3b/0x50 [kvm]
[71156.928015]  write_emulate+0xe/0x10 [kvm]
[71156.928021]  emulator_read_write_onepage+0x19f/0x320 [kvm]
[71156.928027]  emulator_read_write+0xcd/0x180 [kvm]
[71156.928033]  emulator_write_emulated+0x15/0x20 [kvm]
[71156.928040]  segmented_write+0x59/0x80 [kvm]
[71156.928046]  writeback+0x12d/0x210 [kvm]
[71156.928052]  x86_emulate_insn+0x82a/0xf10 [kvm]
[71156.928058]  ? x86_decode_insn+0x50a/0x1270 [kvm]
[71156.928064]  x86_emulate_instruction+0x1df/0x720 [kvm]
[71156.928072]  kvm_mmu_page_fault+0xa2/0x120 [kvm]
[71156.928075]  handle_ept_violation+0x9e/0x150 [kvm_intel]
[71156.928077]  vmx_handle_exit+0xad/0x1420 [kvm_intel]
[71156.928084]  ? apic_set_eoi+0xbc/0x200 [kvm]
[71156.928090]  ? kvm_lapic_sync_from_vapic+0xcd/0x190 [kvm]
[71156.928097]  kvm_arch_vcpu_ioctl_run+0x8e2/0x1690 [kvm]
[71156.928103]  ? kvm_arch_vcpu_load+0x62/0x240 [kvm]
[71156.928109]  kvm_vcpu_ioctl+0x339/0x630 [kvm]
[71156.928114]  ? kvm_vcpu_ioctl+0x339/0x630 [kvm]
[71156.928117]  ? vfio_device_fops_read+0x24/0x30 [vfio]
[71156.928119]  do_vfs_ioctl+0xa3/0x5f0
[71156.928121]  ? __fget+0x6e/0x90
[71156.928122]  SyS_ioctl+0x79/0x90
[71156.928124]  entry_SYSCALL_64_fastpath+0x1a/0xa5
[71156.928125] RIP: 0033:0x7fe084c340d7
[71156.928126] RSP: 002b:00007fe0781fe8e8 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[71156.928127] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX:
00007fe084c340d7
[71156.928128] RDX: 0000000000000000 RSI: 000000000000ae80 RDI:
0000000000000011
[71156.928128] RBP: 00007fe078234000 R08: 0000558fd2aa7030 R09:
00000000000000ff
[71156.928129] R10: 0000000000004260 R11: 0000000000000246 R12:
0000000000000000
[71156.928129] R13: 00007fe08bc87000 R14: 0000000000000000 R15:
00007fe078234000
[71156.928131] Code: 8b 54 05 fc 89 54 05 c4 e9 ba fd ff ff 48 c7 c6 90 41 6c
a0 48 c7 c7 52 a8 6a a0 4c 89 4d c0 c6 05 9f aa 05 00 01 e8 de 90 b0 e0 <0f> ff
4c 8b 4d c0 e9 e8 fe ff ff 89 d8 41 0f b7 54 05 fe 66 89 
[71156.928153] ---[ end trace 57fca0ef330078ee ]---
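
When triaging reports like this one, it can help to pull the fields out of the
GPU HANG line programmatically. The following is a minimal Python sketch, not
part of the original report. The function name parse_gpu_hang and the field
names gen, engine, and ecode are assumptions for illustration; the three
colon-separated ecode values are assumed to be the GPU generation, the engine
id, and an error hash.

```python
import re

# Matches the i915 hang summary as it appears in the dmesg output above.
_HANG_RE = re.compile(
    r"GPU HANG: ecode (\d+):(\d+):(0x[0-9a-fA-F]+), "
    r"reason: (.*?), action: (\w+)"
)


def parse_gpu_hang(text):
    """Return the fields of a 'GPU HANG' line as a dict, or None if absent.

    Field names are hypothetical; they are not taken from the kernel source.
    """
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(text.split())
    match = _HANG_RE.search(flat)
    if match is None:
        return None
    gen, engine, ecode, reason, action = match.groups()
    return {
        "gen": int(gen),
        "engine": int(engine),
        "ecode": int(ecode, 16),
        "reason": reason,
        "action": action,
    }


if __name__ == "__main__":
    sample = (
        "[71153.016783] [drm] GPU HANG: ecode 9:0:0x85dffffb, "
        "reason: Hang on rcs,\naction: reset"
    )
    print(parse_gpu_hang(sample))
```

Collapsing whitespace before matching lets the same function handle both raw
dmesg output and the line-wrapped copy quoted in this mail.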

The i915 module is loaded with enable_gvt=1, and the MDEV device was created
with type i915-GVTg_V5_2.
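
For anyone trying to reproduce this, the MDEV setup described above can be
sketched as follows. This is an illustrative Python sketch, not the reporter's
actual commands: the PCI address 0000:00:02.0 is an assumption (the usual
address of the integrated GPU), the helper names are hypothetical, and the
sysfs layout follows the kernel's mediated-device interface, where writing a
UUID to a type's "create" node instantiates a device. It assumes i915 was
already loaded with enable_gvt=1.

```python
import uuid
from pathlib import Path

# Assumed PCI address of the integrated GPU; adjust for the actual host.
ASSUMED_GPU_PCI_ADDR = "0000:00:02.0"
MDEV_TYPE = "i915-GVTg_V5_2"  # mdev type named in the report


def mdev_create_path(pci_addr=ASSUMED_GPU_PCI_ADDR, mdev_type=MDEV_TYPE):
    """Build the sysfs 'create' node path for the given mdev type."""
    return (
        Path("/sys/bus/pci/devices")
        / pci_addr
        / "mdev_supported_types"
        / mdev_type
        / "create"
    )


def create_mdev(pci_addr=ASSUMED_GPU_PCI_ADDR, mdev_type=MDEV_TYPE):
    """Write a fresh UUID to the 'create' node (needs root) and return it."""
    dev_uuid = str(uuid.uuid4())
    mdev_create_path(pci_addr, mdev_type).write_text(dev_uuid)
    return dev_uuid
```

The UUID returned by create_mdev is what the VM configuration would then
reference to attach the device.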
