[Bug 104980] New: NULL pointer in drm_dp_mst_wait_tx_reply / hotplugging via DP MST hub causes oops

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Tue Feb 6 23:17:43 UTC 2018


https://bugs.freedesktop.org/show_bug.cgi?id=104980

            Bug ID: 104980
           Summary: NULL pointer in drm_dp_mst_wait_tx_reply / hotplugging
                    via DP MST hub causes oops
           Product: DRI
           Version: unspecified
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: major
          Priority: medium
         Component: DRM/Intel
          Assignee: intel-gfx-bugs at lists.freedesktop.org
          Reporter: a.nielsen at shikadi.net
        QA Contact: intel-gfx-bugs at lists.freedesktop.org
                CC: intel-gfx-bugs at lists.freedesktop.org

Steps to reproduce:

1. Connect DisplayPort MST hub to Intel NUC
2. Connect DisplayPort monitors to MST hub
3. Activate displays
4. Remove displays (power cycling them is good enough, but removing and
reconnecting the DisplayPort cable also seems to work)
5. When displays are powered on again/reconnected, there is no signal, but any
non-MST-connected monitors are still usable
6. Power cycling the displays a second time causes a kernel oops
7. MST monitors still have no signal, and non-MST monitors freeze (they show a
picture but it never updates, the mouse cursor doesn't move, etc.)
8. SSHing into the machine is still possible, but rebooting or shutting down
never finishes, so the machine must be power cycled.

This can be reproduced 100% of the time.  Note that power cycling here means
switching the monitors off at the mains; using the monitors' soft-power buttons
doesn't seem to cause a problem.
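
To record what the kernel believes about each connector between the steps
above, something like the following could be run over SSH.  This is only a
sketch: it assumes the usual /sys/class/drm/card*-*/status sysfs files are
present, and the function name connector_status is made up for illustration.
Reading these files should only report the last known state, so it is not
expected to trigger a new probe by itself.

```python
from pathlib import Path


def connector_status(sysfs_root="/sys/class/drm"):
    """Return {connector_name: status} for every DRM connector under sysfs_root."""
    result = {}
    # Connector directories look like "card0-DP-1"; each holds a "status" file
    # containing "connected", "disconnected" or "unknown".
    for status_file in sorted(Path(sysfs_root).glob("card*-*/status")):
        name = status_file.parent.name
        try:
            result[name] = status_file.read_text().strip()
        except OSError as exc:
            # Keep going if one connector cannot be read.
            result[name] = f"unreadable ({exc.strerror})"
    return result


if __name__ == "__main__":
    for name, status in connector_status().items():
        print(f"{name}: {status}")
```

Running it once after step 3, once after step 5 and once after step 6 would
show whether the MST connectors ever return to "connected".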

I upgraded to kernel 4.14.14 but the issue persists.  The system is an Intel
NUC5i3RYK.  I have only tested with Lenovo LT2452p monitors.

Please advise if you need any further info.  I am assuming that if you have
access to a DisplayPort MST hub then you will be able to reproduce the issue
pretty easily by experimenting with hotplugging an active DisplayPort monitor.

Judging by the first call trace, the failure seems to occur while querying the
EDID info from the monitor.

Here is dmesg after the failure:

[  547.671668] BUG: unable to handle kernel NULL pointer dereference at 0000000000000320
[  547.671682] IP: mutex_lock+0x10/0x20
[  547.671684] PGD 0 P4D 0
[  547.671689] Oops: 0002 [#1] PREEMPT SMP PTI
[  547.671692] Modules linked in: cmac md4 xt_nat xt_tcpudp veth xfs
ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xfrm_user
xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype
iptable_filter xt_conntrack nf_nat nf_conntrack br_netfilter bridge stp llc
dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c crc32c_generic
loop dm_mod nls_utf8 cifs ccm dns_resolver fscache arc4 nct6775 hwmon_vid
iwlmvm snd_hda_codec_hdmi intel_rapl x86_pkg_temp_thermal intel_powerclamp
mac80211 coretemp nls_iso8859_1 nls_cp437 vfat fat kvm_intel iwlwifi i915 kvm
irqbypass iTCO_wdt iTCO_vendor_support i2c_algo_bit drm_kms_helper
crct10dif_pclmul evdev crc32_pclmul ghash_clmulni_intel mac_hid pcbc
snd_hda_codec_realtek drm cfg80211 snd_hda_codec_generic aesni_intel          
[  547.671754]  aes_x86_64 e1000e crypto_simd snd_hda_intel glue_helper cryptd
intel_cstate intel_rapl_perf mei_me snd_soc_ssm4567 snd_hda_codec pcspkr
snd_soc_rt5640 intel_gtt snd_soc_rl6231 agpgart i2c_i801 tpm_tis shpchp lpc_ich
syscopyarea ptp ir_rc6_decoder mei tpm_tis_core sysfillrect snd_hda_core
pps_core sysimgblt fb_sys_fops thermal fan tpm btusb snd_hwdep rc_rc6_mce
snd_soc_core btrtl ir_lirc_codec lirc_dev btbcm btintel nuvoton_cir battery
snd_compress rc_core snd_pcm_dmaengine snd_soc_sst_acpi snd_pcm video
snd_soc_sst_match elan_i2c bluetooth snd_timer i2c_hid acpi_als 8250_dw snd
kfifo_buf button soundcore industrialio ecdh_generic hid rfkill ac97_bus
spi_pxa2xx_platform acpi_pad nfsd auth_rpcgss oid_registry nfs_acl lockd grace
sunrpc sch_fq_codel ip_tables x_tables ext4 crc16 mbcache              
[  547.671824]  jbd2 fscrypto sd_mod ahci libahci ehci_pci xhci_pci ehci_hcd
libata xhci_hcd crc32c_intel scsi_mod usbcore usb_common sdhci_acpi sdhci serio
led_class mmc_core         
[  547.671845] CPU: 1 PID: 475 Comm: Xorg Not tainted 4.14.14-1-ARCH #1
[  547.671847] Hardware name:                  /NUC5i3RYB, BIOS RYBDWi35.86A.0361.2016.1202.1005 12/02/2016
[  547.671849] task: ffff9bcb7fa16740 task.stack: ffffaacfc178c000
[  547.671854] RIP: 0010:mutex_lock+0x10/0x20
[  547.671856] RSP: 0018:ffffaacfc178f8e8 EFLAGS: 00010246
[  547.671859] RAX: 0000000000000000 RBX: 00000000000004ac RCX: ffffaacfc2cffdd8
[  547.671862] RDX: ffff9bcb7fa16740 RSI: 0000000000000287 RDI: 0000000000000320
[  547.671864] RBP: ffff9bcb684ecc00 R08: ffff9bcb96c18d90 R09: 00000000000001ed
[  547.671866] R10: ffffaacfc178f8f0 R11: 00000000000000d4 R12: ffff9bcb729b48a0
[  547.671869] R13: ffff9bcb8e4afa40 R14: 00000000000004ac R15: ffffaacfc178fab0
[  547.671872] FS:  00007fc12387c940(0000) GS:ffff9bcb96c80000(0000) knlGS:0000000000000000
[  547.671875] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  547.671877] CR2: 0000000000000320 CR3: 0000000141dd6004 CR4: 00000000003606e0
[  547.671880] Call Trace:
[  547.671896]  drm_dp_mst_wait_tx_reply+0x140/0x1e0 [drm_kms_helper]
[  547.671903]  ? wait_woken+0x80/0x80
[  547.671912]  drm_dp_mst_i2c_xfer+0x1a0/0x260 [drm_kms_helper]
[  547.671918]  __i2c_transfer+0x120/0x430
[  547.671922]  i2c_transfer+0x51/0xd0
[  547.671944]  drm_do_probe_ddc_edid+0xbc/0x140 [drm]
[  547.671960]  ? drm_rgb_quant_range_selectable+0x100/0x100 [drm]
[  547.671974]  ? drm_do_get_edid+0x61/0x2c0 [drm]
[  547.671986]  ? drm_rgb_quant_range_selectable+0x100/0x100 [drm]
[  547.671998]  drm_do_get_edid+0x61/0x2c0 [drm]
[  547.672011]  drm_get_edid+0x52/0x3d0 [drm]
[  547.672021]  drm_dp_mst_get_edid+0x68/0x80 [drm_kms_helper]
[  547.672066]  intel_dp_mst_get_modes+0x29/0x50 [i915]
[  547.672079]  drm_helper_probe_single_connector_modes+0x5b0/0x770 [drm_kms_helper]
[  547.672095]  drm_mode_getconnector+0x156/0x320 [drm]
[  547.672111]  ? drm_mode_connector_property_set_ioctl+0x60/0x60 [drm]
[  547.672124]  drm_ioctl_kernel+0x5b/0xb0 [drm]
[  547.672137]  drm_ioctl+0x2d5/0x370 [drm]
[  547.672150]  ? drm_mode_connector_property_set_ioctl+0x60/0x60 [drm]
[  547.672156]  do_vfs_ioctl+0xa4/0x630
[  547.672161]  ? __sys_recvmsg+0x4e/0x90
[  547.672164]  ? __sys_recvmsg+0x7d/0x90
[  547.672168]  SyS_ioctl+0x74/0x80
[  547.672173]  entry_SYSCALL_64_fastpath+0x20/0x83
[  547.672176] RIP: 0033:0x7fc121122d27
[  547.672178] RSP: 002b:00007ffc9fc5c828 EFLAGS: 00000246
[  547.672181] Code: 17 a0 ff 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 be 02 00 00 00 e9 e1 fa ff ff 90 0f 1f 44 00 00 65 48 8b 14 25 00 5c 01 00 31 c0 <f0> 48 0f b1 17 48 85 c0 75 02 f3 c3 eb d2 66 90 0f 1f 44 00 00
[  547.672233] RIP: mutex_lock+0x10/0x20 RSP: ffffaacfc178f8e8
[  547.672234] CR2: 0000000000000320
[  547.672237] ---[ end trace 1f8e5b72c7c997de ]---
[  547.922211] BUG: unable to handle kernel NULL pointer dereference at 00000000000003a8
[  547.922223] IP: queue_work_on+0x17/0x40
[  547.922225] PGD 0 P4D 0
[  547.922230] Oops: 0002 [#2] PREEMPT SMP PTI
[  547.922233] Modules linked in: cmac md4 xt_nat xt_tcpudp veth xfs
ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xfrm_user
xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype
iptable_filter xt_conntrack nf_nat nf_conntrack br_netfilter bridge stp llc
dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c crc32c_generic
loop dm_mod nls_utf8 cifs ccm dns_resolver fscache arc4 nct6775 hwmon_vid
iwlmvm snd_hda_codec_hdmi intel_rapl x86_pkg_temp_thermal intel_powerclamp
mac80211 coretemp nls_iso8859_1 nls_cp437 vfat fat kvm_intel iwlwifi i915 kvm
irqbypass iTCO_wdt iTCO_vendor_support i2c_algo_bit drm_kms_helper
crct10dif_pclmul evdev crc32_pclmul ghash_clmulni_intel mac_hid pcbc
snd_hda_codec_realtek drm cfg80211 snd_hda_codec_generic aesni_intel
[  547.922297]  aes_x86_64 e1000e crypto_simd snd_hda_intel glue_helper cryptd
intel_cstate intel_rapl_perf mei_me snd_soc_ssm4567 snd_hda_codec pcspkr
snd_soc_rt5640 intel_gtt snd_soc_rl6231 agpgart i2c_i801 tpm_tis shpchp lpc_ich
syscopyarea ptp ir_rc6_decoder mei tpm_tis_core sysfillrect snd_hda_core
pps_core sysimgblt fb_sys_fops thermal fan tpm btusb snd_hwdep rc_rc6_mce
snd_soc_core btrtl ir_lirc_codec lirc_dev btbcm btintel nuvoton_cir battery
snd_compress rc_core snd_pcm_dmaengine snd_soc_sst_acpi snd_pcm video
snd_soc_sst_match elan_i2c bluetooth snd_timer i2c_hid acpi_als 8250_dw snd
kfifo_buf button soundcore industrialio ecdh_generic hid rfkill ac97_bus
spi_pxa2xx_platform acpi_pad nfsd auth_rpcgss oid_registry nfs_acl lockd grace
sunrpc sch_fq_codel ip_tables x_tables ext4 crc16 mbcache
[  547.922376]  jbd2 fscrypto sd_mod ahci libahci ehci_pci xhci_pci ehci_hcd
libata xhci_hcd crc32c_intel scsi_mod usbcore usb_common sdhci_acpi sdhci serio
led_class mmc_core
[  547.922396] CPU: 0 PID: 465 Comm: kworker/u8:8 Tainted: G      D         4.14.14-1-ARCH #1
[  547.922398] Hardware name:                  /NUC5i3RYB, BIOS RYBDWi35.86A.0361.2016.1202.1005 12/02/2016
[  547.922446] Workqueue: i915-dp i915_digport_work_func [i915]
[  547.922450] task: ffff9bcb81ecc9c0 task.stack: ffffaacfc1744000
[  547.922455] RIP: 0010:queue_work_on+0x17/0x40
[  547.922458] RSP: 0018:ffffaacfc1747c60 EFLAGS: 00010002
[  547.922461] RAX: 0000000000000202 RBX: 0000000000000202 RCX: 0000000000000000
[  547.922464] RDX: 00000000000003a8 RSI: ffff9bcb92007000 RDI: 0000000000000080
[  547.922466] RBP: ffffaacfc1747d20 R08: ffffffff853f3c77 R09: 0000000000000001
[  547.922469] R10: ffff9bcb6c741b88 R11: 000000000000000a R12: ffff9bcb8e4af89e
[  547.922471] R13: ffff9bcb86795000 R14: 0000000000000001 R15: ffff9bcb729b48a0
[  547.922475] FS:  0000000000000000(0000) GS:ffff9bcb96c00000(0000) knlGS:0000000000000000
[  547.922477] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  547.922480] CR2: 00000000000003a8 CR3: 000000002a00a004 CR4: 00000000003606f0
[  547.922486] Call Trace:
[  547.922501]  drm_dp_mst_handle_up_req+0x4fc/0x5b0 [drm_kms_helper]
[  547.922513]  ? drm_dp_mst_hpd_irq+0x60/0x890 [drm_kms_helper]
[  547.922521]  drm_dp_mst_hpd_irq+0x60/0x890 [drm_kms_helper]
[  547.922566]  ? intel_dp_check_mst_status+0x114/0x1f0 [i915]
[  547.922599]  intel_dp_check_mst_status+0x114/0x1f0 [i915]
[  547.922629]  intel_dp_hpd_pulse+0x19c/0x310 [i915]
[  547.922653]  i915_digport_work_func+0x86/0x110 [i915]
[  547.922658]  process_one_work+0x1e0/0x420
[  547.922661]  worker_thread+0x2b/0x3d0
[  547.922665]  ? process_one_work+0x420/0x420
[  547.922670]  kthread+0x11a/0x130
[  547.922673]  ? kthread_create_on_node+0x70/0x70
[  547.922676]  ret_from_fork+0x32/0x40
[  547.922679] Code: 85 e8 b9 44 04 00 e9 78 ff ff ff 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 53 9c 58 0f 1f 44 00 00 48 89 c3 fa 66 0f 1f 44 00 00 <f0> 0f ba 2a 00 73 10 31 c9 48 89 df 57 9d 0f 1f 44 00 00 89 c8
[  547.922717] RIP: queue_work_on+0x17/0x40 RSP: ffffaacfc1747c60
[  547.922719] CR2: 00000000000003a8
[  547.922721] ---[ end trace 1f8e5b72c7c997df ]---
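
Both faulting addresses in the log above (0x320 and 0x3a8) are small
non-zero values.  That pattern is typically what you see when the kernel
accesses a member through a NULL structure pointer, the address then being
the member's offset.  Which structure is involved cannot be determined from
the log alone.  As a rough sketch, the following pulls the faulting function
and address out of a dmesg excerpt like this one and checks that each address
falls inside the first 4096-byte page.  The helper name summarize_oopses is
made up for illustration, and the regular expressions assume the 4.14-era oops
format shown here.

```python
import re

# Patterns for the two lines of interest in a 4.14-era x86-64 oops.
FAULT_RE = re.compile(r"NULL pointer dereference at\s+([0-9a-f]{16})")
IP_RE = re.compile(r"\] IP: (\S+)")


def summarize_oopses(dmesg_text):
    """Return a list of (faulting_function, fault_address) pairs, one per oops."""
    # Tolerate mail-wrapped logs: a line that does not start with a
    # "[timestamp]" prefix is treated as a continuation of the previous line.
    joined = re.sub(r"\n(?!\[)", " ", dmesg_text)
    addresses = [int(addr, 16) for addr in FAULT_RE.findall(joined)]
    functions = IP_RE.findall(joined)
    return list(zip(functions, addresses))


sample = """[  547.671668] BUG: unable to handle kernel NULL pointer dereference at
0000000000000320
[  547.671682] IP: mutex_lock+0x10/0x20
[  547.922211] BUG: unable to handle kernel NULL pointer dereference at
00000000000003a8
[  547.922223] IP: queue_work_on+0x17/0x40
"""

for func, addr in summarize_oopses(sample):
    print(f"{func}: fault at {addr:#x}, within first page: {addr < 4096}")
```

In the first oops the address also matches RDI, the first-argument register,
which fits mutex_lock being called with a pointer computed from a NULL base.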
