[Intel-gfx] NULL pointer dereferences in drm_mode_copy() and drm_crtc_index()

Daniel Vetter daniel at ffwll.ch
Mon Jul 6 08:24:08 PDT 2015


On Fri, Jul 03, 2015 at 02:11:37PM -0400, Michael Kaminsky wrote:
> A few days ago I built a kernel from git (commit 6aaf0da872), and
> noticed a couple of NULL pointer dereferences.  These seem to be
> regressions as they aren't present in v4.1.
> 
> I did a bisect between v4.1 and 6aaf0da872, and came up with the
> following commit as the first bad one:
> 
>  d5432a9d  drm/i915: Stage new modeset state straight into atomic state
> 
> My laptop is a Thinkpad T540p.  The bug manifests itself specifically
> when I'm connected to my dock.  Starting with this commit, when I plug
> an external monitor into the dock and then unplug it, I get the NULL
> pointer dereference in drm_mode_copy (see kernel trace #1 below).  The
> bug happens during unplug.
> 
> Plugging/unplugging the same monitor directly into my laptop doesn't
> seem to tickle the bug.  It also doesn't seem to matter which connector I
> plug/unplug into on the dock (VGA, DP, etc.).
> 
> This laptop/dock uses DP MST, so I wonder if that's the problem.  An
> external VGA monitor connected directly to my laptop shows up as output
> VGA1, but when that same monitor is hooked up to the dock's VGA port, it
> shows up as output DP2-3 (for example).
> 
> That commit is the first place where things seem to go wrong, but later
> commits actually show a different, but possibly related, NULL pointer
> dereference in drm_crtc_index (see kernel trace #2 below).  In these
> kernels, I don't even get to the point where I can unplug the monitor.
> Instead, as soon as I connect two external monitors to my dock, a
> NULL dereference occurs.  My initial tests show that it seems to
> happen specifically with 2 external monitors, not 1, and when they are
> connected to the dock, not the laptop itself.  This bug occurs in commit
> 6aaf0da872 (my starting point), and I noticed it during my bisect in at
> least commit 27a1b688, though it might first start occurring earlier.
> I know that 0f63cca already has the first bug above (unplugging
> monitor problem).  I suspect that the new problem probably starts
> between those two commits, but I haven't had the chance to pinpoint
> it--perhaps this info will be enough to identify the source of both
> problems, but if not, I can try to dig deeper.

Yeah, DP MST hotplugs connectors, and we've changed a few things in there.
Can you please boot with drm.debug=0xe added to your kernel cmdline,
reproduce each issue and then grab the complete kernel log for each case?
It'll be really big but should help figure out what's amiss.
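
For reference, here is a minimal Python sketch of what that mask selects.
It assumes the debug category bits as defined in include/drm/drmP.h around
v4.1 (DRM_UT_CORE 0x01, DRM_UT_DRIVER 0x02, DRM_UT_KMS 0x04, DRM_UT_PRIME
0x08); the helper name decode_drm_debug is made up for illustration and is
not a kernel interface.

```python
# Debug category bits for the drm.debug module parameter, assumed from
# include/drm/drmP.h in kernels of this era (around v4.1).
DRM_DEBUG_BITS = {
    0x01: "CORE",    # DRM core messages
    0x02: "DRIVER",  # driver-specific messages (here: i915)
    0x04: "KMS",     # modesetting messages
    0x08: "PRIME",   # PRIME buffer-sharing messages
}


def decode_drm_debug(mask):
    """Return the names of the categories enabled by a drm.debug mask."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]


if __name__ == "__main__":
    print(decode_drm_debug(0xE))
```

So 0xe turns on the driver, KMS and PRIME categories and leaves the core
category (bit 0x01) off.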

Also please retest with latest drm-next or upstream Linus, we've just
merged a few patches to close some DP MST races.

Thanks, Daniel

> 
> Michael
> 
> 
> Here are the two kernel traces:
> 
> ===
> #1
> ===
> 
>   BUG: unable to handle kernel NULL pointer dereference at           (null)
>   IP: [<ffffffffa03de078>] drm_mode_copy+0x18/0x30 [drm]
>   PGD 0
>   Oops: 0000 [#1] SMP
>   Modules linked in: fuse btrfs xor raid6_pq ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs libcrc32c dm_mod cpuid hid_generic usbhid hid rfcomm cpufreq_stats cpufreq_conservative cpufreq_userspace cpufreq_powersave bnep binfmt_misc joydev ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 nf_log_ipv4 nf_log_common xt_LOG xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack iptable_filter ip_tables x_tables nls_utf8 nls_cp437 vfat fat arc4 snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi kvm_intel kvm uvcvideo crct10dif_pclmul videobuf2_vmalloc snd_hda_codec_realtek snd_hda_codec_generic iwlmvm crc32_pclmul videobuf2_memops ghash_clmulni_intel videobuf2_core v4l2_common snd_hda_intel aesni_intel iTCO_wdt i915 videodev mac80211 iTCO_vendor_support aes_x86_64 snd_hda_controller lrw rtsx_pci_ms btusb media snd_hda_codec gf128mul btbcm snd_hda_core iwlwifi memstick btintel glue_helper snd_hwdep bluetooth thinkpad_acpi ablk_helper drm_kms_helper mei_me snd_pcm nvram ie31200_edac cryptd pcspkr evdev cfg80211 sg drm psmouse snd_timer i2c_i801 edac_core mei lpc_ich i2c_algo_bit shpchp snd serio_raw efivars soundcore rfkill wmi ac tpm_tis battery tpm video processor button coretemp parport_pc ppdev lp parport efivarfs autofs4 ext4 crc16 mbcache jbd2 sr_mod cdrom sd_mod rtsx_pci_sdmmc mmc_core crc32c_intel ahci libahci libata xhci_pci scsi_mod ehci_pci xhci_hcd ehci_hcd rtsx_pci e1000e mfd_core ptp usbcore pps_core usb_common thermal thermal_sys
>   CPU: 0 PID: 1007 Comm: Xorg Not tainted 4.1.0-rc2-kaminsky+ #9
>   Hardware name: LENOVO 20BECTO1WW/20BECTO1WW, BIOS GMET66WW (2.14 ) 07/01/2014
>   task: ffff880231890290 ti: ffff8800b5df4000 task.ti: ffff8800b5df4000
>   RIP: 0010:[<ffffffffa03de078>]  [<ffffffffa03de078>] drm_mode_copy+0x18/0x30 [drm]
>   RSP: 0018:ffff8800b5df7c10  EFLAGS: 00010292
>   RAX: ffff880231535018 RBX: ffff880230601e40 RCX: 000000000000001a
>   RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880231535018
>   RBP: ffff880231969000 R08: 0000000000000000 R09: 0000000000000000
>   R10: 0000000000000000 R11: 0000000000000000 R12: ffff880231535000
>   R13: ffff880231535018 R14: ffff8802315350e8 R15: 0000000000000000
>   FS:  00007f30e84ba980(0000) GS:ffff88023e200000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 0000000000000000 CR3: 0000000036a33000 CR4: 00000000001407f0
>   Stack:
>    ffffffffa07e3a70 ffff880230601e40 0000000000000000 0000000000000000
>    ffffffffa03e74d8 0000000000005a85 0000000030601e40 0000000000000000
>    ffff880232d29000 ffff880231969000 0000000000000000 ffff880230601e40
>   Call Trace:
>    [<ffffffffa07e3a70>] ? intel_modeset_compute_config.part.91+0x270/0xc50 [i915]
>    [<ffffffffa03e74d8>] ? drm_modeset_lock+0x38/0x100 [drm]
>    [<ffffffffa07e9543>] ? intel_crtc_set_config+0x673/0xa10 [i915]
>    [<ffffffffa03d8557>] ? drm_mode_set_config_internal+0x67/0x100 [drm]
>    [<ffffffffa03dc6fa>] ? drm_mode_setcrtc+0x22a/0x600 [drm]
>    [<ffffffffa03cd662>] ? drm_ioctl+0x172/0x5c0 [drm]
>    [<ffffffff81208b57>] ? fsnotify+0x3d7/0x590
>    [<ffffffff811ddc98>] ? do_vfs_ioctl+0x2e8/0x4f0
>    [<ffffffff811cc070>] ? __sb_end_write+0x30/0x70
>    [<ffffffff811c9e13>] ? vfs_write+0x183/0x1b0
>    [<ffffffff811ddf21>] ? SyS_ioctl+0x81/0xa0
>    [<ffffffff81578af2>] ? system_call_fastpath+0x16/0x75
>   Code: 00 00 00 f3 c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 44 8b 4f 10 4c 8b 07 48 89 f8 48 8b 57 08 b9 1a 00 00 00 <f3> 48 a5 44 89 48 10 4c 89 00 48 89 50 08 c3 66 0f 1f 84 00 00
>   RIP  [<ffffffffa03de078>] drm_mode_copy+0x18/0x30 [drm]
>    RSP <ffff8800b5df7c10>
>   CR2: 0000000000000000
>   ---[ end trace 6de18f388749dd59 ]---
> 
> 
> ===
> #2  (I'm including the WARNINGs that precede the BUG itself)
> ===
> 
>   thinkpad_acpi: docked into hotplug port replicator
>   ------------[ cut here ]------------
>   WARNING: CPU: 4 PID: 1054 at drivers/gpu/drm/i915/intel_display.c:12256 intel_modeset_check_state+0x1f2/0xb30 [i915]()
>   active encoder's pipe doesn't match(expected 1, found 0)
>   Modules linked in: rfcomm bnep cpufreq_stats cpufreq_conservative cpufreq_userspace cpufreq_powersave binfmt_misc joydev ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 nf_log_ipv4 nf_log_common xt_LOG xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack iptable_filter ip_tables x_tables nls_utf8 nls_cp437 vfat fat arc4 x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi kvm_intel kvm snd_hda_codec_hdmi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sha256_ssse3 sha256_generic hmac iwlmvm drbg mac80211 ansi_cprng iTCO_wdt snd_hda_codec_realtek uvcvideo iTCO_vendor_support snd_hda_codec_generic aesni_intel videobuf2_vmalloc aes_x86_64 videobuf2_memops videobuf2_core lrw v4l2_common gf128mul glue_helper videodev i915 snd_hda_intel ablk_helper btusb cryptd iwlwifi media snd_hda_codec rtsx_pci_ms btrtl memstick snd_hda_core btbcm snd_hwdep btintel cfg80211 evdev thinkpad_acpi snd_pcm drm_kms_helper bluetooth nvram snd_timer pcspkr drm mei_me snd sg mei i2c_algo_bit ie31200_edac psmouse shpchp lpc_ich edac_core i2c_i801 serio_raw efivars soundcore rfkill wmi ac tpm_tis battery tpm video processor button coretemp parport_pc ppdev lp parport efivarfs autofs4 ext4 crc16 mbcache jbd2 sr_mod cdrom sd_mod rtsx_pci_sdmmc mmc_core crc32c_intel ahci libahci libata xhci_pci scsi_mod ehci_pci xhci_hcd ehci_hcd rtsx_pci e1000e mfd_core ptp usbcore pps_core thermal usb_common thermal_sys
>   CPU: 4 PID: 1054 Comm: Xorg Not tainted 4.1.0-kaminsky+ #2
>   Hardware name: LENOVO 20BECTO1WW/20BECTO1WW, BIOS GMET66WW (2.14 ) 07/01/2014
>    0000000000000000 ffffffffa0741720 ffffffff8153c9e9 ffff8800b560fba8
>    ffffffff8106cb71 ffff8800b52bc700 ffff8800b5cb8000 0000000000000001
>    ffff8800b5cb8350 ffff8800b5cb8338 ffffffff8106cbea ffffffffa0744d98
>   Call Trace:
>    [<ffffffff8153c9e9>] ? dump_stack+0x40/0x50
>    [<ffffffff8106cb71>] ? warn_slowpath_common+0x81/0xb0
>    [<ffffffff8106cbea>] ? warn_slowpath_fmt+0x4a/0x50
>    [<ffffffffa06e9e62>] ? intel_modeset_check_state+0x1f2/0xb30 [i915]
>    [<ffffffffa06eb534>] ? intel_crtc_set_config+0x544/0x620 [i915]
>    [<ffffffffa0417dec>] ? drm_crtc_check_viewport+0x2c/0xe0 [drm]
>    [<ffffffffa041970e>] ? drm_mode_set_config_internal+0x5e/0xf0 [drm]
>    [<ffffffffa041d7eb>] ? drm_mode_setcrtc+0x17b/0x4d0 [drm]
>    [<ffffffffa040f202>] ? drm_ioctl+0x172/0x550 [drm]
>    [<ffffffff811f4384>] ? fsnotify+0x3b4/0x500
>    [<ffffffff811c9cd3>] ? do_vfs_ioctl+0x2c3/0x4a0
>    [<ffffffff811b964d>] ? __sb_end_write+0x2d/0x60
>    [<ffffffff811b7580>] ? vfs_write+0x170/0x190
>    [<ffffffff811c9f26>] ? SyS_ioctl+0x76/0x90
>    [<ffffffff81542372>] ? entry_SYSCALL_64_fastpath+0x16/0x75
>   ---[ end trace 4e7ded8d5c7f57e9 ]---
>   ------------[ cut here ]------------
>   WARNING: CPU: 0 PID: 1054 at drivers/gpu/drm/i915/intel_display.c:12256 intel_modeset_check_state+0x1f2/0xb30 [i915]()
>   active encoder's pipe doesn't match(expected 1, found 0)
>   Modules linked in: hid_generic usbhid hid rfcomm bnep cpufreq_stats cpufreq_conservative cpufreq_userspace cpufreq_powersave binfmt_misc joydev ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 nf_log_ipv4 nf_log_common xt_LOG xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack iptable_filter ip_tables x_tables nls_utf8 nls_cp437 vfat fat arc4 x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi kvm_intel kvm snd_hda_codec_hdmi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sha256_ssse3 sha256_generic hmac iwlmvm drbg mac80211 ansi_cprng iTCO_wdt snd_hda_codec_realtek uvcvideo iTCO_vendor_support snd_hda_codec_generic aesni_intel videobuf2_vmalloc aes_x86_64 videobuf2_memops videobuf2_core lrw v4l2_common gf128mul glue_helper videodev i915 snd_hda_intel ablk_helper btusb cryptd iwlwifi media snd_hda_codec rtsx_pci_ms btrtl memstick snd_hda_core btbcm snd_hwdep btintel cfg80211 evdev thinkpad_acpi snd_pcm drm_kms_helper bluetooth nvram snd_timer pcspkr drm mei_me snd sg mei i2c_algo_bit ie31200_edac psmouse shpchp lpc_ich edac_core i2c_i801 serio_raw efivars soundcore rfkill wmi ac tpm_tis battery tpm video processor button coretemp parport_pc ppdev lp parport efivarfs autofs4 ext4 crc16 mbcache jbd2 sr_mod cdrom sd_mod rtsx_pci_sdmmc mmc_core crc32c_intel ahci libahci libata xhci_pci scsi_mod ehci_pci xhci_hcd ehci_hcd rtsx_pci e1000e mfd_core ptp usbcore pps_core thermal usb_common thermal_sys
>   CPU: 0 PID: 1054 Comm: Xorg Tainted: G        W       4.1.0-kaminsky+ #2
>   Hardware name: LENOVO 20BECTO1WW/20BECTO1WW, BIOS GMET66WW (2.14 ) 07/01/2014
>    0000000000000000 ffffffffa0741720 ffffffff8153c9e9 ffff8800b560fba8
>    ffffffff8106cb71 ffff8800b52bc700 ffff8800b5cb8000 0000000000000001
>    ffff8800b5cb8350 ffff8800b5cb8338 ffffffff8106cbea ffffffffa0744d98
>   Call Trace:
>    [<ffffffff8153c9e9>] ? dump_stack+0x40/0x50
>    [<ffffffff8106cb71>] ? warn_slowpath_common+0x81/0xb0
>    [<ffffffff8106cbea>] ? warn_slowpath_fmt+0x4a/0x50
>    [<ffffffffa06e9e62>] ? intel_modeset_check_state+0x1f2/0xb30 [i915]
>    [<ffffffffa06eb534>] ? intel_crtc_set_config+0x544/0x620 [i915]
>    [<ffffffffa0417dec>] ? drm_crtc_check_viewport+0x2c/0xe0 [drm]
>    [<ffffffffa041970e>] ? drm_mode_set_config_internal+0x5e/0xf0 [drm]
>    [<ffffffffa041d7eb>] ? drm_mode_setcrtc+0x17b/0x4d0 [drm]
>    [<ffffffffa040f202>] ? drm_ioctl+0x172/0x550 [drm]
>    [<ffffffff811c9cd3>] ? do_vfs_ioctl+0x2c3/0x4a0
>    [<ffffffff81077597>] ? recalc_sigpending+0x17/0x50
>    [<ffffffff811c9f26>] ? SyS_ioctl+0x76/0x90
>    [<ffffffff81542372>] ? entry_SYSCALL_64_fastpath+0x16/0x75
>   ---[ end trace 4e7ded8d5c7f57ea ]---
>   BUG: unable to handle kernel NULL pointer dereference at           (null)
>   IP: [<ffffffffa0418375>] drm_crtc_index+0x5/0x50 [drm]
>   PGD 0
>   Oops: 0000 [#1] SMP
>   Modules linked in: hid_generic usbhid hid rfcomm bnep cpufreq_stats cpufreq_conservative cpufreq_userspace cpufreq_powersave binfmt_misc joydev ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 nf_log_ipv4 nf_log_common xt_LOG xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack iptable_filter ip_tables x_tables nls_utf8 nls_cp437 vfat fat arc4 x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi kvm_intel kvm snd_hda_codec_hdmi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sha256_ssse3 sha256_generic hmac iwlmvm drbg mac80211 ansi_cprng iTCO_wdt snd_hda_codec_realtek uvcvideo iTCO_vendor_support snd_hda_codec_generic aesni_intel videobuf2_vmalloc aes_x86_64 videobuf2_memops videobuf2_core lrw v4l2_common gf128mul glue_helper videodev i915 snd_hda_intel ablk_helper btusb cryptd iwlwifi media snd_hda_codec rtsx_pci_ms btrtl memstick snd_hda_core btbcm snd_hwdep btintel cfg80211 evdev thinkpad_acpi snd_pcm drm_kms_helper bluetooth nvram snd_timer pcspkr drm mei_me snd sg mei i2c_algo_bit ie31200_edac psmouse shpchp lpc_ich edac_core i2c_i801 serio_raw efivars soundcore rfkill wmi ac tpm_tis battery tpm video processor button coretemp parport_pc ppdev lp parport efivarfs autofs4 ext4 crc16 mbcache jbd2 sr_mod cdrom sd_mod rtsx_pci_sdmmc mmc_core crc32c_intel ahci libahci libata xhci_pci scsi_mod ehci_pci xhci_hcd ehci_hcd rtsx_pci e1000e mfd_core ptp usbcore pps_core thermal usb_common thermal_sys
>   CPU: 0 PID: 1054 Comm: Xorg Tainted: G        W       4.1.0-kaminsky+ #2
>   Hardware name: LENOVO 20BECTO1WW/20BECTO1WW, BIOS GMET66WW (2.14 ) 07/01/2014
>   task: ffff880231cd80c0 ti: ffff8800b560c000 task.ti: ffff8800b560c000
>   RIP: 0010:[<ffffffffa0418375>]  [<ffffffffa0418375>] drm_crtc_index+0x5/0x50 [drm]
>   RSP: 0018:ffff8800b560fbc0  EFLAGS: 00010246
>   RAX: 0000000000000000 RBX: ffff880036cd7c40 RCX: ffff880231deb818
>   RDX: ffff8800b5cb8338 RSI: 0000000000000000 RDI: 0000000000000000
>   RBP: ffff8800b5cb8320 R08: ffff8800b5cb8338 R09: ffff8800b5cb8000
>   R10: 000000003b9aca00 R11: 00000000000007d0 R12: 0000000000000006
>   R13: ffff880231deb800 R14: ffff8800b4dcba00 R15: ffff8800b52bc700
>   FS:  00007f6a248c0980(0000) GS:ffff88023e200000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 0000000000000000 CR3: 00000002314c7000 CR4: 00000000001407f0
>   Stack:
>    ffffffffa03f4db3 ffff880231deb800 ffff8800b5cb8338 ffff8800b4dcba00
>    ffff8800b5cb8338 0000000000000000 ffff880036cd7c40 ffff8800b56aa000
>    ffff8800b560fd38 ffff880036cd7c40 0000000000000000 ffff8800b5b87000
>   Call Trace:
>    [<ffffffffa03f4db3>] ? drm_atomic_helper_check_modeset+0x2a3/0x900 [drm_kms_helper]
>    [<ffffffffa06e5744>] ? intel_modeset_compute_config+0x44/0xb30 [i915]
>    [<ffffffffa06d4809>] ? intel_modeset_setup_plane_state+0x79/0xe0 [i915]
>    [<ffffffffa06eb282>] ? intel_crtc_set_config+0x292/0x620 [i915]
>    [<ffffffffa0417dec>] ? drm_crtc_check_viewport+0x2c/0xe0 [drm]
>    [<ffffffffa041970e>] ? drm_mode_set_config_internal+0x5e/0xf0 [drm]
>    [<ffffffffa041d7eb>] ? drm_mode_setcrtc+0x17b/0x4d0 [drm]
>    [<ffffffffa040f202>] ? drm_ioctl+0x172/0x550 [drm]
>    [<ffffffff811c9cd3>] ? do_vfs_ioctl+0x2c3/0x4a0
>    [<ffffffff81077597>] ? recalc_sigpending+0x17/0x50
>    [<ffffffff811c9f26>] ? SyS_ioctl+0x76/0x90
>    [<ffffffff81542372>] ? entry_SYSCALL_64_fastpath+0x16/0x75
>   Code: c3 83 fe 0f ba 52 47 31 36 b8 58 52 31 35 0f 45 c2 48 83 c4 08 c3 b8 58 52 32 34 eb bd 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 <48> 8b 37 48 8b 86 80 03 00 00 48 81 c6 80 03 00 00 48 39 c6 48
>   RIP  [<ffffffffa0418375>] drm_crtc_index+0x5/0x50 [drm]
>    RSP <ffff8800b560fbc0>
>   CR2: 0000000000000000
>   ---[ end trace 4e7ded8d5c7f57eb ]---

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

