bisected: i915 modeset broken in ac9b8236551d1177fd07b56aef9b565d1864420d
Jani Nikula
jani.nikula at intel.com
Thu Jan 7 02:30:00 PST 2016
On Thu, 07 Jan 2016, Jani Nikula <jani.nikula at intel.com> wrote:
> On Wed, 06 Jan 2016, Meelis Roos <mroos at linux.ee> wrote:
>>> On Mon, Dec 14, 2015 at 03:31:09PM +0200, Meelis Roos wrote:
>>> > Between 4.4-rc3 and 4.4-rc4, i915 modesetting broke on my i5-2400 PC.
>>>
>>> That would seem to be SNB.
>>
>> Yes.
>>
>>> > Instead of seeing the new dense graphics mode, I see the last VGA text
>>> > lines and no X appears either.
>>>
>>> That's a bit weird. SNB has no power wells, so only runtime PM
>>> could be a factor, but it should not kick in that fast during boot even
>>> if you enable it before loading the driver since we set the delay to 10
>>> seconds.
>>>
>>> And in any case the commit you list shouldn't really change anything
>>> for SNB. We used to grab a rpm reference for gmbus via
>>> intel_aux_display_runtime_get() and now we get it via the GMBUS power
>>> domain instead.
>>
>> I captured dmesg from failing boot, from system logs. gmbus has
>> something to do with it:
>>
>> [drm:i915_dump_device_info] i915 device info: gen=6, pciid=0x0102
>> rev=0x09 flags=need_gfx_hws,has_fbc,has_hotplug,has_llc,
>> [drm:intel_detect_pch] Found CougarPoint PCH
>> [drm] Memory usable by graphics device = 2048M
>> [drm:i915_gem_gtt_init] GMADR size = 256M
>> [drm:i915_gem_gtt_init] GTT stolen size = 32M
>> [drm:i915_gem_gtt_init] ppgtt mode: 1
>> [drm] Replacing VGA console driver
>> Console: switching to colour dummy device 80x25
>> BUG: unable to handle kernel NULL pointer dereference at (null)
>> IP: [<ffffffff81519c74>] __mutex_lock_slowpath+0x74/0x100
>> PGD 0
>> Oops: 0002 [#1] SMP
>> Modules linked in: i915(+) x86_pkg_temp_thermal kvm_intel kvm irqbypass video crc32c_intel i2c_algo_bit aesni_intel aes_x86_64 glue_helper lrw drm_kms_helper syscopyarea sysfillrect ablk_helper cryptd iTCO_wdt sysimgblt iTCO_vendor_support fb_sys_fops snd_hda_codec_realtek drm psmouse snd_hda_codec_generic e1000e xhci_pci xhci_hcd snd_hda_intel pcspkr snd_hda_codec snd_hwdep snd_hda_core snd_pcm_oss snd_mixer_oss i2c_i801 snd_pcm ehci_pci ehci_hcd nuvoton_cir usbcore snd_timer parport_pc ptp pps_core evdev rc_core parport snd usb_common soundcore tpm_tis tpm floppy lpc_ich mfd_core md_mod w83627ehf hwmon_vid coretemp hwmon eeprom i2c_core loop autofs4
>> CPU: 0 PID: 390 Comm: systemd-udevd Not tainted 4.4.0-rc2-00006-gac9b823 #185
>> Hardware name: /DQ67OW, BIOS SWQ6710H.86A.0066.2012.1105.1504 11/05/2012
>> task: ffff8800b7e48c40 ti: ffff880233420000 task.ti: ffff880233420000
>> RIP: 0010:[<ffffffff81519c74>] [<ffffffff81519c74>] __mutex_lock_slowpath+0x74/0x100
>> RSP: 0018:ffff880233423620 EFLAGS: 00010282
>> RAX: 0000000000000000 RBX: ffff8800b6a594f8 RCX: ffff8800b7e48c40
>> RDX: 0000000000000001 RSI: ffff8800b7e48c40 RDI: ffff8800b6a594fc
>> RBP: ffff880233423670 R08: 0000000000000000 R09: 0000000000000001
>> R10: ffff8800b6a507e8 R11: 0000000000000013 R12: ffff8800b7e48c40
>> R13: ffff8800b6a594fc R14: 00000000ffffffff R15: ffff8800b6a59500
>> FS: 00007f58846428c0(0000) GS:ffff88023e200000(0000) knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000000000000000 CR3: 0000000233424000 CR4: 00000000000406f0
>> Stack:
>> ffff8800b6a59500 0000000000000000 ffff8802353b7148 0000000000000206
>> ffff8800b6a50000 ffff8800b6a594f8 000000000000001d ffff8800b6a594f8
>> ffff8800b6a50000 ffff8800b6a50000 00000000fffeea06 ffffffff81519d1b
>> Call Trace:
>> [<ffffffff81519d1b>] ? mutex_lock+0x1b/0x30
>> [<ffffffffa0423859>] ? intel_display_power_get+0x29/0xe0 [i915]
>> [<ffffffffa04af048>] ? gmbus_xfer+0x38/0x680 [i915]
>> [<ffffffff81073113>] ? try_to_wake_up+0x43/0x320
>> [<ffffffffa0015cc6>] ? __i2c_transfer+0x106/0x380 [i2c_core]
>> [<ffffffffa0015fad>] ? i2c_transfer+0x6d/0xa0 [i2c_core]
>> [<ffffffffa0016185>] ? i2c_smbus_xfer_emulated+0x105/0x4c0 [i2c_core]
>> [<ffffffff81086f3e>] ? __wake_up_common+0x4e/0x90
>> [<ffffffff812a7cab>] ? idr_get_empty_slot+0x18b/0x390
>> [<ffffffffa0016658>] ? i2c_smbus_xfer+0x118/0x2e0 [i2c_core]
>> [<ffffffffa00168e5>] ? i2c_default_probe+0xc5/0x110 [i2c_core]
>> [<ffffffffa00154d9>] ? i2c_check_addr_busy+0x39/0x60 [i2c_core]
>> [<ffffffffa0018339>] ? i2c_do_add_adapter+0x159/0x260 [i2c_core]
>> [<ffffffffa0018440>] ? i2c_do_add_adapter+0x260/0x260 [i2c_core]
>> [<ffffffff81382b85>] ? bus_for_each_drv+0x55/0x90
>> [<ffffffffa0017fb6>] ? i2c_register_adapter+0x1c6/0x320 [i2c_core]
>> [<ffffffffa04afa80>] ? intel_setup_gmbus+0x220/0x310 [i915]
>
> intel_setup_gmbus registers the i2c adapters, which does transfers on
> the i2c bus on probe, and this happens before intel_power_domains_init
> which initializes the power domain lock.
>
> The bisect and backtrace make sense and are not mysterious at all.
>
> Not sure of the fix though, are we better off changing the init order,
> or making sure the probes don't happen or don't screw us up.

So I started wondering why we are not seeing this. To reproduce, looks
like you'll need to have an i2c driver with class I2C_CLASS_DDC for the
i2c detect (and the bug) to happen. In tree, the only ones seem to be

	drivers/misc/eeprom/eeprom.c
	drivers/staging/olpc_dcon/olpc_dcon.c

I presume you have one or the other.

No matter what, userspace can access the adapter right away when we
register it, so this needs to be fixed along the lines of [1].

BR,
Jani.

[1] http://patchwork.freedesktop.org/patch/msgid/1452157856-27360-1-git-send-email-daniel.vetter@ffwll.ch
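
To make the ordering problem described above concrete, here is a minimal
userspace sketch. It is not i915 code: every toy_* name is hypothetical,
and the "lock" is a plain struct that reports failure where the real
mutex would oops. It only illustrates why registering an adapter (which
can trigger a probe transfer right away) before the lock is initialized
fails, and why doing the lock init first, one of the two options
mentioned above, avoids that. It makes no claim about what the patch
in [1] actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a mutex that is unusable until initialized. */
struct toy_lock {
	bool initialized;
	bool held;
};

/* Hypothetical stand-in for the driver's private data. */
struct toy_dev {
	struct toy_lock power_lock;
	bool adapter_registered;
	int probe_result;	/* 0 on success, -1 if the lock was not ready */
};

static void toy_lock_init(struct toy_lock *l)
{
	l->initialized = true;
	l->held = false;
}

/* Reports failure instead of crashing when used before init. */
static bool toy_lock_acquire(struct toy_lock *l)
{
	if (!l->initialized)
		return false;
	l->held = true;
	return true;
}

static void toy_lock_release(struct toy_lock *l)
{
	l->held = false;
}

/* Transfer path: needs the power lock, loosely like the
 * gmbus_xfer -> intel_display_power_get chain in the backtrace. */
static int toy_xfer(struct toy_dev *dev)
{
	if (!toy_lock_acquire(&dev->power_lock))
		return -1;
	toy_lock_release(&dev->power_lock);
	return 0;
}

/* Registration immediately triggers a probe transfer, as i2c detect does. */
static void toy_register_adapter(struct toy_dev *dev)
{
	dev->adapter_registered = true;
	dev->probe_result = toy_xfer(dev);
}

/* Broken order: the adapter is registered before the lock exists. */
static int toy_load_broken(struct toy_dev *dev)
{
	*dev = (struct toy_dev){0};
	toy_register_adapter(dev);
	toy_lock_init(&dev->power_lock);
	return dev->probe_result;
}

/* Reordered: the lock is initialized before anything can use it. */
static int toy_load_fixed(struct toy_dev *dev)
{
	*dev = (struct toy_dev){0};
	toy_lock_init(&dev->power_lock);
	toy_register_adapter(dev);
	return dev->probe_result;
}
```

Calling toy_load_broken() on a struct toy_dev returns -1 because the
probe runs against an uninitialized lock, while toy_load_fixed() returns
0. Reordering alone does not address a probe or userspace access racing
with later setup steps; it only covers the lock initialization seen in
this trace.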
>
> BR,
> Jani.
>
>
>
>> [<ffffffffa04ba1eb>] ? i915_driver_load+0x4eb/0x15e0 [i915]
>> [<ffffffffa026493c>] ? drm_dev_register+0x9c/0xb0 [drm]
>> [<ffffffffa0266c49>] ? drm_get_pci_dev+0x89/0x1d0 [drm]
>> [<ffffffff812d8c71>] ? pci_device_probe+0x81/0xe0
>> [<ffffffff81384857>] ? driver_probe_device+0x147/0x310
>> [<ffffffff81384a9b>] ? __driver_attach+0x7b/0x80
>> [<ffffffff81384a20>] ? driver_probe_device+0x310/0x310
>> [<ffffffff81382ada>] ? bus_for_each_dev+0x5a/0x90
>> [<ffffffff81383e34>] ? bus_add_driver+0x1a4/0x220
>> [<ffffffffa033c000>] ? 0xffffffffa033c000
>> [<ffffffff813851e7>] ? driver_register+0x57/0xc0
>> [<ffffffff810003b1>] ? do_one_initcall+0x81/0x1b0
>> [<ffffffff8116b601>] ? kmem_cache_alloc_trace+0x31/0x120
>> [<ffffffff8111fb77>] ? do_init_module+0x5b/0x1dc
>> [<ffffffff810c16c2>] ? load_module+0x1e52/0x2220
>> [<ffffffff810be3b0>] ? __symbol_put+0x50/0x50
>> [<ffffffff810c1c35>] ? SyS_finit_module+0x85/0x90
>> [<ffffffff8151b9db>] ? entry_SYSCALL_64_fastpath+0x16/0x6a
>> Code: e8 c2 1a 00 00 8b 03 83 f8 01 0f 84 92 00 00 00 48 8b 43 10 4c 8d 7b 08 48 89 63 10 41 be ff ff ff ff 4c 89 3c 24 48 89 44 24 08 <48> 89 20 4c 89 64 24 10 eb 19 49 c7 04 24 02 00 00 00 c6 43 04
>> RIP [<ffffffff81519c74>] __mutex_lock_slowpath+0x74/0x100
>> RSP <ffff880233423620>
>> CR2: 0000000000000000
>> ---[ end trace 5e2e7e41ffefe21d ]---
>>
>>
>>> So this bisect result is somewhat mysterious. A full dmesg with
>>> drm.debug=0xe with and without the offending patch reverted would be
>>> helpful. And might be best to attach those into a bug report
>>> (https://bugs.freedesktop.org/ -> DRI -> DRM/Intel) so that we don't
>>> lose track of them.
>>
>> Full dmesgs at https://bugs.freedesktop.org/show_bug.cgi?id=93608
>>
>>>
>>> Oh, are we even talking about HDMI/DVI here, or something else?
>>
>> DVI
>>
>>>
>>> >
>>> > I saw something similar on I865G but have not had time to check if it is
>>> > the same issue.
>>> >
>>> > ac9b8236551d1177fd07b56aef9b565d1864420d is the first bad commit
>>> > commit ac9b8236551d1177fd07b56aef9b565d1864420d
>>> > Author: Ville Syrjälä <ville.syrjala at linux.intel.com>
>>> > Date: Fri Nov 27 18:55:26 2015 +0200
>>> >
>>> > drm/i915: Introduce a gmbus power domain
>>> >
>>> > Currently the gmbus code uses intel_aux_display_runtime_get/put in an
>>> > effort to make sure the hardware is powered up sufficiently for gmbus.
>>> > That function only takes the runtime PM reference which on VLV/CHV/BXT
>>> > is not enough. We need the disp2d/pipe-a well on VLV/CHV and power well
>>> > 2 on BXT. So add a new power domain for gmbus and kill off the now
>>> > unused intel_aux_display_runtime_get/put. And change
>>> > intel_hdmi_set_edid() to use the gmbus power domain too since that's all
>>> > we need there.
>>> >
>>> > Also toss in a BUILD_BUG_ON() to catch problems if we run out of
>>> > bits for power domains. We're already really close to the limit...
>>> >
>>> > [Patrik: Add gmbus string to debugfs output]
>>> >
>>> > Signed-off-by: Ville Syrjälä <ville.syrjala at linux.intel.com>
>>> > Reviewed-by: Patrik Jakobsson <patrik.jakobsson at linux.intel.com>
>>> > [Cherry-picked from drm-intel-next-queued f0ab43e6 (Imre)]
>>> > Signed-off-by: Imre Deak <imre.deak at intel.com>
>>> > Link: http://patchwork.freedesktop.org/patch/msgid/1448643329-18675-3-git-send-email-imre.deak@intel.com
>>> > Signed-off-by: Jani Nikula <jani.nikula at intel.com>
>>> >
>>> > :040000 040000 39379146d7e6dda8a4d5f8781ee3d307cce8c47e f4f09fae0485ad6263d31d425296fa9cd7de343b M drivers
>>> >
>>> >
>>> > --
>>> > Meelis Roos (mroos at linux.ee)
>>>
>>>
--
Jani Nikula, Intel Open Source Technology Center