[Bug 93294] New: [BAT SKL] lockdep splat

Tue Dec 8 01:55:23 PST 2015

https://bugs.freedesktop.org/show_bug.cgi?id=93294

            Bug ID: 93294
           Summary: [BAT SKL] lockdep splat
           Product: DRI
           Version: XOrg git
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: DRM/Intel
          Assignee: intel-gfx-bugs at lists.freedesktop.org
          Reporter: daniel at ffwll.ch
        QA Contact: intel-gfx-bugs at lists.freedesktop.org
                CC: intel-gfx-bugs at lists.freedesktop.org

Looks like locking recursion around cpu hotplug (so not directly a i915
problem), but it's causing fails in our CI. We need at least a hack to get the
lockdep splat shut up, but probably need to chase this with upstream.

Affects all skl CI platforms: skl-i5k-2 and skl-i7k-2

Only one testcase seems to hit this reliably:
igt at kms_pipe_crc_basic@suspend-read-crc-pipe-b

Full dmesg splat:

[  185.245810] Suspending console(s) (use no_console_suspend to debug)
[  185.250126] sd 3:0:0:0: [sda] Synchronizing SCSI cache
[  185.250333] sd 3:0:0:0: [sda] Stopping disk

[  185.556664] ======================================================
[  185.556664] [ INFO: possible circular locking dependency detected ]
[  185.556665] 4.4.0-rc4-gfxbench+ #1 Tainted: G     U  W      
[  185.556665] -------------------------------------------------------
[  185.556666] rtcwake/6380 is trying to acquire lock:
[  185.556670]  (&policy->rwsem){+++++.}, at: [<ffffffff81632dee>]
cpufreq_offline_prepare+0x7e/0x200
[  185.556670] 
but task is already holding lock:
[  185.556673]  (cpu_hotplug.lock#2){+.+.+.}, at: [<ffffffff81074cfd>]
cpu_hotplug_begin+0x6d/0xc0
[  185.556673] 
which lock already depends on the new lock.

[  185.556673] 
the existing dependency chain (in reverse order) is:
[  185.556675] 
-> #1 (cpu_hotplug.lock#2){+.+.+.}:
[  185.556677]        [<ffffffff810c85e3>] lock_acquire+0xc3/0x1d0
[  185.556679]        [<ffffffff81795962>] mutex_lock_nested+0x62/0x3b0
[  185.556681]        [<ffffffff814878ed>]
acpi_processor_ppc_notifier+0x3e/0xa7
[  185.556683]        [<ffffffff810970a9>] notifier_call_chain+0x39/0xa0
[  185.556684]        [<ffffffff810974a8>]
__blocking_notifier_call_chain+0x48/0x70
[  185.556685]        [<ffffffff810974e1>]
blocking_notifier_call_chain+0x11/0x20
[  185.556686]        [<ffffffff81633200>] cpufreq_set_policy+0x70/0x340
[  185.556687]        [<ffffffff81633c4f>] cpufreq_init_policy+0x6f/0xc0
[  185.556688]        [<ffffffff81634100>] cpufreq_online+0x290/0x7d0
[  185.556689]        [<ffffffff816346ed>] cpufreq_add_dev+0x5d/0x90
[  185.556690]        [<ffffffff815195e7>] subsys_interface_register+0xa7/0xf0
[  185.556692]        [<ffffffff81633946>] cpufreq_register_driver+0x106/0x1f0
[  185.556693]        [<ffffffff81da9aa6>] intel_pstate_init+0x37f/0x45c
[  185.556695]        [<ffffffff810003de>] do_one_initcall+0xae/0x1d0
[  185.556697]        [<ffffffff81d63fcf>] kernel_init_freeable+0x113/0x19b
[  185.556698]        [<ffffffff81791c69>] kernel_init+0x9/0xe0
[  185.556699]        [<ffffffff8179a97f>] ret_from_fork+0x3f/0x70
[  185.556700] 
-> #0 (&policy->rwsem){+++++.}:
[  185.556702]        [<ffffffff810c7af6>] __lock_acquire+0x1cf6/0x1f70
[  185.556703]        [<ffffffff810c85e3>] lock_acquire+0xc3/0x1d0
[  185.556704]        [<ffffffff81797e2f>] down_write+0x3f/0x70
[  185.556705]        [<ffffffff81632dee>] cpufreq_offline_prepare+0x7e/0x200
[  185.556706]        [<ffffffff81634685>] cpufreq_cpu_callback+0x45/0x50
[  185.556707]        [<ffffffff810970a9>] notifier_call_chain+0x39/0xa0
[  185.556709]        [<ffffffff81097119>] __raw_notifier_call_chain+0x9/0x10
[  185.556709]        [<ffffffff81074e2e>] _cpu_down+0x9e/0x340
[  185.556710]        [<ffffffff81075519>] disable_nonboot_cpus+0xc9/0x350
[  185.556712]        [<ffffffff810ceb81>]
suspend_devices_and_enter+0x5a1/0xb50
[  185.556713]        [<ffffffff810cf673>] pm_suspend+0x543/0x8d0
[  185.556714]        [<ffffffff810cd8f7>] state_store+0x77/0xe0
[  185.556716]        [<ffffffff813e199f>] kobj_attr_store+0xf/0x20
[  185.556717]        [<ffffffff8123cfe0>] sysfs_kf_write+0x40/0x50
[  185.556718]        [<ffffffff8123c3cc>] kernfs_fop_write+0x13c/0x180
[  185.556720]        [<ffffffff811c2323>] __vfs_write+0x23/0xe0
[  185.556721]        [<ffffffff811c2b34>] vfs_write+0xa4/0x190
[  185.556722]        [<ffffffff811c3774>] SyS_write+0x44/0xb0
[  185.556723]        [<ffffffff8179a5db>] entry_SYSCALL_64_fastpath+0x16/0x73
[  185.556724] 
other info that might help us debug this:

[  185.556724]  Possible unsafe locking scenario:

[  185.556724]        CPU0                    CPU1
[  185.556724]        ----                    ----
[  185.556725]   lock(cpu_hotplug.lock#2);
[  185.556726]                                lock(&policy->rwsem);
[  185.556727]                                lock(cpu_hotplug.lock#2);
[  185.556727]   lock(&policy->rwsem);
[  185.556727] 
 *** DEADLOCK ***

[  185.556728] 8 locks held by rtcwake/6380:
[  185.556730]  #0:  (sb_writers#6){.+.+.+}, at: [<ffffffff811c5d34>]
__sb_start_write+0xd4/0xf0
[  185.556732]  #1:  (&of->mutex){+.+.+.}, at: [<ffffffff8123c2f1>]
kernfs_fop_write+0x61/0x180
[  185.556733]  #2:  (s_active#229){.+.+.+}, at: [<ffffffff8123c2f9>]
kernfs_fop_write+0x69/0x180
[  185.556735]  #3:  (pm_mutex){+.+...}, at: [<ffffffff810cf1c7>]
pm_suspend+0x97/0x8d0
[  185.556738]  #4:  (acpi_scan_lock){+.+.+.}, at: [<ffffffff81459363>]
acpi_scan_lock_acquire+0x12/0x14
[  185.556739]  #5:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff81075474>]
disable_nonboot_cpus+0x24/0x350
[  185.556741]  #6:  (cpu_hotplug.lock){++++++}, at: [<ffffffff81074c90>]
cpu_hotplug_begin+0x0/0xc0
[  185.556743]  #7:  (cpu_hotplug.lock#2){+.+.+.}, at: [<ffffffff81074cfd>]
cpu_hotplug_begin+0x6d/0xc0
[  185.556743] 
stack backtrace:
[  185.556744] CPU: 1 PID: 6380 Comm: rtcwake Tainted: G     U  W      
4.4.0-rc4-gfxbench+ #1
[  185.556745] Hardware name: System manufacturer System Product
Name/Z170M-PLUS, BIOS 0408 08/28/2015
[  185.556747]  ffffffff825d90b0 ffff880241523940 ffffffff813df80c
ffffffff8260c9e0
[  185.556748]  ffff880241523980 ffffffff810c441c ffff8802415239e0
ffff880244a44e80
[  185.556749]  ffff880244a45760 ffff880244a45788 0000000000000008
ffff880244a45788
[  185.556749] Call Trace:
[  185.556750]  [<ffffffff813df80c>] dump_stack+0x4e/0x82
[  185.556752]  [<ffffffff810c441c>] print_circular_bug+0x1fc/0x310
[  185.556753]  [<ffffffff810c7af6>] __lock_acquire+0x1cf6/0x1f70
[  185.556754]  [<ffffffff810c85e3>] lock_acquire+0xc3/0x1d0
[  185.556755]  [<ffffffff81632dee>] ? cpufreq_offline_prepare+0x7e/0x200
[  185.556756]  [<ffffffff81797e2f>] down_write+0x3f/0x70
[  185.556757]  [<ffffffff81632dee>] ? cpufreq_offline_prepare+0x7e/0x200
[  185.556758]  [<ffffffff81632dee>] cpufreq_offline_prepare+0x7e/0x200
[  185.556759]  [<ffffffff81634685>] cpufreq_cpu_callback+0x45/0x50
[  185.556760]  [<ffffffff810970a9>] notifier_call_chain+0x39/0xa0
[  185.556762]  [<ffffffff81097119>] __raw_notifier_call_chain+0x9/0x10
[  185.556762]  [<ffffffff81074e2e>] _cpu_down+0x9e/0x340
[  185.556763]  [<ffffffff810d24cf>] ? vprintk_emit+0x2cf/0x610
[  185.556764]  [<ffffffff810d250b>] ? vprintk_emit+0x30b/0x610
[  185.556766]  [<ffffffff810d294a>] ? vprintk_default+0x1a/0x20
[  185.556767]  [<ffffffff810c30b9>] ? __lock_is_held+0x49/0x70
[  185.556768]  [<ffffffff81075519>] disable_nonboot_cpus+0xc9/0x350
[  185.556769]  [<ffffffff810ceb81>] suspend_devices_and_enter+0x5a1/0xb50
[  185.556770]  [<ffffffff810c30b9>] ? __lock_is_held+0x49/0x70
[  185.556771]  [<ffffffff810cf673>] pm_suspend+0x543/0x8d0
[  185.556772]  [<ffffffff810cd8f7>] state_store+0x77/0xe0
[  185.556773]  [<ffffffff813e199f>] kobj_attr_store+0xf/0x20
[  185.556774]  [<ffffffff8123cfe0>] sysfs_kf_write+0x40/0x50
[  185.556774]  [<ffffffff8123c3cc>] kernfs_fop_write+0x13c/0x180
[  185.556775]  [<ffffffff811c2323>] __vfs_write+0x23/0xe0
[  185.556776]  [<ffffffff810c1ad2>] ? percpu_down_read+0x52/0x90
[  185.556777]  [<ffffffff811c5d34>] ? __sb_start_write+0xd4/0xf0
[  185.556778]  [<ffffffff811c5d34>] ? __sb_start_write+0xd4/0xf0
[  185.556779]  [<ffffffff811c2b34>] vfs_write+0xa4/0x190
[  185.556780]  [<ffffffff811e091a>] ? __fget_light+0x6a/0x90
[  185.556781]  [<ffffffff811c3774>] SyS_write+0x44/0xb0
[  185.556782]  [<ffffffff8179a5db>] entry_SYSCALL_64_fastpath+0x16/0x73
[  185.573654] Broke affinity for irq 121
[  185.582651] Broke affinity for irq 121
[  185.592586]  cache: parent cpu1 should not be sleeping
[  185.598049]  cache: parent cpu2 should not be sleeping
[  185.603869]  cache: parent cpu3 should not be sleeping
[  185.643441] snd_hda_intel 0000:00:1f.3: spurious response 0x0:0x0, last
cmd=0x0f0000
[  185.643445] snd_hda_intel 0000:00:1f.3: spurious response 0x0:0x0, last
cmd=0x0f0000
[  185.643448] snd_hda_intel 0000:00:1f.3: spurious response 0x0:0x0, last
cmd=0x0f0000
[  185.643451] snd_hda_intel 0000:00:1f.3: spurious response 0x0:0x0, last
cmd=0x0f0000
[  185.643453] snd_hda_intel 0000:00:1f.3: spurious response 0x0:0x0, last
cmd=0x0f0000
[  185.643456] snd_hda_intel 0000:00:1f.3: spurious response 0x0:0x0, last
cmd=0x0f0000
[  185.643458] snd_hda_intel 0000:00:1f.3: spurious response 0x0:0x0, last
cmd=0x0f0000
[  185.643461] snd_hda_intel 0000:00:1f.3: spurious response 0x0:0x0, last
cmd=0x0f0000
[  185.643464] snd_hda_intel 0000:00:1f.3: spurious response 0x0:0x0, last
cmd=0x0f0000
[  185.661028] sd 3:0:0:0: [sda] Starting disk
[  185.787456] snd_hda_intel 0000:00:1f.3: Codec #2 probe error; disabling
it...
[  186.107210] ------------[ cut here ]------------
[  186.107239] WARNING: CPU: 1 PID: 6375 at
drivers/gpu/drm/i915/intel_display.c:13778 skl_max_scale.part.87+0x67/0x70
[i915]()
[  186.107240] WARN_ON_ONCE(!crtc_clock || cdclk < crtc_clock)
[  186.107241] Modules linked in:
[  186.107242]  snd_hda_intel i915 ax88179_178a usbnet mii snd_hda_codec_hdmi
snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal
intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul snd_hda_codec snd_hwdep
snd_hda_core snd_pcm mei_me mei i2c_hid e1000e ptp pps_core [last unloaded:
i915]
[  186.107254] CPU: 1 PID: 6375 Comm: kms_pipe_crc_ba Tainted: G     U  W      
4.4.0-rc4-gfxbench+ #1
[  186.107255] Hardware name: System manufacturer System Product
Name/Z170M-PLUS, BIOS 0408 08/28/2015
[  186.107256]  ffffffffa034bab0 ffff8802481fba40 ffffffff813df80c
ffff8802481fba88
[  186.107258]  ffff8802481fba78 ffffffff810747c1 0000000000010000
ffff88003680cf38
[  186.107260]  ffff8802497d8000 ffff88003680cf38 ffffffffa032d620
ffff8802481fbad8
[  186.107262] Call Trace:
[  186.107265]  [<ffffffff813df80c>] dump_stack+0x4e/0x82
[  186.107267]  [<ffffffff810747c1>] warn_slowpath_common+0x81/0xc0
[  186.107268]  [<ffffffff81074847>] warn_slowpath_fmt+0x47/0x50
[  186.107278]  [<ffffffffa02d48a7>] skl_max_scale.part.87+0x67/0x70 [i915]
[  186.107288]  [<ffffffffa02d4958>] intel_check_primary_plane+0xa8/0xc0 [i915]
[  186.107296]  [<ffffffffa02c6ab9>] intel_plane_atomic_check+0x129/0x1f0
[i915]
[  186.107299]  [<ffffffff814ea153>] drm_atomic_helper_check_planes+0x43/0x1d0
[  186.107309]  [<ffffffffa02e00d5>] intel_atomic_check+0x2f5/0x1090 [i915]
[  186.107311]  [<ffffffff815001db>] ? drm_property_unreference_blob+0x4b/0xf0
[  186.107312]  [<ffffffff81500101>] ? drm_property_free_blob+0x21/0x60
[  186.107314]  [<ffffffff8150dc72>] drm_atomic_check_only+0x142/0x5e0
[  186.107315]  [<ffffffff8150d7c0>] ?
drm_atomic_set_mode_prop_for_crtc+0x20/0x110
[  186.107316]  [<ffffffff8150e122>] drm_atomic_commit+0x12/0x50
[  186.107318]  [<ffffffff814ead55>] drm_atomic_helper_set_config+0x75/0xb0
[  186.107319]  [<ffffffff814fcc60>] drm_mode_set_config_internal+0x60/0x110
[  186.107321]  [<ffffffff81501876>] drm_mode_setcrtc+0x186/0x4e0
[  186.107322]  [<ffffffff814f3b5d>] drm_ioctl+0x13d/0x590
[  186.107324]  [<ffffffff815016f0>] ? drm_mode_setplane+0x1b0/0x1b0
[  186.107326]  [<ffffffff811d4eac>] do_vfs_ioctl+0x2fc/0x550
[  186.107327]  [<ffffffff8118d554>] ? do_munmap+0x254/0x3d0
[  186.107329]  [<ffffffff8118d71a>] ? vm_munmap+0x4a/0x60
[  186.107330]  [<ffffffff811e091a>] ? __fget_light+0x6a/0x90
[  186.107331]  [<ffffffff811d513c>] SyS_ioctl+0x3c/0x70
[  186.107333]  [<ffffffff8179a5db>] entry_SYSCALL_64_fastpath+0x16/0x73
[  186.107334] ---[ end trace 29c112d5b40b86c8 ]---

Note that the cdclk splat is already tracked in bug #93293

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are on the CC list for the bug.
You are the assignee for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/intel-gfx-bugs/attachments/20151208/d8b5c859/attachment-0001.html>