Lockdep annotation introduced warn in VMD driver
Saarinen, Jani
jani.saarinen at intel.com
Wed May 29 07:45:41 UTC 2024
Hi,
> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces at lists.freedesktop.org> On Behalf Of
> Nirmal Patel
> Sent: Wednesday, 29 May 2024 1.53
> To: Deak, Imre <imre.deak at intel.com>
> Cc: Jiang, Dave <dave.jiang at intel.com>; 李, 星辉 <korantli at tencent.com>;
> Jonathan Derrick <jonathan.derrick at linux.dev>; Bjorn Helgaas
> <bhelgaas at google.com>; linux-pci at vger.kernel.org; intel-
> gfx at lists.freedesktop.org
> Subject: Re: Lockdep annotation introduced warn in VMD driver
>
> On Tue, 28 May 2024 15:36:54 +0300
> Imre Deak <imre.deak at intel.com> wrote:
>
> > Hi,
> >
> > commit 7e89efc6e9e402839643cb297bab14055c547f07
> > Author: Dave Jiang <dave.jiang at intel.com>
> > Date: Thu May 2 09:57:31 2024 -0700
> >
> > PCI: Lock upstream bridge for pci_reset_function()
> >
> > introduced the WARN below in the VMD driver, see [1] for the full log.
> > Not sure if the annotation is incorrect or the VMD driver is missing
> > the lock, CC'ing VMD folks.
> >
> > --Imre
> Can you please provide repro steps and some background on the setup?
What you mean? Just take latest rc1 and see reports from our CI.
These are links from CI that can be seen from https://intel-gfx-ci.01.org/
These are currently issues on our drm-tip https://intel-gfx-ci.01.org/tree/drm-tip/index.html?
See those remaining red ones https://intel-gfx-ci.01.org/tree/drm-tip/igt@runner@aborted.html and
especially https://intel-gfx-ci.01.org/tree/drm-tip/index.html?hosts=mtlp-9%7Cdg2-13%7Cdg2-14%7Cadlp-11 .
Originally from 10 bad behaving system one regression was fixed at https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_14830/git-log-oneline.txt
that patch was merged to our core-for-ci to wait proper fix land during rc-2+ times and that left 5 systems behaving bad still. For those 4 (excluding apl-1) we have already fix at #1 .
For apl-1 we already have fix from Dan that is reported and rb'd too.
#1 https://lore.kernel.org/all/cfb50601-5d2a-4676-a958-1bd3f1b06654@intel.com/
Does that explain ?
Br,
Jani
>
> -nirmal
> >
> > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_134112v1/bat-adlp-1
> > 1/boot0.txt
> >
> > <4>[ 17.354071] WARNING: CPU: 0 PID: 1 at drivers/pci/pci.c:4886
> > pci_bridge_secondary_bus_reset+0x5d/0x70 <4>[ 17.354095] Modules
> > linked in: <4>[ 17.354104] CPU: 0 PID: 1 Comm: swapper/0 Not
> > tainted 6.10.0-rc1-Patchwork_134112v1-gabaeae202dfb+ #1 <4>[
> > 17.354128] Hardware name: Intel Corporation Alder Lake Client
> > Platform/AlderLake-P LP5 RVP, BIOS RPLPFWI1.R00.4035.A00.2301200723
> > 01/20/2023 <4>[ 17.354153] RIP:
> > 0010:pci_bridge_secondary_bus_reset+0x5d/0x70 <4>[ 17.354167] Code:
> > c3 cc cc cc cc 48 89 ef 48 c7 c6 78 55 44 82 5d e9 d8 c6 ff ff 48 8d
> > bf 48 08 00 00 be ff ff ff ff e8 97 10 5f 00 85 c0 75 b5 <0f> 0b eb
> > b1 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 90 90 90 <4>[
> > 17.354199] RSP: 0000:ffffc90000097ca0 EFLAGS: 00010246 <4>[
> 17.354210]
> > RAX: 0000000000000000 RBX: ffff888105604000 RCX:
> > 0000000000000000 <4>[ 17.354224] RDX: 0000000080000000 RSI:
> > ffffffff82421c40 RDI: ffffffff82441c4c <4>[ 17.354238] RBP:
> > ffff888105601000 R08: 0000000000000001 R09: 0000000000000000 <4>[
> > 17.354251] R10: 0000000000000001 R11: ffff8881008c8040 R12:
> > 0000000000000000 <4>[ 17.354264] R13: 0000000000000020 R14:
> > 000000000000007f R15: ffff888105615c28 <4>[ 17.354283] FS:
> > 0000000000000000(0000) GS:ffff8882a6e00000(0000)
> > knlGS:0000000000000000 <4>[ 17.354313] CS: 0010 DS: 0000 ES: 0000
> > CR0: 0000000080050033 <4>[ 17.354334] CR2: ffff8882afbff000 CR3:
> > 000000000663a000 CR4: 0000000000f50ef0 <4>[ 17.354348] PKRU:
> > 55555554 <4>[ 17.354355] Call Trace: <4>[ 17.354361] <TASK> <4>[
> > 17.354367] ? __warn+0x8c/0x190 <4>[ 17.354380] ?
> > pci_bridge_secondary_bus_reset+0x5d/0x70 <4>[ 17.354392] ?
> > report_bug+0x1f8/0x200 <4>[ 17.354405] ? handle_bug+0x3c/0x70 <4>[
> > 17.354415] ? exc_invalid_op+0x18/0x70 <4>[ 17.354424] ?
> > asm_exc_invalid_op+0x1a/0x20 <4>[ 17.354438] ?
> > pci_bridge_secondary_bus_reset+0x5d/0x70 <4>[ 17.354451]
> > pci_reset_bus+0x1d8/0x270 <4>[ 17.354461] vmd_probe+0x778/0xa10
> > <4>[ 17.354474] pci_device_probe+0x95/0x120 <4>[ 17.354484]
> > really_probe+0xd9/0x370 <4>[ 17.354496] ?
> > __pfx___driver_attach+0x10/0x10 <4>[ 17.354505]
> > __driver_probe_device+0x73/0x150 <4>[ 17.354516]
> > driver_probe_device+0x19/0xa0 <4>[ 17.354525]
> > __driver_attach+0xb6/0x180 <4>[ 17.354534] ?
> > __pfx___driver_attach+0x10/0x10 <4>[ 17.354544]
> > bus_for_each_dev+0x77/0xd0 <4>[ 17.354555]
> > bus_add_driver+0x110/0x240 <4>[ 17.354566]
> > driver_register+0x5b/0x110 <4>[ 17.354575] ?
> > __pfx_vmd_drv_init+0x10/0x10 <4>[ 17.354587]
> > do_one_initcall+0x5c/0x2b0 <4>[ 17.354600]
> > kernel_init_freeable+0x18e/0x340 <4>[ 17.354612] ?
> > __pfx_kernel_init+0x10/0x10 <4>[ 17.354623] kernel_init+0x15/0x130
> > <4>[ 17.354631] ret_from_fork+0x2c/0x50 <4>[ 17.354641] ?
> > __pfx_kernel_init+0x10/0x10 <4>[ 17.354650]
> > ret_from_fork_asm+0x1a/0x30 <4>[ 17.354663] </TASK> <4>[
> > 17.354669] irq event stamp: 28577685 <4>[ 17.354677] hardirqs last
> > enabled at (28577693): [<ffffffff8117c060>]
> > console_unlock+0x110/0x120 <4>[ 17.354697] hardirqs last disabled
> > at (28577700): [<ffffffff8117c045>] console_unlock+0xf5/0x120 <4>[
> > 17.354713] softirqs last enabled at (28577176): [<ffffffff810df29c>]
> > handle_softirqs+0x2ec/0x3f0 <4>[ 17.354731] softirqs last disabled
> > at (28577167): [<ffffffff810dfa17>] irq_exit_rcu+0x87/0xc0 <4>[
> > 17.354747] ---[ end trace 0000000000000000 ]---
> >
> > <4>[ 17.487274] =====================================
> > <4>[ 17.487277] WARNING: bad unlock balance detected!
> > <4>[ 17.487279] 6.10.0-rc1-Patchwork_134112v1-gabaeae202dfb+ #1
> > Tainted: G W <4>[ 17.487282]
> > ------------------------------------- <4>[ 17.487284] swapper/0/1
> > is trying to release lock (10000:e1:00.0) at: <4>[ 17.487287]
> > [<ffffffff8176b377>] pci_cfg_access_unlock+0x57/0x60 <4>[
> > 17.487292] but there are no more locks to release! <4>[ 17.487294]
> > other info that might help us debug this:
> > <4>[ 17.487297] 2 locks held by swapper/0/1:
> > <4>[ 17.487299] #0: ffff888102c1c1b0 (&dev->mutex){....}-{3:3},
> > at: __driver_attach+0xab/0x180 <4>[ 17.487306] #1:
> > ffff8881056041b0 (&dev->mutex){....}-{3:3}, at:
> > pci_dev_trylock+0x19/0x50 <4>[ 17.487312] stack backtrace:
> > <4>[ 17.487314] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W
> > 6.10.0-rc1-Patchwork_134112v1-gabaeae202dfb+ #1 <4>[ 17.487318]
> > Hardware name: Intel Corporation Alder Lake Client
> > Platform/AlderLake-P LP5 RVP, BIOS RPLPFWI1.R00.4035.A00.2301200723
> > 01/20/2023 <4>[ 17.487322] Call Trace: <4>[ 17.487324] <TASK>
> > <4>[ 17.487325] dump_stack_lvl+0x82/0xd0 <4>[ 17.487329]
> > lock_release+0x20b/0x2d0 <4>[ 17.487334] pci_bus_unlock+0x25/0x40
> > <4>[ 17.487337] pci_reset_bus+0x1eb/0x270
> > <4>[ 17.487340] vmd_probe+0x778/0xa10
> > <4>[ 17.487344] pci_device_probe+0x95/0x120
> > <4>[ 17.487346] really_probe+0xd9/0x370
> > <4>[ 17.487349] ? __pfx___driver_attach+0x10/0x10
> > <4>[ 17.487352] __driver_probe_device+0x73/0x150
> > <4>[ 17.487354] driver_probe_device+0x19/0xa0
> > <4>[ 17.487357] __driver_attach+0xb6/0x180
> > <4>[ 17.487359] ? __pfx___driver_attach+0x10/0x10
> > <4>[ 17.487362] bus_for_each_dev+0x77/0xd0
> > <4>[ 17.487365] bus_add_driver+0x110/0x240
> > <4>[ 17.487369] driver_register+0x5b/0x110
> > <4>[ 17.487371] ? __pfx_vmd_drv_init+0x10/0x10
> > <4>[ 17.487374] do_one_initcall+0x5c/0x2b0
> > <4>[ 17.487378] kernel_init_freeable+0x18e/0x340
> > <4>[ 17.487381] ? __pfx_kernel_init+0x10/0x10
> > <4>[ 17.487384] kernel_init+0x15/0x130
> > <4>[ 17.487387] ret_from_fork+0x2c/0x50
> > <4>[ 17.487390] ? __pfx_kernel_init+0x10/0x10
> > <4>[ 17.487392] ret_from_fork_asm+0x1a/0x30
> > <4>[ 17.487396] </TASK>
> >
More information about the Intel-gfx
mailing list