Lockdep annotation introduced warn in VMD driver
Dan Williams
dan.j.williams at intel.com
Wed May 29 19:34:19 UTC 2024
Imre Deak wrote:
[..]
> > > Also Imre tried with 2 PCI patches together https://patchwork.freedesktop.org/series/134193/
> > > And still not good for those 4 systems (mtlp-9, bat-dg2-13/14 and bat-adlp-11) :
> > > https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_134193v1/index.html?
> > > Dave, Dan, thoughts?
> >
> > Can you provide the dmesg from the failure system with the 2 patches applied please?
>
> For the above 4 machines, mtlp-9 not having the originally reported WARN
> (at pci.c:4886) only some other lockdep issue, while the other 3
> machines having both the originally reported one and the other lockdep
> issue:
> https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_134193v1/bat-mtlp-9/boot0.txt
This one does not seem to implicate cfg_access_lock at all. I wonder if
you revert the lockdep annotation completely if it still fails. I.e.
this is a new lockdep report for v6.10-rc independent of this new
cfg_access_lock annotation.
> https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_134193v1/bat-dg2-13/boot0.txt
> https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_134193v1/bat-dg2-14/boot0.txt
> https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_134193v1/bat-adlp-11/boot0.txt
These are all identical and are pointing out that vmd, via
pci_reset_bus(), has long been performing an unlocked secondary bus
reset that userspace could race and confuse the kernel.
I think the fix for that is below, but this is an increasingly spicy
level of change that gives me some pause, i.e. teach pci_bus_lock() to
lock the bridge itself:
-- 8< --
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 59e0949fb079..ac3999bc59e8 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5442,6 +5442,7 @@ static void pci_bus_lock(struct pci_bus *bus)
{
struct pci_dev *dev;
+ pci_dev_lock(bus->self);
list_for_each_entry(dev, &bus->devices, bus_list) {
pci_dev_lock(dev);
if (dev->subordinate)
@@ -5459,6 +5460,7 @@ static void pci_bus_unlock(struct pci_bus *bus)
pci_bus_unlock(dev->subordinate);
pci_dev_unlock(dev);
}
+ pci_dev_unlock(bus->self);
}
/* Return 1 on successful lock, 0 on contention */
@@ -5466,6 +5468,7 @@ static int pci_bus_trylock(struct pci_bus *bus)
{
struct pci_dev *dev;
+ pci_dev_lock(bus->self);
list_for_each_entry(dev, &bus->devices, bus_list) {
if (!pci_dev_trylock(dev))
goto unlock;
@@ -5484,6 +5487,7 @@ static int pci_bus_trylock(struct pci_bus *bus)
pci_bus_unlock(dev->subordinate);
pci_dev_unlock(dev);
}
+ pci_dev_unlock(bus->self);
return 0;
}
More information about the Intel-gfx
mailing list