<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
Am 10.10.24 um 10:59 schrieb Michał Winiarski:<br>
<blockquote type="cite" cite="mid:47iala6cl7tks7tw3wcrxm43y4xl4w24u6gfw5tekdcuabhndx@t37nyrqysrb5">
<pre class="moz-quote-pre" wrap="">On Fri, Sep 20, 2024 at 12:07:34PM +0200, Christian König wrote:
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">Am 20.09.24 um 00:35 schrieb Michał Winiarski:
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">[SNIP]
@@ -487,6 +567,11 @@ static ssize_t sriov_numvfs_store(struct device *dev,
goto exit;
}
+ for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
+ if (pdev->sriov->rebar_extend[i])
+ pci_iov_resource_do_extend(pdev, i + PCI_IOV_RESOURCES, num_vfs);
+ }
+
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
That sounds like a really bad idea to me.
Basically the suggestion is here that the PCI subsystem should silently
extend and shrink the VF BARs when the number of VFs change?
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
Why do you think it's a bad idea? Everything is under PCI subsystem
control and the driver in charge has to explicitly opt-in to this
behavior on a per-BAR basis.</pre>
</blockquote>
<br>
And exactly that's a bad idea. The PCI subsystem shouldn't control
this, the driver should.<br>
<br>
At least for some devices we have tons of interactions with ACPI and
EFI. Only the driver does know for example when platform drivers
which might be in the way for a resize have been unloaded.<br>
<br>
From the past experience BAR resize should only be triggered by the
driver and never from the PCI subsystem while scanning the bus or
probing devices.<br>
<br>
<span style="white-space: pre-wrap">
</span>
<blockquote type="cite" cite="mid:47iala6cl7tks7tw3wcrxm43y4xl4w24u6gfw5tekdcuabhndx@t37nyrqysrb5">
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">Bjorn has the last word on that but I would say that instead the driver
owning the PCIe device as hypervisor should resize the VF BARs to a desired
size and that in turn restricts the number of VFs you can enable.
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
Then the PCI subsystem would silently change the driver_max_VFs (or new
variable, as driver_max_VFs is under PF control, so it's either new var
or checking VF BAR size in pci_sriov_set_totalvfs).</pre>
</blockquote>
<br>
Nope, the PCI subsystem should not magically adjust anything.<br>
<br>
What should happen instead is that the driver would call
pci_enable_sriov() with the number of virtual functions to enable
and the PCI subsystem then validates that number and return -EINVAL
or -ENOSPC if it won't work.<br>
<br>
<blockquote type="cite" cite="mid:47iala6cl7tks7tw3wcrxm43y4xl4w24u6gfw5tekdcuabhndx@t37nyrqysrb5">
<pre class="moz-quote-pre" wrap="">It also means that we have to do the maths to calculate the new VF limit
in both PCI subsystem and the caller.</pre>
</blockquote>
<br>
Well the point is that those calculations are different.<br>
<br>
What the subsystem does is to validate if with the number of
requested virtual functions the necessary resources will fit into
the allocate space.<br>
<br>
What the driver does previously is to either change the allocate
space or calculate the other way around and determine the maximum
virtual functions from the space available.<br>
<br>
<blockquote type="cite" cite="mid:47iala6cl7tks7tw3wcrxm43y4xl4w24u6gfw5tekdcuabhndx@t37nyrqysrb5">
<pre class="moz-quote-pre" wrap="">We can go this route as well - I just think it's cleaner to keep this
all under PCI subsystem control.</pre>
</blockquote>
<br>
I think that would be much cleaner, especially the PCI subsystem
shouldn't adjust any values given from the driver or even more
general overrule decisions the driver made.<br>
<br>
Instead proper error codes should be returned if some values don't
make sense or the subsystem isn't able to move around BARs currently
in use etc...<br>
<br>
Regards,<br>
Christian.<br>
<br>
<blockquote type="cite" cite="mid:47iala6cl7tks7tw3wcrxm43y4xl4w24u6gfw5tekdcuabhndx@t37nyrqysrb5">
<pre class="moz-quote-pre" wrap="">
I'll keep the current behavior in v3, but I'm open to changing it.
Thanks,
-Michał
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">
Regards,
Christian.
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap=""> ret = pdev->driver->sriov_configure(pdev, num_vfs);
if (ret < 0)
goto exit;
@@ -881,8 +966,13 @@ static int sriov_init(struct pci_dev *dev, int pos)
static void sriov_release(struct pci_dev *dev)
{
+ int i;
+
BUG_ON(dev->sriov->num_VFs);
+ for (i = 0; i < PCI_SRIOV_NUM_BARS; i++)
+ pci_iov_resource_do_restore(dev, i + PCI_IOV_RESOURCES);
+
if (dev != dev->sriov->dev)
pci_dev_put(dev->sriov->dev);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index e763b3fd4c7a2..47ed2633232aa 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -385,6 +385,7 @@ struct pci_sriov {
u16 subsystem_vendor; /* VF subsystem vendor */
u16 subsystem_device; /* VF subsystem device */
resource_size_t barsz[PCI_SRIOV_NUM_BARS]; /* VF BAR size */
+ bool rebar_extend[PCI_SRIOV_NUM_BARS]; /* Resize VF BAR */
bool drivers_autoprobe; /* Auto probing of VFs by driver */
};
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 4cf89a4b4cbcf..c007119da7b3d 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2364,6 +2364,7 @@ int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs);
int pci_sriov_get_totalvfs(struct pci_dev *dev);
int pci_sriov_configure_simple(struct pci_dev *dev, int nr_virtfn);
resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno);
+int pci_iov_resource_extend(struct pci_dev *dev, int resno, bool enable);
void pci_vf_drivers_autoprobe(struct pci_dev *dev, bool probe);
/* Arch may override these (weak) */
@@ -2416,6 +2417,8 @@ static inline int pci_sriov_get_totalvfs(struct pci_dev *dev)
#define pci_sriov_configure_simple NULL
static inline resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno)
{ return 0; }
+static inline void pci_iov_resource_extend(struct pci_dev *dev, int resno, bool enable)
+{ return -ENODEV; }
static inline void pci_vf_drivers_autoprobe(struct pci_dev *dev, bool probe) { }
#endif
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
</pre>
</blockquote>
</blockquote>
<br>
</body>
</html>