[PATCH v4 5/7] PCI/IOV: Check that VF BAR fits within the reservation

Michał Winiarski michal.winiarski at intel.com
Wed Oct 30 11:43:19 UTC 2024


On Mon, Oct 28, 2024 at 11:56:04AM -0500, Bjorn Helgaas wrote:
> On Fri, Oct 25, 2024 at 11:50:36PM +0200, Michał Winiarski wrote:
> > VF MMIO resource reservation, either created by system firmware and
> > inherited by Linux PCI subsystem or created by the subsystem itself,
> > should contain enough space to fit the BAR of all SR-IOV Virtual
> > Functions that can potentially be created (total VFs supported by the
> > device).
> 
> I don't think "VF resource reservation ... should contain enough
> space" is really accurate or actionable.  It would be *nice* if the PF
> BAR is large enough to accommodate the largest supported VF BARs for
> all possible VFs, but if it doesn't, it's not really an error.  It's
> just a reflection of the fact that resource space is limited.

>From PCI perspective, you're right, IOV resources are optional, and it's
not really an error for PF device itself.
>From IOV perspective - we do need those resources to be able to create
VFs.

All I'm trying to say here, is that the context of the change is the
"success" case, where the VF BAR reservation was successfully assigned,
and the PF will be able to create VFs.
The case where there were not enough resources for VF BAR (and PF won't
be able to create VFs) remains unchanged.

> > However, that assumption only holds in an environment where VF BAR size
> > can't be modified.
> 
> There's no reason to assume anything about how many VF BARs fit.  The
> existing code should avoid enabling the requested nr_virtfn VFs if the
> PF doesn't have enough space -- I think that's what the "if
> (res->parent)" is supposed to be checking.
> 
> The fact that you need a change here makes me suspect that we're
> missing some resource claim (and corresponding res->parent update)
> elsewhere when resizing the VF BAR.

My understanding is that res->parent is only expressing that the
resource is assigned.
We don't really want to change that, the resource is still there and is
assigned - we just want to make sure that VF enabling fails if the
caller wants to enable more VFs than possible for current resource size.

Let's use an example. A device with a single BAR.
initial_vf_bar_size = X
total_vfs = 4
supported_vf_resizable_bar_sizes = X, 2X, 4X

With that - the initial underlying resource looks like this:
            +----------------------+
            |+--------------------+|
            ||                    ||
            |+--------------------+|
            |+--------------------+|
            ||                    ||
            |+--------------------+|
            |+--------------------+|
            ||                    ||
            |+--------------------+|
            |+--------------------+|
            ||                    ||
            |+--------------------+|
            +----------------------+
Its size is 4X, and it contains BAR for 4 VFs.
"resource_size >= vf_bar_size * num_vfs" is true for any num_vfs
Let's assume that there are enough resources to assign it.

Patch 4/7 allows to resize the entire resource (in addition to changing
the VF BAR size), which means that after calling:
pci_resize_resource() with size = 2X, the underlying resource will look
like this:
            +----------------------+ 
            |+--------------------+| 
            ||                    || 
            ||                    || 
            ||                    || 
            ||                    || 
            |+--------------------+| 
            |+--------------------+| 
            ||                    || 
            ||                    || 
            ||                    || 
            ||                    || 
            |+--------------------+| 
            |+--------------------+| 
            ||                    || 
            ||                    || 
            ||                    || 
            ||                    || 
            |+--------------------+| 
            |+--------------------+| 
            ||                    || 
            ||                    || 
            ||                    || 
            ||                    || 
            |+--------------------+| 
            +----------------------+ 
Its size is 8X, and it contains BAR for 4 VFs.
"resource_size >= vf_bar_size * num_vfs" is true for any num_vfs
It does require an extra 4X of MMIO resources, so this can fail in
resource constrained environment, even though the original 4X resource
was able to be assigned.

The following patch 6/7 allows to change VF BAR size without touching
the underlying reservation size.
After calling pci_iov_vf_bar_set_size() to 4X and enabling a single VF,
the underlying resource will look like this:
            +----------------------+ 
            |+--------------------+| 
            ||░░░░░░░░░░░░░░░░░░░░|| 
            ||░░░░░░░░░░░░░░░░░░░░|| 
            ||░░░░░░░░░░░░░░░░░░░░|| 
            ||░░░░░░░░░░░░░░░░░░░░|| 
            ||░░░░░░░░░░░░░░░░░░░░|| 
            ||░░░░░░░░░░░░░░░░░░░░|| 
            ||░░░░░░░░░░░░░░░░░░░░|| 
            ||░░░░░░░░░░░░░░░░░░░░|| 
            ||░░░░░░░░░░░░░░░░░░░░|| 
            ||░░░░░░░░░░░░░░░░░░░░|| 
            |+--------------------+| 
            +----------------------+ 
Its size is 4X, but since pci_iov_vf_bar_set_size() was called, it is no
longer able to accomodate 4 VFs.
"resource_size >= vf_bar_size * num_vfs" is only true for num_vfs = 1
and any attempts to create more than 1 VF should fail.
We don't need to worry about being MMIO resource constrained, no extra
MMIO resources are needed.

-Michał

> > Add an additional check that verifies that VF BAR for all enabled VFs
> > fits within the underlying reservation resource.
> > 
> > Signed-off-by: Michał Winiarski <michal.winiarski at intel.com>
> > ---
> >  drivers/pci/iov.c | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
> > index 79143c1bc7bb4..5de828e5a26ea 100644
> > --- a/drivers/pci/iov.c
> > +++ b/drivers/pci/iov.c
> > @@ -645,10 +645,14 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
> >  
> >  	nres = 0;
> >  	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
> > +		int vf_bar_sz = pci_iov_resource_size(dev,
> > +						      pci_resource_to_iov(i));
> >  		bars |= (1 << pci_resource_to_iov(i));
> >  		res = &dev->resource[pci_resource_to_iov(i)];
> > -		if (res->parent)
> > -			nres++;
> > +		if (!res->parent || vf_bar_sz * nr_virtfn > resource_size(res))
> > +			continue;
> > +
> > +		nres++;
> >  	}
> >  	if (nres != iov->nres) {
> >  		pci_err(dev, "not enough MMIO resources for SR-IOV\n");
> > -- 
> > 2.47.0
> > 


More information about the Intel-xe mailing list