[PATCH 02/12] PCI/ACPI: Add PERST# Assertion Delay _DSM method
Bjorn Helgaas
helgaas at kernel.org
Wed Apr 2 18:48:47 UTC 2025
On Wed, Apr 02, 2025 at 07:51:29PM +0200, Rafael J. Wysocki wrote:
> On Wed, Apr 2, 2025 at 5:50 PM Bjorn Helgaas <helgaas at kernel.org> wrote:
> > On Wed, Apr 02, 2025 at 04:52:55PM +0200, Rafael J. Wysocki wrote:
> > > On Wed, Apr 2, 2025 at 4:21 PM Bjorn Helgaas <helgaas at kernel.org> wrote:
> > > > On Wed, Apr 02, 2025 at 01:06:42PM +0200, Rafael J. Wysocki wrote:
> > > > > On Tue, Apr 1, 2025 at 5:36 PM Anshuman Gupta <anshuman.gupta at intel.com> wrote:
> > > > > >
> > > > > > Implement _DSM Method 11 as per PCI firmware specs
> > > > > > section 4.6.11 Rev 3.3.
> > > >
> > > > > > +int pci_acpi_add_perst_assertion_delay(struct pci_dev *dev, u32 delay_us)
> > > > > > +{
> > > > > > + union acpi_object in_obj = {
> > > > > > + .integer.type = ACPI_TYPE_INTEGER,
> > > > > > + .integer.value = delay_us,
> > > > > > + };
> > > > > > +
> > > > > > + union acpi_object *out_obj;
> > > > > > + acpi_handle handle;
> > > > > > + int result, ret = -EINVAL;
> > > > > > +
> > > > > > + if (!dev || !ACPI_HANDLE(&dev->dev))
> > > > > > + return -EINVAL;
> > > > > > +
> > > > > > + handle = ACPI_HANDLE(&dev->dev);
> > > > > > +
> > > > > > + out_obj = acpi_evaluate_dsm_typed(handle, &pci_acpi_dsm_guid, 4,
> > > > >
> > > > > This is something I haven't noticed in the previous patch, but also
> > > > > applies to it.
> > > > >
> > > > > Why is rev 4 of the interface hard-coded here?
> > > >
> > > > Thanks for asking this because it's related to the whole _DSM revision
> > > > question that I don't understand.
> > > >
> > > > If we didn't use rev 4 here, what should we use? The PCI Firmware
> > > > spec, r3.3, sec 4.6.11, documents this interface and says "lowest
> > > > valid Revision ID value is 4", so that's the source of the 4.
> > >
> > > Well, the "lowest valid Revision ID" does not generally mean the "only
> > > valid Revision ID".
> >
> > True, but why should the kernel change from using the tested revision
> > ID to some other revision ID? What would be the benefit of using rev
> > 5?
>
> TBH I'm not exactly buying the "tested revision ID" concept because
> what does it really mean?
>
> If a given _DSM interface has been tested on one platform, does it
> necessarily mean that it will work on all of the other platforms
> implementing it?
No, all we can say is that this OS rev 4 code works on the platforms
we've tested. Other platforms might have their own defects.
My point is that this OS code was written to the rev 4 spec and has
been tested, and we shouldn't change the code or the revision it uses
unless we're fixing something.
> > > Now, I'm not sure how likely it is for the PCI Firmware spec to
> > > bump up the revision of this interface (I suppose that it will
> > > do so if a new function is defined), but even if it does so, the
> > > kernel will have to check both the new revision and rev 4
> > > anyway, in case the firmware doesn't know about the new
> > > revision.
> >
> > I guess the reason the OS would check both rev 5 and rev 4 would
> > be that a new platform might implement
> > DSM_PCI_PERST_ASSERTION_DELAY only at rev 5? I think this would
> > be a real mistake in terms of implementing something to the spec.
>
> This is a real possibility, however, because there's nothing to
> prevent this kind of spec interpretation.
>
> I didn't mean this, though.
>
> Say the kernel wants to use a function that is only defined at rev
> 5. It will invoke function 0 at rev 5 to see if this function is
> available, but then it may as well see if the other functions it
> wants to use are available at rev 5 because it will get this
> information from the same _DSM call. If the platform firmware
> responds that they all are implemented, why won't the kernel use
> them all at rev 5?
This is an unnecessary change, since the OS tested its rev 4 code and
now we're saying the OS should use rev 5 instead, which the OS did not
test. The reason rev 5 exists at all is probably that some completely
unrelated new function was added, and somebody decided it should be
rev 5.
I guess function 0 *could* have been defined to answer "yes/no" about
whether a single (function, revision) is implemented. Then we
probably wouldn't be having this conversation.
But we do get an entire bitmask of implemented functions back from
function 0, which allows the possibility of using the same revision
for all the functions. I suppose we could have a boot-time function
that calls function 0 with rev 0, 1, 2, ..., until it gets an empty
bitmask, and saves the highest rev with a non-empty mask.
In PCI we don't do that; instead, we call acpi_check_dsm() before
every acpi_evaluate_dsm(). The main reason is that I don't think it's
safe to change the function X rev just because function Y was added
later with a higher revision.
> > The spec documents DSM_PCI_PERST_ASSERTION_DELAY rev 4. Some
> > platforms implemented and tested DSM_PCI_PERST_ASSERTION_DELAY rev
> > 4. Linux added and tested support for
> > DSM_PCI_PERST_ASSERTION_DELAY rev 4. I propose new platforms
> > should also implement and test DSM_PCI_PERST_ASSERTION_DELAY rev
> > 4.
> >
> > If a new platform implements only DSM_PCI_PERST_ASSERTION_DELAY
> > rev 5, there is no actual documentation for that rev other than
> > the spec assertion that "rev 5 must support all functions defined
> > by previous revs of the UUID." IMO this is nonsense. The
> > platform might have no need to implement all the functions defined
> > for the UUID.
>
> Sure. That's why function 0 should always be used.
Yes. But the requirement that rev 5 must support all functions ever
previously documented for the UUID still makes no sense to me. I
don't think there's any requirement that a platform implement ALL of
the documented PCI functions.
Maybe the intent is that this only applies to a given platform, i.e.,
that new firmware for that platform must continue to support all
functions and revisions that were ever supported on that platform?
That seems obvious to me and hardly worth mentioning.
> > Even if the platform makes all its functions valid for "the lowest
> > valid Revision ID" through the "current Revision ID", there's
> > nothing other than the spec assertion to guarantee that they all
> > behave the same. And dealing with multiple revisions that are
> > "supposed" to be the same just makes work and risk for the OS.
>
> You seem to be questioning the interface versioning at large by
> saying that there is no real guarantee of backwards compatibility
> between consecutive interface revisions on the same platform.
That's exactly what I'm saying, although I think problems are more
likely across different platforms. There's the requirement in the
spec, but that's just words on a page. There's no way to enforce or
validate it.
> Presumably, though, the interface is adhering to some specification
> which defines the behavior of the functions (and the set of
> available functions in the first place) for all of the valid
> revisions of it. The OS and the platform firmware both follow the
> same spec and so they should be mutually compatible. If this
> particular spec defines rev 5 to be an exact superset of rev 4,
> there is no reason to expect anything else.
I don't *expect* rev 5 to be different. But as a user of it, why
would I change working, tested code that is not broken?
The PCI DPC function 0xc is an example where rev 5 (per ECN) and rev 6
(per r3.3) are not compatible.
If the OS implemented rev 5, then learns via function 0 that function
0xc is also supported at rev 6, and starts using the same OS code with
rev 6, the OS is broken.
Bjorn
More information about the Intel-xe
mailing list