[PATCH 01/12] PCI/ACPI: Add D3cold Aux Power Limit_DSM method
Rafael J. Wysocki
rafael at kernel.org
Wed Apr 2 10:59:06 UTC 2025
On Tue, Apr 1, 2025 at 8:25 PM Bjorn Helgaas <helgaas at kernel.org> wrote:
>
> On Tue, Apr 01, 2025 at 09:02:14PM +0530, Anshuman Gupta wrote:
> > Implement _DSM method 10 and _DSM Method 11 as per PCI firmware specs
> > section 4.6.10 Rev 3.3.
>
> Thanks for splitting this into two patches. But I think now this only
> implements function 10 (0x0a), so this sentence needs to be updated.
>
> I would write this consistently as "0x0a" or "0Ah" to match the spec
> description. I don't think the spec ever uses "10".
>
> > Note that this implementation assumes only a single device below the
> > Downstream Port will request for Aux Power Limit under a given
> > Root Port because it does not track and aggregate requests
> > from all child devices below the Downstream Port as required
> > by Section 4.6.10 Rev 3.3.
Why is it regarded as compliant, then?
Request aggregation is a known problem that has been addressed for a
couple of times (at least) in the kernel. Why is it too hard to
address it here?
> > One possible mitigation would be only allowing only first PCIe
> > Non-Bridge Endpoint Function 0 driver to call_DSM method 10.
What about topologies where there is a switch with multiple downstream
ports below the Root Port?
> >
> > Signed-off-by: Varun Gupta <varun.gupta at intel.com>
> > Signed-off-by: Badal Nilawar <badal.nilawar at intel.com>
> > Signed-off-by: Anshuman Gupta <anshuman.gupta at intel.com>
> > ---
> > drivers/pci/pci-acpi.c | 78 ++++++++++++++++++++++++++++++++++++++++
> > include/linux/pci-acpi.h | 6 ++++
> > 2 files changed, 84 insertions(+)
> >
> > diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
> > index af370628e583..ebd49e43457e 100644
> > --- a/drivers/pci/pci-acpi.c
> > +++ b/drivers/pci/pci-acpi.c
> > @@ -1421,6 +1421,84 @@ static void pci_acpi_optimize_delay(struct pci_dev *pdev,
> > ACPI_FREE(obj);
> > }
> >
> > +/**
> > + * pci_acpi_request_d3cold_aux_power - Request D3cold aux power via ACPI DSM
> > + * @dev: PCI device instance
> > + * @requested_power: Requested auxiliary power in milliwatts
How's the caller supposed to find out what value to use here?
> > + *
> > + * This function sends a request to the host BIOS via ACPI _DSM Function 10
> > + * to grant the required D3Cold Auxiliary power for the specified PCI device.
> > + * It evaluates the _DSM (Device Specific Method) to request the Auxiliary
> > + * power and handles the response accordingly.
> > + *
> > + * This function shall be only called by 1st non-bridge Endpoint driver
> > + * on Function 0. For a Multi-Function Device, the driver for Function 0 is
> > + * required to report an aggregate power requirement covering all
> > + * functions contained within the device.
> > + *
> > + * Return: Returns 0 on success and errno on failure.
>
> Write all this in imperative mood, e.g.,
>
> Request auxiliary power while device is in D3cold ...
>
> Return 0 on success ...
>
> > + */
> > +int pci_acpi_request_d3cold_aux_power(struct pci_dev *dev, u32 requested_power)
> > +{
> > + union acpi_object in_obj = {
> > + .integer.type = ACPI_TYPE_INTEGER,
> > + .integer.value = requested_power,
> > + };
> > +
> > + union acpi_object *out_obj;
> > + acpi_handle handle;
> > + int result, ret = -EINVAL;
> > +
> > + if (!dev || !ACPI_HANDLE(&dev->dev))
> > + return -EINVAL;
> > +
> > + handle = ACPI_HANDLE(&dev->dev);
Well, ACPI_HANDLE() is not a simple macro, so I'd rather avoid using
it twice in a row for the same device.
What about
if (!dev)
return -EINVAL;
handle = ACPI_HANDLE(&dev->dev);
if (!handle)
return -EINVAL;
>
> This needs an acpi_check_dsm() call to find out whether the platform
> supports DSM_PCI_D3COLD_AUX_POWER_LIMIT.
Absolutely.
> We have several _DSM calls that *should* do this, but unfortunately
> they don't do it yet, so they're not good examples to copy.
Right.
> > + out_obj = acpi_evaluate_dsm_typed(handle,
> > + &pci_acpi_dsm_guid,
> > + 4,
> > + DSM_PCI_D3COLD_AUX_POWER_LIMIT,
> > + &in_obj,
> > + ACPI_TYPE_INTEGER);
> > + if (!out_obj)
> > + return -EINVAL;
> > +
> > + result = out_obj->integer.value;
> > +
> > + switch (result) {
> > + case 0x0:
> > + dev_dbg(&dev->dev, "D3cold Aux Power %umW request denied\n",
> > + requested_power);
>
> Use pci_dbg(dev), pci_info(dev), etc.
>
> > + break;
> > + case 0x1:
> > + dev_info(&dev->dev, "D3cold Aux Power request granted: %umW\n",
> > + requested_power);
> > + ret = 0;
> > + break;
> > + case 0x2:
> > + dev_info(&dev->dev, "D3cold Aux Power: Main power won't be removed\n");
> > + ret = -EBUSY;
> > + break;
> > + default:
> > + if (result >= 0x11 && result <= 0x1F) {
> > + int retry_interval = result & 0xF;
> > +
> > + dev_warn(&dev->dev, "D3cold Aux Power request needs retry."
> > + "Interval: %d seconds.\n", retry_interval);
> > + msleep(retry_interval * 1000);
>
> It doesn't seem right to me to do this sleep internally because it
> means this interface might take up to 15 seconds to return, and the
> caller has no idea what is happening during that time.
I entirely agree here.
If this is going to happen during system suspend, sleeping for seconds
is a total no-go.
> This seems like it should be done in the driver, which can decide
> *whether* it wants to sleep, and if it does sleep, it may give a nice
> indication to the user.
So during a system suspend in progress, this is rather hard to achieve.
I'd say that if the request is not granted right away, it is a failure
and we don't use D3cold.
> Of course, that would mean returning some kind of retry interval
> information to the caller.
>
> > + ret = -EAGAIN;
> > + } else {
> > + dev_err(&dev->dev, "D3cold Aux Power: Reserved or "
> > + "unsupported response: 0x%x.\n", result);
>
> Drop periods at end of messages.
>
> > + }
> > + break;
> > + }
> > +
> > + ACPI_FREE(out_obj);
> > + return ret;
> > +}
> > +EXPORT_SYMBOL(pci_acpi_request_d3cold_aux_power);
More information about the Intel-xe
mailing list