[PATCH 01/12] PCI/ACPI: Add D3cold Aux Power Limit_DSM method

Bjorn Helgaas helgaas at kernel.org
Tue Apr 1 18:25:45 UTC 2025


On Tue, Apr 01, 2025 at 09:02:14PM +0530, Anshuman Gupta wrote:
> Implement _DSM method 10 and _DSM Method 11 as per PCI firmware specs
> section 4.6.10 Rev 3.3.

Thanks for splitting this into two patches.  But I think now this only
implements function 10 (0x0a), so this sentence needs to be updated.

I would write this consistently as "0x0a" or "0Ah" to match the spec
description.  I don't think the spec ever uses "10".

> Note that this implementation assumes only a single device below the
> Downstream Port will request for Aux Power Limit under a given
> Root Port because it does not track and aggregate requests
> from all child devices below the Downstream Port as required
> by Section 4.6.10 Rev 3.3.
> 
> One possible mitigation would be only allowing only first PCIe
> Non-Bridge Endpoint Function 0 driver to call_DSM method 10.
> 
> Signed-off-by: Varun Gupta <varun.gupta at intel.com>
> Signed-off-by: Badal Nilawar <badal.nilawar at intel.com>
> Signed-off-by: Anshuman Gupta <anshuman.gupta at intel.com>
> ---
>  drivers/pci/pci-acpi.c   | 78 ++++++++++++++++++++++++++++++++++++++++
>  include/linux/pci-acpi.h |  6 ++++
>  2 files changed, 84 insertions(+)
> 
> diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
> index af370628e583..ebd49e43457e 100644
> --- a/drivers/pci/pci-acpi.c
> +++ b/drivers/pci/pci-acpi.c
> @@ -1421,6 +1421,84 @@ static void pci_acpi_optimize_delay(struct pci_dev *pdev,
>  	ACPI_FREE(obj);
>  }
>  
> +/**
> + * pci_acpi_request_d3cold_aux_power - Request D3cold aux power via ACPI DSM
> + * @dev: PCI device instance
> + * @requested_power: Requested auxiliary power in milliwatts
> + *
> + * This function sends a request to the host BIOS via ACPI _DSM Function 10
> + * to grant the required D3Cold Auxiliary power for the specified PCI device.
> + * It evaluates the _DSM (Device Specific Method) to request the Auxiliary
> + * power and handles the response accordingly.
> + *
> + * This function shall be only called by 1st non-bridge Endpoint driver
> + * on Function 0. For a Multi-Function Device, the driver for Function 0 is
> + * required to report an aggregate power requirement covering all
> + * functions contained within the device.
> + *
> + * Return: Returns 0 on success and errno on failure.

Write all this in imperative mood, e.g.,

  Request auxiliary power while device is in D3cold ...

  Return 0 on success ...

> + */
> +int pci_acpi_request_d3cold_aux_power(struct pci_dev *dev, u32 requested_power)
> +{
> +	union acpi_object in_obj = {
> +		.integer.type = ACPI_TYPE_INTEGER,
> +		.integer.value = requested_power,
> +	};
> +
> +	union acpi_object *out_obj;
> +	acpi_handle handle;
> +	int result, ret = -EINVAL;
> +
> +	if (!dev || !ACPI_HANDLE(&dev->dev))
> +		return -EINVAL;
> +
> +	handle = ACPI_HANDLE(&dev->dev);

This needs an acpi_check_dsm() call to find out whether the platform
supports DSM_PCI_D3COLD_AUX_POWER_LIMIT.

We have several _DSM calls that *should* do this, but unfortunately
they don't do it yet, so they're not good examples to copy.

> +	out_obj = acpi_evaluate_dsm_typed(handle,
> +					  &pci_acpi_dsm_guid,
> +					  4,
> +					  DSM_PCI_D3COLD_AUX_POWER_LIMIT,
> +					  &in_obj,
> +					  ACPI_TYPE_INTEGER);
> +	if (!out_obj)
> +		return -EINVAL;
> +
> +	result = out_obj->integer.value;
> +
> +	switch (result) {
> +	case 0x0:
> +		dev_dbg(&dev->dev, "D3cold Aux Power %umW request denied\n",
> +			requested_power);

Use pci_dbg(dev), pci_info(dev), etc.

> +		break;
> +	case 0x1:
> +		dev_info(&dev->dev, "D3cold Aux Power request granted: %umW\n",
> +			 requested_power);
> +		ret = 0;
> +		break;
> +	case 0x2:
> +		dev_info(&dev->dev, "D3cold Aux Power: Main power won't be removed\n");
> +		ret = -EBUSY;
> +		break;
> +	default:
> +		if (result >= 0x11 && result <= 0x1F) {
> +			int retry_interval = result & 0xF;
> +
> +			dev_warn(&dev->dev, "D3cold Aux Power request needs retry."
> +				 "Interval: %d seconds.\n", retry_interval);
> +			msleep(retry_interval * 1000);

It doesn't seem right to me to do this sleep internally because it
means this interface might take up to 15 seconds to return, and the
caller has no idea what is happening during that time.

This seems like it should be done in the driver, which can decide
*whether* it wants to sleep, and if it does sleep, it may give a nice
indication to the user.

Of course, that would mean returning some kind of retry interval
information to the caller.

> +			ret = -EAGAIN;
> +		} else {
> +			dev_err(&dev->dev, "D3cold Aux Power: Reserved or "
> +				"unsupported response: 0x%x.\n", result);

Drop periods at end of messages.

> +		}
> +		break;
> +	}
> +
> +	ACPI_FREE(out_obj);
> +	return ret;
> +}
> +EXPORT_SYMBOL(pci_acpi_request_d3cold_aux_power);


More information about the Intel-xe mailing list