[PATCH 03/12] PCI/ACPI: Add aux power grant notifier

Rafael J. Wysocki rafael at kernel.org
Thu Apr 3 13:48:48 UTC 2025


On Thu, Apr 3, 2025 at 9:57 AM Gupta, Anshuman <anshuman.gupta at intel.com> wrote:
>
> > -----Original Message-----
> > From: Bjorn Helgaas <helgaas at kernel.org>
> > Sent: Wednesday, April 2, 2025 1:44 AM
> > To: Gupta, Anshuman <anshuman.gupta at intel.com>
> > Cc: intel-xe at lists.freedesktop.org; linux-acpi at vger.kernel.org; linux-
> > pci at vger.kernel.org; rafael at kernel.org; lenb at kernel.org;
> > bhelgaas at google.com; ilpo.jarvinen at linux.intel.com; De Marchi, Lucas
> > <lucas.demarchi at intel.com>; Vivi, Rodrigo <rodrigo.vivi at intel.com>; Nilawar,
> > Badal <badal.nilawar at intel.com>; Gupta, Varun <varun.gupta at intel.com>;
> > ville.syrjala at linux.intel.com; Shankar, Uma <uma.shankar at intel.com>
> > Subject: Re: [PATCH 03/12] PCI/ACPI: Add aux power grant notifier
> >
> > On Tue, Apr 01, 2025 at 09:02:16PM +0530, Anshuman Gupta wrote:
> > > Adding a notifier to notify all PCIe child devices about the block
> > > main power removal. It is needed because theoretically multiple PCIe
> > > Endpoint devices on same Root Port can request for AUX power and _DSM
> > > method can return with 80000000h signifies that the hierarchy
> > > connected via the slot cannot support main power removal when in
> > > D3Cold.
> >
> > I wish the spec used different language here.  "D3cold" *means* "main power
> > is removed" (PCIe r6.0, sec 5.3.1.4.2), so it doesn't make sense to say that
> > "the slot cannot support main power removal when in D3cold".  If a device is
> > in D3cold, its main power has been removed by definition.
> >
> > I suppose the spec is trying to say if the driver has called this _DSM with
> > 80000000h, it means the platform must not remove main power from the
> > device while the system is in S0?  Is that what you think it means?
> Thanks for review.
> what I understand here, S0 term essentially means here PCIe Runtime PM or s2idle
> (both refers system is S0 state). AFAIU during both Runtime PM and s2ilde path
> Root Port can transition  to D3Cold if it support _PR3.
> I observed Root Port either have D0 or D3Cold state.
> /sys/bus/pci0/devices/$root_port_bdf/firmware_node/real_power_state
>
> Driver can disable the D3cold by pci_d3cold_disable(), so I wonder there is no use
> case driver calling _DSM 0xa with 80000000h.
> PCIe specs shall be simplified by removal of value 80000000h from _DSM 0xa Arg.
> >
> > The 2h return value description says it "indicates that the platform will not
> > remove main power from the slot while the system is in S0,"
> > so I guess that must be it.
> >
> > In this series, pci_acpi_aux_power_setup() only supplies a 16-bit
> > aux_pwr_limit value, so the driver cannot *request* that the platform not
> > remove main power.
> Yeah but that value Xe driver received from GPU firmware.
> >
> > So I guess the only scenario where the _DSM returns 80000000h is when the
> > platform itself or other devices prevent the removal of main power.  And the
> > point of the notifier is to tell devices that their main power will never be
> > removed while the system is in S0.  Is that right?
> Yes the notifier will safeguard against the use case if some other PCIe device
> calls _DSM 0xa with Arg3 Value of 80000000h or its request return with 0x2h.
> >
> > > This may disrupt functionality of other child device.
> >
> > What sort of disruption could happen?  I would think that if the _DSM returns
> > 80000000h, it just means the device will not have main power removed, so it
> > won't see that power state transition.  The only "disruption" would be that
> > we're using more power.
>
> Let's say during Xe Driver initialization BIOS firmware grants the Xe driver
> Aux power request successfully.

Slightly OT, but if you do it at init time, you also need to do it
after hibernation because the original ini-time request may not
survive it.

> Xe driver will prepare to transition D3Cold state with VRAM Self Refresh.
> VRAM Self Refresh relies on vram shall derive power from Aux power.
> But later at any point if some other PCIe device under same root port
> Calls _DSM either with 080000000h or _DSM returns with 02h, will
> Block the main power removal.

As I said in the previous message, this cannot happen if only one
driver at a time is allowed to request the Aux power on a given Root
Port.

If multiple drivers are allowed to do it, aggregation of requests is necessary.

> But Xe driver is unaware of it can still
> program the VRAM Self Refresh and that make VRAM to derive power
> from Aux and that will disrupt the GPU VRAM as there is no switch to Aux

This is a rather unfortunate and counter-intuitive design decision
IMV.  If the main power is still there, I'd expect the VRAM Self
Refresh to use it instead of the Aux power.


More information about the Intel-xe mailing list