[PATCH v2] drm/xe/pm: Disable RPM for SR-IOV VFs with no PCIe PM capability
Michal Wajdeczko
michal.wajdeczko at intel.com
Thu Jul 31 15:55:40 UTC 2025
On 7/31/2025 5:32 PM, Satyanarayana K V P wrote:
> Enable Runtime Power Management (RPM) for PCI Express devices by utilizing
> their native Power Management (PM) capabilities. The specification (as per
> section 5.10.1 in PCI Express® Base Specification Revision 7.0) mandates
> that Virtual Functions (VFs) without Power Management capability inherit
> their associated Physical Function's (PF) power state.
>
> As per PCIe spec "If a VF does not implement the PCI Power Management
> Capability, then the VF behaves as if it had been programmed into the
> equivalent power state of its associated PF"
>
> Since Intel GPU VFs lack PM capability implementations, VFs power behavior
> must mirror their PF's state. During VF creation, the PF remains active
> from the PCI subsystem perspective. To maintain consistency, explicitly
> disable RPM for VFs missing PM capability to ensure they follow their PF's
> power management status rather than entering low-power states
> independently.
>
> Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p at intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko at intel.com>
> Cc: Matthew Brost <matthew.brost at intel.com>
> Cc: Michał Winiarski <michal.winiarski at intel.com>
> Cc: Anshuman Gupta <anshuman.gupta at intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
>
> ---
> V1 -> V2:
> - Disable RPM only for VF devices when PM cap is not implemented.
> ---
> drivers/gpu/drm/xe/xe_device_types.h | 2 ++
> drivers/gpu/drm/xe/xe_pm.c | 9 +++++++++
> drivers/gpu/drm/xe/xe_pm.h | 5 +++++
> 3 files changed, 16 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> index 38c8329b4d2c..3bbfc46044a0 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -285,6 +285,8 @@ struct xe_device {
> * pcode mailbox commands.
> */
> u8 has_mbx_power_limits:1;
> + /** @info.has_pm_capability: Device has PCI pm capability */
> + u8 has_pm_capability:1;
> /** @info.has_pxp: Device has PXP support */
> u8 has_pxp:1;
> /** @info.has_range_tlb_invalidation: Has range based TLB invalidations */
> diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> index 44aaf154ddf7..5e6311964685 100644
> --- a/drivers/gpu/drm/xe/xe_pm.c
> +++ b/drivers/gpu/drm/xe/xe_pm.c
> @@ -244,6 +244,9 @@ static void xe_pm_runtime_init(struct xe_device *xe)
> {
> struct device *dev = xe->drm.dev;
>
Maybe add a short comment here, something like:
/* enable RPM only if device has PCI PM capability */
> + if (!IS_RPM_SUPPORTED(xe))
> + return;
> +
> /*
> * Disable the system suspend direct complete optimization.
> * We need to ensure that the regular device suspend/resume functions
> @@ -265,6 +268,7 @@ static void xe_pm_runtime_init(struct xe_device *xe)
>
> int xe_pm_init_early(struct xe_device *xe)
> {
> + struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
> int err;
>
> INIT_LIST_HEAD(&xe->mem_access.vram_userfault.list);
> @@ -278,6 +282,8 @@ int xe_pm_init_early(struct xe_device *xe)
> return err;
>
> xe->d3cold.capable = xe_pm_pci_d3cold_capable(xe);
> + xe->info.has_pm_capability = !!pdev->pm_cap;
If this cap is always available from pdev, then there is
no need to cache it in xe->info.
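
A minimal sketch of what reading the capability at the point of use
could look like. It uses userspace stand-ins for the kernel types so
that it compiles on its own; the helper name xe_pm_rpm_supported() and
the simplified struct layouts are assumptions for illustration and are
not part of the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; only the fields used here. */
struct pci_dev {
	unsigned char pm_cap;	/* offset of the PCI PM capability, 0 if absent */
};

struct xe_device {
	struct pci_dev *pdev;	/* stand-in for to_pci_dev(xe->drm.dev) */
	bool is_sriov_vf;	/* stand-in for IS_SRIOV_VF(xe) */
};

/*
 * Hypothetical helper (not in the patch): RPM is supported unless the
 * device is a VF without the PCI PM capability. pm_cap is read from the
 * pci_dev directly instead of being cached in xe->info.
 */
static inline bool xe_pm_rpm_supported(const struct xe_device *xe)
{
	assert(xe != NULL && xe->pdev != NULL);
	return xe->pdev->pm_cap || !xe->is_sriov_vf;
}
```

With something along these lines, neither the extra bit in xe->info nor
the assignment in xe_pm_init_early() would be needed.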
> +
> return 0;
> }
> ALLOW_ERROR_INJECTION(xe_pm_init_early, ERRNO); /* See xe_pci_probe() */
> @@ -364,6 +370,9 @@ static void xe_pm_runtime_fini(struct xe_device *xe)
> {
> struct device *dev = xe->drm.dev;
>
> + if (!IS_RPM_SUPPORTED(xe))
> + return;
> +
> pm_runtime_get_sync(dev);
> pm_runtime_forbid(dev);
I wonder whether we could get rid of this fini() entirely
by using the managed versions of the RPM functions, like
devm_pm_runtime_enable().
@Rodrigo ?
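
To illustrate the idea only, here is a rough userspace sketch of how a
managed (devm-style) release action lets the init step queue its own
teardown. Everything below is a mock: struct device is a stand-in, and
the mock_* names are hypothetical. In the kernel the equivalent would be
devm_pm_runtime_enable() or devm_add_action_or_reset(), and whether they
fit the get_sync/forbid sequence used here is still an open question:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ACTIONS 8

/* Userspace stand-in for struct device with a tiny devres-like list. */
struct device {
	void (*action[MAX_ACTIONS])(struct device *dev);
	size_t nr_actions;
	int rpm_enabled;
};

/* Mock of registering a release action; returns 0, or -1 when full. */
static int mock_devm_add_action(struct device *dev,
				void (*fn)(struct device *dev))
{
	assert(dev != NULL && fn != NULL);
	if (dev->nr_actions == MAX_ACTIONS)
		return -1;
	dev->action[dev->nr_actions++] = fn;
	return 0;
}

/* Release action: undoes what the init step enabled. */
static void mock_rpm_disable(struct device *dev)
{
	dev->rpm_enabled = 0;
}

/* Init enables RPM and queues its own teardown, so no fini() is needed. */
static int mock_pm_runtime_init(struct device *dev)
{
	dev->rpm_enabled = 1;
	if (mock_devm_add_action(dev, mock_rpm_disable)) {
		/* Mirror the "_or_reset" behaviour: undo on failure. */
		mock_rpm_disable(dev);
		return -1;
	}
	return 0;
}

/* Mock of driver detach: run release actions in reverse order. */
static void mock_device_release(struct device *dev)
{
	while (dev->nr_actions > 0)
		dev->action[--dev->nr_actions](dev);
}
```

The point is that the early return for VFs would then live in one place
(init), since nothing is registered for release when RPM is skipped.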
> }
> diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
> index 59678b310e55..ec6291dad019 100644
> --- a/drivers/gpu/drm/xe/xe_pm.h
> +++ b/drivers/gpu/drm/xe/xe_pm.h
> @@ -9,6 +9,11 @@
> #include <linux/pm_runtime.h>
>
> #define DEFAULT_VRAM_THRESHOLD 300 /* in MB */
> +#define IS_RPM_SUPPORTED(xe) ({ \
> + struct xe_device *___xe = (xe); \
> + ___xe->info.has_pm_capability || \
> + !IS_SRIOV_VF(___xe); \
> + })
this seems overkill and unnecessary
>
> struct xe_device;
>