[PATCH 01/14] drm/xe: Document Xe PM component

Rodrigo Vivi rodrigo.vivi at intel.com
Wed Feb 21 19:00:29 UTC 2024


On Wed, Feb 21, 2024 at 09:41:44AM -0500, Gupta, Anshuman wrote:
> 
> 
> > -----Original Message-----
> > From: Dugast, Francois <francois.dugast at intel.com>
> > Sent: Wednesday, February 21, 2024 5:38 PM
> > To: Vivi, Rodrigo <rodrigo.vivi at intel.com>
> > Cc: intel-xe at lists.freedesktop.org; Auld, Matthew <matthew.auld at intel.com>;
> > Gupta, Anshuman <anshuman.gupta at intel.com>
> > Subject: Re: [PATCH 01/14] drm/xe: Document Xe PM component
> > 
> > On Thu, Feb 15, 2024 at 02:34:17PM -0500, Rodrigo Vivi wrote:
> > > Replace outdated information with a proper PM documentation.
> > > Already establish the rules for the runtime PM get and put that Xe
> > > needs to follow.
> > >
> > > Also add missing function documentation to all the "exported" functions.
> > >
> > > v2: updated after Francois' feedback.
> > >     s/grater/greater (Matt)
> > >
> > > Cc: Matthew Auld <matthew.auld at intel.com>
> > > Cc: Anshuman Gupta <anshuman.gupta at intel.com>
> > > Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
> > > Acked-by: Francois Dugast <francois.dugast at intel.com>
> > > ---
> > >  drivers/gpu/drm/xe/xe_pm.c | 108
> > > +++++++++++++++++++++++++++++++++----
> > >  1 file changed, 97 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> > > index ab283e9a8b4e..64ffb9a35448 100644
> > > --- a/drivers/gpu/drm/xe/xe_pm.c
> > > +++ b/drivers/gpu/drm/xe/xe_pm.c
> > > @@ -25,21 +25,46 @@
> > >  /**
> > >   * DOC: Xe Power Management
> > >   *
> > > - * Xe PM shall be guided by the simplicity.
> > > - * Use the simplest hook options whenever possible.
> > > - * Let's not reinvent the runtime_pm references and hooks.
> > > - * Shall have a clear separation of display and gt underneath this component.
> > > + * Xe PM implements the main routines for both system level suspend states and
> > > + * for the opportunistic runtime suspend states.
> > >   *
> > > - * What's next:
> > > + * System Level Suspend (S-States) - In general this is OS initiated suspend
> > > + * driven by ACPI for achieving S0ix (a.k.a. S2idle, freeze), S3 (suspend to ram),
> > > + * S4 (disk). The main functions here are `xe_pm_suspend` and `xe_pm_resume`. They
> > > + * are the main point for the suspend to and resume from these states.
> > >   *
> > > - * For now s2idle and s3 are only working in integrated devices. The next step
> > > - * is to iterate through all VRAM's BO backing them up into the system memory
> > > - * before allowing the system suspend.
> > > + * Runtime Suspend (D-States) - This is the opportunistic PCIe device low power
> > > + * state D3. Xe PM component provides `xe_pm_runtime_suspend` and
> > > + * `xe_pm_runtime_resume` systems that PCI subsystem will call before transition
> > > + * to D3. Also, Xe PM provides get and put functions that Xe driver will use to
> > > + * indicate activity. In order to avoid locking complications with the memory
> > > + * management, whenever possible, these get and put functions needs to be called
> > > + * from the higher/outer levels.
> > >   *
> > > - * Also runtime_pm needs to be here from the beginning.
> > > + * The main cases that need to be protected from the outer levels are: IOCTL,
> > > + * sysfs, debugfs, dma-buf sharing, GPU execution.
> > >   *
> > > - * RC6/RPS are also critical PM features. Let's start with GuCRC and GuC SLPC
> > > - * and no wait boost. Frequency optimizations should come on a next stage.
> > > + * PCI D3 is special and can mean D3hot, where Vcc power is on for keeping memory
> > > + * alive and quicker low latency resume or D3Cold where Vcc power is off for
> > > + * better power savings.
> > > + * The Vcc control of PCI hierarchy can only be controlled at the PCI root port
> > > + * level, while the device driver can be behind multiple bridges/switches and
> > > + * paired with other devices. For this reason, the PCI subsystem cannot perform
> > > + * the transition towards D3Cold. The lowest runtime PM possible from the PCI
> > > + * subsystem is D3hot. Then, if all these paired devices in the same root port
> > > + * are in D3hot, ACPI will assist here and run its own methods (_PR3 and _OFF)
> > > + * to perform the transition from D3hot to D3cold. Xe may disallow this
> > > + * transition by calling pci_d3cold_disable(root_pdev) before going to runtime
> > > + * suspend. It will be based on runtime conditions such as VRAM usage for a
> > > + * quick and low latency resume for instance.
> > > + *
> > > + * Intel systems are capable of taking the system to S0ix when devices are on
> > > + * D3hot through the runtime PM. This is also called as 'opportunistic-S0iX'.
> > > + * But in this case, the `xe_pm_suspend` and `xe_pm_resume` won't be called for
> > > + * S0ix.
> Hi Rodrigo,
> Sorry for the late review comment, I was busy with other stuff.

no worries. Thank you so much for jumping in. Really appreciated.

> We do need to modify the doc slightly.

Indeed... I also realized that, to avoid confusion, we need to detach D3 from
the runtime PM description. On our integrated devices, runtime PM is more
important for the package C-states than for 'D3' itself. One might look at the
PCI config of an integrated device, see D3hot-, and get confused.

> For integrated graphics, reaching D3hot is not necessary for host s0ix. For example, with a PSR panel the
> PCI device will be in D0, as the KMS CRTC will be active and will hold a device wakeref.

Then let's also detach the opportunistic s0ix from the runtime_pm with this PSR
case in mind.

> 
> AFAIK, from the discrete card perspective, host s0ix requires the PCIe link to be in a low power state.

Yep, but that's outside of our scope, so let's limit the doc to the calls
around the functions that we are implementing here.

So, what do you think of this diff between what I had in this patch and
what I'm sending soon in a v2:

--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -33,17 +33,9 @@
  * S4 (disk). The main functions here are `xe_pm_suspend` and `xe_pm_resume`. They
  * are the main point for the suspend to and resume from these states.
  *
- * Runtime Suspend (D-States) - This is the opportunistic PCIe device low power
- * state D3. Xe PM component provides `xe_pm_runtime_suspend` and
- * `xe_pm_runtime_resume` systems that PCI subsystem will call before transition
- * to D3. Also, Xe PM provides get and put functions that Xe driver will use to
- * indicate activity. In order to avoid locking complications with the memory
- * management, whenever possible, these get and put functions needs to be called
- * from the higher/outer levels.
- *
- * The main cases that need to be protected from the outer levels are: IOCTL,
- * sysfs, debugfs, dma-buf sharing, GPU execution.
- *
+ * PCI Device Suspend (D-States) - This is the opportunistic PCIe device low power
+ * state D3, controlled by the PCI subsystem and ACPI with the help from the
+ * runtime_pm infrastructure.
  * PCI D3 is special and can mean D3hot, where Vcc power is on for keeping memory
  * alive and quicker low latency resume or D3Cold where Vcc power is off for
  * better power savings.
@@ -58,12 +50,28 @@
  * suspend. It will be based on runtime conditions such as VRAM usage for a
  * quick and low latency resume for instance.
  *
- * Intel systems are capable of taking the system to S0ix when devices are on
- * D3hot through the runtime PM. This is also called as 'opportunistic-S0iX'.
- * But in this case, the `xe_pm_suspend` and `xe_pm_resume` won't be called for
- * S0ix.
+ * Runtime PM - This infrastructure provided by the Linux kernel allows the
+ * device drivers to indicate when they can be runtime suspended, so the device
+ * could be put at D3 (if supported), or allow deeper package sleep states
+ * (PC-states), and/or other low level power states. Xe PM component provides
+ * `xe_pm_runtime_suspend` and `xe_pm_runtime_resume` functions that PCI
+ * subsystem will call before transition to/from runtime suspend.
+ *
+ * Also, Xe PM provides get and put functions that Xe driver will use to
+ * indicate activity. In order to avoid locking complications with the memory
+ * management, whenever possible, these get and put functions need to be called
+ * from the higher/outer levels.
+ * The main cases that need to be protected from the outer levels are: IOCTL,
+ * sysfs, debugfs, dma-buf sharing, GPU execution.
+ *
+ * Intel systems are capable of taking the system to S0ix when certain
+ * conditions are met. This is also called 'opportunistic-S0iX'.
+ * But in this case, the `xe_pm_suspend` and `xe_pm_resume` won't be called
+ * for S0ix. In certain cases, like when Display Panel Self-Refresh (eDP PSR) is
+ * active, not even `xe_pm_runtime_suspend` and `xe_pm_runtime_resume` will be
+ * called.
  *
- * This component is no responsible for GT idleness (RC6) nor GT frequency
+ * This component is not responsible for GT idleness (RC6) nor GT frequency
  * management (RPS).
  */
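
To illustrate the get/put rule described in the doc above, here is a minimal
userspace sketch in C. It is not driver code: every mock_* name is
hypothetical, and the usage counter only stands in for what the kernel's
runtime PM core does behind xe_pm_runtime_get() and xe_pm_runtime_put(). The
point is just that the reference is taken once at the outer entry point (an
IOCTL here), so the inner helpers never need to take it themselves.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device tracked by runtime PM. */
struct mock_device {
	int usage_count;   /* outstanding get references */
	bool suspended;    /* true while "runtime suspended" */
	int resume_calls;  /* how many times the resume path ran */
};

/* Take a reference and resume synchronously if the device was suspended. */
static int mock_pm_get(struct mock_device *dev)
{
	if (dev->usage_count++ == 0 && dev->suspended) {
		dev->suspended = false;
		dev->resume_calls++;
	}
	return 0;
}

/*
 * Drop a reference. This mock suspends immediately at zero; the real
 * runtime PM core would normally wait for an autosuspend delay first.
 */
static void mock_pm_put(struct mock_device *dev)
{
	assert(dev->usage_count > 0);
	if (--dev->usage_count == 0)
		dev->suspended = true;
}

/* Inner helper: relies on the caller already holding a reference. */
static int mock_touch_hw(struct mock_device *dev)
{
	return dev->suspended ? -1 : 0;
}

/* Outer entry point: the only place that takes and drops the reference. */
static int mock_ioctl(struct mock_device *dev)
{
	int err = mock_pm_get(dev);

	if (err < 0)
		return err;

	err = mock_touch_hw(dev);
	if (!err)
		err = mock_touch_hw(dev);

	mock_pm_put(dev);
	return err;
}
```

Keeping the get and put in mock_ioctl() means the helpers run with the device
already awake, which is the same reason the doc asks for these calls to sit at
the higher/outer levels rather than deep inside the memory management paths.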

> Thanks,
> Anshuman Gupta.
> > > + *
> > > + * This component is no responsible for GT idleness (RC6) nor GT frequency
> > 
> > Isn't it s/no/not/? Or s/no/neither/ + s/nor/nor for/? In any case:
> > 
> > Reviewed-by: Francois Dugast <francois.dugast at intel.com>
> > 
> > Francois
> > 
> > > + * management (RPS).
> > >   */
> > >
> > >  /**
> > > @@ -178,6 +203,12 @@ void xe_pm_init_early(struct xe_device *xe)
> > >  	drmm_mutex_init(&xe->drm, &xe->mem_access.vram_userfault.lock);
> > >  }
> > >
> > > +/**
> > > + * xe_pm_init - Initialize Xe Power Management
> > > + * @xe: xe device instance
> > > + *
> > > + * This component is responsible for System and Device sleep states.
> > > + */
> > >  void xe_pm_init(struct xe_device *xe)  {
> > >  	/* For now suspend/resume is only allowed with GuC */ @@ -196,6
> > > +227,10 @@ void xe_pm_init(struct xe_device *xe)
> > >  	xe_pm_runtime_init(xe);
> > >  }
> > >
> > > +/**
> > > + * xe_pm_runtime_fini - Finalize Runtime PM
> > > + * @xe: xe device instance
> > > + */
> > >  void xe_pm_runtime_fini(struct xe_device *xe)  {
> > >  	struct device *dev = xe->drm.dev;
> > > @@ -225,6 +260,12 @@ struct task_struct *xe_pm_read_callback_task(struct
> > xe_device *xe)
> > >  	return READ_ONCE(xe->pm_callback_task);  }
> > >
> > > +/**
> > > + * xe_pm_runtime_suspend - Prepare our device for D3hot/D3Cold
> > > + * @xe: xe device instance
> > > + *
> > > + * Returns 0 for success, negative error code otherwise.
> > > + */
> > >  int xe_pm_runtime_suspend(struct xe_device *xe)  {
> > >  	struct xe_bo *bo, *on;
> > > @@ -290,6 +331,12 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
> > >  	return err;
> > >  }
> > >
> > > +/**
> > > + * xe_pm_runtime_resume - Waking up from D3hot/D3Cold
> > > + * @xe: xe device instance
> > > + *
> > > + * Returns 0 for success, negative error code otherwise.
> > > + */
> > >  int xe_pm_runtime_resume(struct xe_device *xe)  {
> > >  	struct xe_gt *gt;
> > > @@ -341,22 +388,47 @@ int xe_pm_runtime_resume(struct xe_device *xe)
> > >  	return err;
> > >  }
> > >
> > > +/**
> > > + * xe_pm_runtime_get - Get a runtime_pm reference and resume
> > > +synchronously
> > > + * @xe: xe device instance
> > > + *
> > > + * Returns: Any number greater than or equal to 0 for success,
> > > +negative error
> > > + * code otherwise.
> > > + */
> > >  int xe_pm_runtime_get(struct xe_device *xe)  {
> > >  	return pm_runtime_get_sync(xe->drm.dev);  }
> > >
> > > +/**
> > > + * xe_pm_runtime_put - Put the runtime_pm reference back and mark as
> > > +idle
> > > + * @xe: xe device instance
> > > + *
> > > + * Returns: Any number greater than or equal to 0 for success,
> > > +negative error
> > > + * code otherwise.
> > > + */
> > >  int xe_pm_runtime_put(struct xe_device *xe)  {
> > >  	pm_runtime_mark_last_busy(xe->drm.dev);
> > >  	return pm_runtime_put(xe->drm.dev);
> > >  }
> > >
> > > +/**
> > > + * xe_pm_runtime_get_if_active - Get a runtime_pm reference if device
> > > +active
> > > + * @xe: xe device instance
> > > + *
> > > + * Returns: Any number greater than or equal to 0 for success,
> > > +negative error
> > > + * code otherwise.
> > > + */
> > >  int xe_pm_runtime_get_if_active(struct xe_device *xe)  {
> > >  	return pm_runtime_get_if_active(xe->drm.dev, true);  }
> > >
> > > +/**
> > > + * xe_pm_assert_unbounded_bridge - Disable PM on unbounded pcie
> > > +parent bridge
> > > + * @xe: xe device instance
> > > + */
> > >  void xe_pm_assert_unbounded_bridge(struct xe_device *xe)  {
> > >  	struct pci_dev *pdev = to_pci_dev(xe->drm.dev); @@ -371,6 +443,13
> > @@
> > > void xe_pm_assert_unbounded_bridge(struct xe_device *xe)
> > >  	}
> > >  }
> > >
> > > +/**
> > > + * xe_pm_set_vram_threshold - Set a vram threshold for
> > > +allowing/blocking D3Cold
> > > + * @xe: xe device instance
> > > + * @threshold: VRAM size in bytes for the D3cold threshold
> > > + *
> > > + * Returns 0 for success, negative error code otherwise.
> > > + */
> > >  int xe_pm_set_vram_threshold(struct xe_device *xe, u32 threshold)  {
> > >  	struct ttm_resource_manager *man;
> > > @@ -395,6 +474,13 @@ int xe_pm_set_vram_threshold(struct xe_device *xe,
> > u32 threshold)
> > >  	return 0;
> > >  }
> > >
> > > +/**
> > > + * xe_pm_d3cold_allowed_toggle - Check conditions to toggle
> > > +d3cold.allowed
> > > + * @xe: xe device instance
> > > + *
> > > + * To be called during runtime_pm idle callback.
> > > + * Check for all the D3Cold conditions ahead of runtime suspend.
> > > + */
> > >  void xe_pm_d3cold_allowed_toggle(struct xe_device *xe)  {
> > >  	struct ttm_resource_manager *man;
> > > --
> > > 2.43.0
> > >

