[PATCH v3 2/3] drm/xe: add documentation for survivability mode

Lucas De Marchi lucas.demarchi at intel.com
Thu Apr 3 21:50:46 UTC 2025


On Thu, Apr 03, 2025 at 03:55:52PM +0530, Riana Tauro wrote:
>Add survivability mode document to pcode document

Add survivability mode in the pcode documentation.

... One additional line justifying why this is the proper place wouldn't
hurt either.

>
>Signed-off-by: Riana Tauro <riana.tauro at intel.com>
>---
> Documentation/gpu/xe/xe_pcode.rst          |  7 +++++
> drivers/gpu/drm/xe/xe_survivability_mode.c | 36 +++++++++++++++-------
> 2 files changed, 32 insertions(+), 11 deletions(-)
>
>diff --git a/Documentation/gpu/xe/xe_pcode.rst b/Documentation/gpu/xe/xe_pcode.rst
>index d2e22cc45061..5937ef3599b0 100644
>--- a/Documentation/gpu/xe/xe_pcode.rst
>+++ b/Documentation/gpu/xe/xe_pcode.rst
>@@ -12,3 +12,10 @@ Internal API
>
> .. kernel-doc:: drivers/gpu/drm/xe/xe_pcode.c
>    :internal:
>+
>+==================
>+Boot Survivability
>+==================
>+
>+.. kernel-doc:: drivers/gpu/drm/xe/xe_survivability_mode.c
>+   :doc: Xe Boot Survivability
>diff --git a/drivers/gpu/drm/xe/xe_survivability_mode.c b/drivers/gpu/drm/xe/xe_survivability_mode.c
>index cb813b337fd3..3d59753eae34 100644
>--- a/drivers/gpu/drm/xe/xe_survivability_mode.c
>+++ b/drivers/gpu/drm/xe/xe_survivability_mode.c
>@@ -28,20 +28,34 @@
>  * This is implemented by loading the driver with bare minimum (no drm card) to allow the firmware
>  * to be flashed through mei and collect telemetry. The driver's probe flow is modified
>  * such that it enters survivability mode when pcode initialization is incomplete and boot status
>- * denotes a failure. The driver then  populates the survivability_mode PCI sysfs indicating
>- * survivability mode and provides additional information required for debug
>+ * denotes a failure.
>  *
>- * KMD exposes below admin-only readable sysfs in survivability mode
>+ * Survivability mode can also be entered manually using the survivability mode attribute available
>+ * through configfs which is beneficial in several usecases. It can be used to address scenarios
>+ * where pcode does not detect failure or for validation purposes. It can also be used in
>+ * In-Field-Repair (IFR) to repair a single card without impacting the other cards in a node.
>  *
>- * device/survivability_mode: The presence of this file indicates that the card is in survivability
>- *			      mode. Also, provides additional information on why the driver entered
>- *			      survivability mode.
>+ * Use below command enable survivability mode manually
>+ * ::

Same comment as in patch 1: these "::" look misplaced.

>  *
>- *			      Capability Information - Provides boot status
>- *			      Postcode Information   - Provides information about the failure
>- *			      Overflow Information   - Provides history of previous failures
>- *			      Auxiliary Information  - Certain failures may have information in
>- *						       addition to postcode information
>+ *	# echo 1 > /sys/kernel/config/xe/0000:03:00.0/survivability_mode
>+ *
>+ * Refer :ref:`xe_configfs` for more details on how to use configfs
>+ *
>+ * Survivability mode is indicated by the below admin-only readable sysfs which provides additional
>+ * debug information
>+ * ::
>+ *
>+ *	/sys/bus/pci/<device>/surivability_mode

/sys/bus/pci/devices/<device>/survivability_mode
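
For illustration only, a minimal sketch of how userspace could test for
the attribute at the corrected path. It assumes, as the earlier wording of
this comment block stated, that the presence of the file indicates
survivability mode. The helper name in_survivability_mode and the
SYSFS_ROOT override are hypothetical and exist only to make the sketch
testable outside a real system:

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch: succeeds when the PCI device
# given as a BDF (e.g. 0000:03:00.0) exposes the survivability_mode file.
# SYSFS_ROOT is an assumed override for testing; it defaults to /sys.
in_survivability_mode() {
    bdf="$1"
    root="${SYSFS_ROOT:-/sys}"
    # Presence of the attribute is taken as the indicator.
    [ -e "$root/bus/pci/devices/$bdf/survivability_mode" ]
}
```

Something like "in_survivability_mode 0000:03:00.0 && echo yes" would then
report the state. Reading the file's contents still needs root, since the
attribute is admin-only readable.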

Reviewed-by: Lucas De Marchi <lucas.demarchi at intel.com>

Lucas De Marchi

>+ *
>+ * Capability Information:
>+ *	Provides boot status
>+ * Postcode Information:
>+ *	Provides information about the failure
>+ * Overflow Information
>+ *	Provides history of previous failures
>+ * Auxiliary Information
>+ *	Certain failures may have information in addition to postcode information
>  */
>
> static u32 aux_history_offset(u32 reg_value)
>-- 
>2.47.1
>

