[PATCH v3 2/3] drm/xe: add documentation for survivability mode
Lucas De Marchi
lucas.demarchi at intel.com
Thu Apr 3 21:50:46 UTC 2025
On Thu, Apr 03, 2025 at 03:55:52PM +0530, Riana Tauro wrote:
>Add survivability mode document to pcode document
Add survivability mode in the pcode documentation.
... One additional line justifying why this is the proper place wouldn't
hurt either.
>
>Signed-off-by: Riana Tauro <riana.tauro at intel.com>
>---
> Documentation/gpu/xe/xe_pcode.rst | 7 +++++
> drivers/gpu/drm/xe/xe_survivability_mode.c | 36 +++++++++++++++-------
> 2 files changed, 32 insertions(+), 11 deletions(-)
>
>diff --git a/Documentation/gpu/xe/xe_pcode.rst b/Documentation/gpu/xe/xe_pcode.rst
>index d2e22cc45061..5937ef3599b0 100644
>--- a/Documentation/gpu/xe/xe_pcode.rst
>+++ b/Documentation/gpu/xe/xe_pcode.rst
>@@ -12,3 +12,10 @@ Internal API
>
> .. kernel-doc:: drivers/gpu/drm/xe/xe_pcode.c
> :internal:
>+
>+==================
>+Boot Survivability
>+==================
>+
>+.. kernel-doc:: drivers/gpu/drm/xe/xe_survivability_mode.c
>+ :doc: Xe Boot Survivability
>diff --git a/drivers/gpu/drm/xe/xe_survivability_mode.c b/drivers/gpu/drm/xe/xe_survivability_mode.c
>index cb813b337fd3..3d59753eae34 100644
>--- a/drivers/gpu/drm/xe/xe_survivability_mode.c
>+++ b/drivers/gpu/drm/xe/xe_survivability_mode.c
>@@ -28,20 +28,34 @@
> * This is implemented by loading the driver with bare minimum (no drm card) to allow the firmware
> * to be flashed through mei and collect telemetry. The driver's probe flow is modified
> * such that it enters survivability mode when pcode initialization is incomplete and boot status
>- * denotes a failure. The driver then populates the survivability_mode PCI sysfs indicating
>- * survivability mode and provides additional information required for debug
>+ * denotes a failure.
> *
>- * KMD exposes below admin-only readable sysfs in survivability mode
>+ * Survivability mode can also be entered manually using the survivability mode attribute available
>+ * through configfs which is beneficial in several usecases. It can be used to address scenarios
>+ * where pcode does not detect failure or for validation purposes. It can also be used in
>+ * In-Field-Repair (IFR) to repair a single card without impacting the other cards in a node.
> *
>- * device/survivability_mode: The presence of this file indicates that the card is in survivability
>- * mode. Also, provides additional information on why the driver entered
>- * survivability mode.
>+ * Use below command enable survivability mode manually
>+ * ::
Same comment as in patch 1: these "::" look misplaced.
> *
>- * Capability Information - Provides boot status
>- * Postcode Information - Provides information about the failure
>- * Overflow Information - Provides history of previous failures
>- * Auxiliary Information - Certain failures may have information in
>- * addition to postcode information
>+ * # echo 1 > /sys/kernel/config/xe/0000:03:00.0/survivability_mode
>+ *
>+ * Refer :ref:`xe_configfs` for more details on how to use configfs
>+ *
>+ * Survivability mode is indicated by the below admin-only readable sysfs which provides additional
>+ * debug information
>+ * ::
>+ *
>+ * /sys/bus/pci/<device>/surivability_mode
/sys/bus/pci/devices/<device>/survivability_mode
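For illustration only, a minimal shell sketch of how an admin might check that attribute under the devices/ path. The function name and the SYSFS_ROOT override are hypothetical (the override exists only so the helper can be exercised outside a real system); the attribute location is the one given in the doc comment being reviewed.

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch: report whether a PCI device
# exposes the survivability_mode attribute and, if so, print its contents.
# SYSFS_ROOT is an assumption added for testing; it defaults to /sys.
in_survivability_mode() {
    bdf="$1"                        # PCI address, e.g. 0000:03:00.0
    root="${SYSFS_ROOT:-/sys}"
    attr="$root/bus/pci/devices/$bdf/survivability_mode"

    if [ -e "$attr" ]; then
        # The file is admin-only readable, so this may need root.
        cat "$attr"
        return 0
    fi

    # Attribute absent: the device is not in survivability mode.
    return 1
}
```

Usage would be something like `in_survivability_mode 0000:03:00.0`, run as root, with a non-zero exit status when the file is not present.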
Reviewed-by: Lucas De Marchi <lucas.demarchi at intel.com>
Lucas De Marchi
>+ *
>+ * Capability Information:
>+ * Provides boot status
>+ * Postcode Information:
>+ * Provides information about the failure
>+ * Overflow Information
>+ * Provides history of previous failures
>+ * Auxiliary Information
>+ * Certain failures may have information in addition to postcode information
> */
>
> static u32 aux_history_offset(u32 reg_value)
>--
>2.47.1
>