[PATCH 1/2] drm/xe: Introduce flag to indicate possible fault injection

John Harrison john.c.harrison at intel.com
Mon May 12 18:36:53 UTC 2025


On 5/12/2025 9:19 AM, Michal Wajdeczko wrote:
> When running some fault injection tests the driver might generate
> a lot of error logs which might unnecessary stress our CI systems.
>
> Introduce a flag exposed in debugfs that can be used by the fault
> injection tests to give the driver a hint to suppress non-essential
> error logs or dumps that might be otherwise generated.
Why use debugfs? The whole point of the fault injection test is that it 
replaces an existing function with something that just returns an error 
code. Seems the perfect way to have a function which is simply "return 
0;" in normal usage but modified by the injection test to return an 
error code when a test is running. Which is what the original patch was 
doing. That way everything is self contained, there are no extraneous 
interfaces in unrelated subsystems.

Unless you are wanting a more generic 'test_in_progress' function that 
is not specific to fault injection, then using debugfs seems like an 
unnecessary complication.

John.

>
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko at intel.com>
> Cc: Satyanarayana K V P <satyanarayana.k.v.p at intel.com>
> Cc: John Harrison <john.c.harrison at intel.com>
> ---
>   drivers/gpu/drm/xe/xe_debugfs.c      |  5 +++++
>   drivers/gpu/drm/xe/xe_device.h       | 12 ++++++++++++
>   drivers/gpu/drm/xe/xe_device_types.h |  9 +++++++++
>   3 files changed, 26 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
> index d0503959a8ed..0567a57597d3 100644
> --- a/drivers/gpu/drm/xe/xe_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_debugfs.c
> @@ -235,4 +235,9 @@ void xe_debugfs_register(struct xe_device *xe)
>   	xe_pxp_debugfs_register(xe->pxp);
>   
>   	fault_create_debugfs_attr("fail_gt_reset", root, &gt_reset_failure);
> +
> +#if IS_ENABLED(CONFIG_FAULT_INJECTION)
> +	debugfs_create_bool("fault_injection_in_progress", 0600, root,
> +			    &xe->fault_injection_in_progress);
> +#endif
>   }
> diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> index 0bc3bc8e6803..ea25d8161050 100644
> --- a/drivers/gpu/drm/xe/xe_device.h
> +++ b/drivers/gpu/drm/xe/xe_device.h
> @@ -209,4 +209,16 @@ void xe_file_put(struct xe_file *xef);
>   #define LNL_FLUSH_WORK(wrk__) \
>   	flush_work(wrk__)
>   
> +#if IS_ENABLED(CONFIG_FAULT_INJECTION)
> +static inline bool xe_fault_injection_in_progress(struct xe_device *xe)
> +{
> +	return xe->fault_injection_in_progress;
> +}
> +#else
> +static inline bool xe_fault_injection_in_progress(struct xe_device *xe)
> +{
> +	return false;
> +}
> +#endif
> +
>   #endif
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> index 06c65dace026..513a811a3121 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -578,6 +578,15 @@ struct xe_device {
>   	u8 vm_inject_error_position;
>   #endif
>   
> +#if IS_ENABLED(CONFIG_FAULT_INJECTION)
> +	/**
> +	 * @fault_injection_in_progress: flag used by the fault injection
> +	 * tests to allow the driver to suppress non-essential error dumps
> +	 * that might be otherwise generated due to an injected fault.
> +	 */
> +	bool fault_injection_in_progress;
> +#endif
> +
>   	/* private: */
>   
>   #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)



More information about the Intel-xe mailing list