[PATCH i-g-t v6 2/3] tests/intel/xe_fault_injection: Inject errors in xe_guc_* calls

Cavitt, Jonathan jonathan.cavitt at intel.com
Tue Apr 15 14:31:17 UTC 2025


-----Original Message-----
From: K V P, Satyanarayana <satyanarayana.k.v.p at intel.com> 
Sent: Tuesday, April 15, 2025 1:27 AM
To: igt-dev at lists.freedesktop.org
Cc: K V P, Satyanarayana <satyanarayana.k.v.p at intel.com>; Wajdeczko, Michal <Michal.Wajdeczko at intel.com>; Dugast, Francois <francois.dugast at intel.com>; Laguna, Lukasz <lukasz.laguna at intel.com>; De Marchi, Lucas <lucas.demarchi at intel.com>; Cavitt, Jonathan <jonathan.cavitt at intel.com>
Subject: [PATCH i-g-t v6 2/3] tests/intel/xe_fault_injection: Inject errors in xe_guc_* calls
> 
> Use the kernel fault injection infrastructure to test error handling
> of xe during driver probe when executing xe_guc_ct_send_recv() /
> xe_guc_mmio_send_recv() so that more code paths are tested, such as
> error handling and unwinding.
> 
> Functions of the xe_init() kind are called just once during driver probe,
> so it is sufficient to fail the first (or every) call to them. The driver,
> however, communicates with the GuC multiple times, and a real failure can
> happen at any of those calls. Hence the need to inject failures in GuC
> communication functions such as guc_mmio_send() or guc_ct_send(). Failing
> only the first call or all calls is not enough; we need to be able to
> select a specific iteration to fail.
> 
> To address this problem, an optional input argument is introduced. If the
> argument is not set, an error is injected at each function call in turn,
> from the first up to the maximum number of iterations defined by
> INJECT_ITERATIONS, currently hardcoded as 100. If the input argument is
> set, an error is injected only at that specific function call.
> 
> Errors can be injected using:
> igt@xe_fault_injection@probe-fail-guc-xe_guc_ct_send_recv
> igt@xe_fault_injection@probe-fail-guc-xe_guc_mmio_send_recv
> 
> Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p at intel.com>

This patch looks good to me, though I now have a note on the first
patch in this series.

Reviewed-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
-Jonathan Cavitt

> ---
> Cc: Michał Wajdeczko <michal.wajdeczko at intel.com>
> Cc: Francois Dugast <francois.dugast at intel.com>
> Cc: Lukasz Laguna <lukasz.laguna at intel.com>
> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
> Cc: Cavitt Jonathan <jonathan.cavitt at intel.com>
> Reviewed-by: Lucas De Marchi <lucas.demarchi at intel.com>
> ---
>  tests/intel/xe_fault_injection.c | 74 +++++++++++++++++++++++++++++++-
>  1 file changed, 73 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/intel/xe_fault_injection.c b/tests/intel/xe_fault_injection.c
> index e56cdb304..252209308 100644
> --- a/tests/intel/xe_fault_injection.c
> +++ b/tests/intel/xe_fault_injection.c
> @@ -26,7 +26,9 @@
>  #define INJECT_ERRNO	-ENOMEM
>  #define BO_ADDR		0x1a0000
>  #define BO_SIZE		(1024*1024)
> +#define INJECT_ITERATIONS	100
>  
> +int32_t inject_iters_raw;
>  struct fault_injection_params {
>  	/* @probability: Likelihood of failure injection, in percent. */
>  	uint32_t probability;
> @@ -234,6 +236,38 @@ inject_fault_probe(int fd, char pci_slot[], const char function_name[])
>  	injection_list_remove(function_name);
>  }
>  
> +/**
> + * SUBTEST: probe-fail-guc-%s
> + * Description: inject an error in the injectable function %arg[1] then reprobe driver
> + * Functionality: fault
> + *
> + * arg[1]:
> + * @xe_guc_mmio_send_recv:     Inject an error when calling xe_guc_mmio_send_recv
> + * @xe_guc_ct_send_recv:       Inject an error when calling xe_guc_ct_send_recv
> + */
> +static void probe_fail_guc(int fd, char pci_slot[], const char function_name[],
> +               struct fault_injection_params *fault_params)
> +{
> +	int iter_start = 0, iter_end = 0, iter = 0;
> +
> +	igt_assert(fault_params);
> +
> +	/* inject_iters_raw will have zero if unset / set to <=0 or malformed.
> +	   When set to > 0 it will have iteration number and will run single n-th
> +	   iteration only.
> +	*/
> +	iter = inject_iters_raw;
> +	iter_start = iter ? : 0;
> +	iter_end = iter ? iter + 1 : INJECT_ITERATIONS;
> +	igt_debug("Injecting error for %d - %d iterations\n", iter_start, iter_end);
> +	for (int i = iter_start; i < iter_end; i++) {
> +		fault_params->space = i;
> +		setup_injection_fault(fault_params);
> +		inject_fault_probe(fd, pci_slot, function_name);
> +		igt_kmod_unbind("xe", pci_slot);
> +	}
> +}
> +
>  /**
>   * SUBTEST: exec-queue-create-fail-%s
>   * Description: inject an error in function %arg[1] used in exec queue create IOCTL to make it fail
> @@ -408,10 +442,35 @@ oa_add_config_fail(int fd, int sysfs, int devid, const char function_name[])
>  	igt_assert_eq(intel_xe_perf_ioctl(fd, DRM_XE_OBSERVATION_OP_REMOVE_CONFIG, &config_id), 0);
>  }
>  
> -igt_main
> +static int opt_handler(int opt, int opt_index, void *data)
> +{
> +	int in_param;
> +	switch (opt) {
> +	case 'I':
> +		/* Fall back to 0 (run all iterations) if malformed or out of range */
> +		in_param = atoi(optarg);
> +		if (in_param <= 0 || in_param > INJECT_ITERATIONS)
> +			inject_iters_raw = 0;
> +		else
> +			inject_iters_raw = in_param;
> +		break;
> +	default:
> +		return IGT_OPT_HANDLER_ERROR;
> +	}
> +
> +	return IGT_OPT_HANDLER_SUCCESS;
> +}
> +
> +const char *help_str =
> +	"  -I\tIf set, an error will be injected at specific function call.\n\
> +	If not set, an error will be injected in every possible function call\
> +	starting from first up to 100.";
> +
> +igt_main_args("I:", NULL, help_str, opt_handler, NULL)
>  {
>  	int fd, sysfs;
>  	struct drm_xe_engine_class_instance *hwe;
> +	struct fault_injection_params fault_params;
>  	static uint32_t devid;
>  	char pci_slot[NAME_MAX];
>  	const struct section {
> @@ -470,6 +529,12 @@ igt_main
>  		{ }
>  	};
>  
> +	const struct section guc_fail_functions[] = {
> +		{ "xe_guc_mmio_send_recv" },
> +		{ "xe_guc_ct_send_recv" },
> +		{ }
> +	};
> +
>  	igt_fixture {
>  		igt_require(fail_function_injection_enabled());
>  		fd = drm_open_driver(DRIVER_XE);
> @@ -512,6 +577,13 @@ igt_main
>  		igt_subtest_f("inject-fault-probe-function-%s", s->name)
>  			inject_fault_probe(fd, pci_slot, s->name);
>  
> +	for (const struct section *s = guc_fail_functions; s->name; s++)
> +		igt_subtest_f("probe-fail-guc-%s", s->name) {
> +			memcpy(&fault_params, &default_fault_params,
> +					sizeof(struct fault_injection_params));
> +			probe_fail_guc(fd, pci_slot, s->name, &fault_params);
> +		}
> +
>  	igt_fixture {
>  		close(sysfs);
>  		drm_close_driver(fd);
> -- 
> 2.43.0
> 
> 

