[i-g-t,v4,2/3] tests/intel/xe_fault_injection: Inject errors during xe_guc_ct_send_recv & xe_guc_mmio_send_recv.

K V P, Satyanarayana satyanarayana.k.v.p at intel.com
Fri Apr 4 06:04:59 UTC 2025


> From: Laguna, Lukasz <lukasz.laguna at intel.com>
> Sent: Thursday, April 3, 2025 7:05 PM
> To: K V P, Satyanarayana <satyanarayana.k.v.p at intel.com>; igt-dev at lists.freedesktop.org
> Cc: De Marchi, Lucas <lucas.demarchi at intel.com>; Wajdeczko, Michal
> <Michal.Wajdeczko at intel.com>; Dugast, Francois
> <francois.dugast at intel.com>
> Subject: Re: [i-g-t,v4,2/3] tests/intel/xe_fault_injection: Inject errors during
> xe_guc_ct_send_recv & xe_guc_mmio_send_recv.
> 
> 
> On 3/28/2025 12:15, Satyanarayana K V P wrote:
> > Use the kernel fault injection infrastructure to test error handling
> > of xe during driver probe when executing xe_guc_ct_send_recv() /
> > xe_guc_mmio_send_recv() so that more code paths are tested, such as
> > error handling and unwinding.
> >
> > All xe_init() kind of functions are called just once during driver probe,
> > so it is sufficient to fail first/all calls to them. Driver communicates
> > with the GuC multiple times, and the real failure can happen at different
> > call, hence the need to inject failure in GuC communication functions,
> > like guc_mmio_send() or guc_ct_send(), but it can't be just first call or
> > all calls, but we need to be able to select specific iteration to fail.
> >
> > To address this problem, the environmental variable IGT_FAULT_INJECT_ITERATION
> 
> I think it'd be better to use test parameter instead of environment
> variable. Please check igt_main_args() and consider using it.
>
Hi Lukasz,
The intention of the test is to inject an error at every iteration, from 0 up to
INJECT_ITERATIONS (100), for the guc_ct and guc_mmio functions.
The environment variable is only used to inject the error at one specific
iteration when we need to debug a failure. That is why a test parameter is not used.
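
For reference, a minimal standalone sketch of the selection logic described
above. It is not the code from the patch: fault_iter_range() is a hypothetical
helper named here only for illustration, and strtol() is swapped in for the
patch's atoi(), so a value with trailing garbage such as "5abc" is treated as
malformed instead of being read as 5.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

#define INJECT_ITERATIONS	100

/*
 * Parse IGT_FAULT_INJECT_ITERATION.
 * Returns 0 when the variable is unset, empty, <= 0 or malformed,
 * otherwise the requested iteration number.
 * Assumption: strict strtol() parsing, unlike the atoi() in the patch.
 */
static int get_fault_inject_iter(void)
{
	const char *env = getenv("IGT_FAULT_INJECT_ITERATION");
	char *end;
	long val;

	if (!env || !*env)
		return 0;

	errno = 0;
	val = strtol(env, &end, 10);
	if (errno || *end != '\0' || val <= 0 || val > INT_MAX)
		return 0;

	return (int)val;
}

/*
 * Hypothetical helper: compute the half-open range [*start, *end)
 * of iterations at which the error is injected.
 * iter == 0 -> sweep 0 .. INJECT_ITERATIONS - 1
 * iter  > 0 -> run only iteration iter
 */
static void fault_iter_range(int iter, int *start, int *end)
{
	*start = iter;
	*end = iter ? iter + 1 : INJECT_ITERATIONS;
}
```

With the variable unset the range is [0, 100), so the default run sweeps all
iterations. With IGT_FAULT_INJECT_ITERATION=7 exported the range is [7, 8),
which is the debug case of reproducing a single failing call.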

-Satya.

 
> > is used. If the IGT_FAULT_INJECT_ITERATION is not exported, an error will
> > be injected in every possible function call starting from first up to the
> > max number of iteration defined by INJECT_ITERATIONS, currently hardcoded
> > as 100. Also, using IGT_FAULT_INJECT_ITERATION, an error can be injected at
> > specific function call.
> >
> > Error can be injected using:
> > igt at xe_fault_injection@probe-fail-guc-xe_guc_ct_send_recv
> > igt at xe_fault_injection@probe-fail-guc-xe_guc_mmio_send_recv
> >
> > Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p at intel.com>
> > Cc: Michał Wajdeczko <michal.wajdeczko at intel.com>
> > Cc: Francois Dugast <francois.dugast at intel.com>
> > Reviewed-by: Lucas De Marchi <lucas.demarchi at intel.com>
> > ---
> >   tests/intel/xe_fault_injection.c | 61 ++++++++++++++++++++++++++++++++
> >   1 file changed, 61 insertions(+)
> >
> > diff --git a/tests/intel/xe_fault_injection.c b/tests/intel/xe_fault_injection.c
> > index 1325a1716..a49070b4d 100644
> > --- a/tests/intel/xe_fault_injection.c
> > +++ b/tests/intel/xe_fault_injection.c
> > @@ -26,6 +26,7 @@
> >   #define INJECT_ERRNO	-ENOMEM
> >   #define BO_ADDR		0x1a0000
> >   #define BO_SIZE		(1024*1024)
> > +#define INJECT_ITERATIONS	100
> >
> >   enum injection_list_action {
> >   	INJECTION_LIST_ADD,
> > @@ -43,6 +44,24 @@ struct fault_injection_params {
> >   	uint32_t space;
> >   };
> >
> > +/**
> > + *  Introduce a new environmental variable IGT_FAULT_INJECT_ITERATION
> > + *  using which an error can be injected at specific function call.
> > + *  When unset test will run for INJECT_ITERATIONS iterations.
> > + *  When set to <=0 or malformed - same as unset.
> > + *  When set to >0 it will run single n-th iteration only.
> > + */
> > +static int get_fault_inject_iter(void)
> > +{
> > +	const char *env = getenv("IGT_FAULT_INJECT_ITERATION");
> > +
> > +	/* Return 0 if not exported / -ve value */
> > +	if (!env || atoi(env) <= 0)
> > +		return 0;
> > +
> > +	return atoi(env);
> > +}
> > +
> >   static int fail_function_open(void)
> >   {
> >   	int debugfs_fail_function_dir_fd;
> > @@ -228,6 +247,34 @@ inject_fault_probe(int fd, char pci_slot[], const char function_name[])
> >   	injection_list_do(INJECTION_LIST_REMOVE, function_name);
> >   }
> >
> > +/**
> > + * SUBTEST: probe-fail-guc-%s
> > + * Description: inject an error in the injectable function %arg[1] then reprobe driver
> > + * Functionality: fault
> > + *
> > + * arg[1]:
> > + * @xe_guc_mmio_send_recv:     Inject an error when calling xe_guc_mmio_send_recv
> > + * @xe_guc_ct_send_recv:       Inject an error when calling xe_guc_ct_send_recv
> > + */
> > +static void probe_fail_guc(int fd, char pci_slot[], const char function_name[],
> > +               struct fault_injection_params *fault_params)
> > +{
> > +	int iter_start = 0, iter_end = 0, iter = 0;
> > +
> > +	igt_assert(fault_params);
> > +
> > +	/* Get the iteration count from environment */
> > +	iter = get_fault_inject_iter();
> > +	iter_start = iter ? : 0;
> 
> Can't it be just iter_start = iter; ?
> 
> > +	iter_end = iter ? iter + 1 : INJECT_ITERATIONS;
> > +	for (int i = iter_start; i < iter_end; i++) {
> > +		fault_params->space = i;
> > +		setup_injection_fault(fault_params);
> > +		inject_fault_probe(fd, pci_slot, function_name);
> > +		igt_kmod_unbind("xe", pci_slot);
> > +	}
> > +}
> > +
> >   /**
> >    * SUBTEST: exec-queue-create-fail-%s
> >    * Description: inject an error in function %arg[1] used in exec queue create IOCTL to make it fail
> > @@ -406,6 +453,7 @@ igt_main
> >   {
> >   	int fd, sysfs;
> >   	struct drm_xe_engine_class_instance *hwe;
> > +	struct fault_injection_params fault_params;
> >   	static uint32_t devid;
> >   	char pci_slot[NAME_MAX];
> >   	const struct section {
> > @@ -463,6 +511,12 @@ igt_main
> >   		{ }
> >   	};
> >
> > +	const struct section guc_fail_functions[] = {
> > +		{ "xe_guc_mmio_send_recv" },
> > +		{ "xe_guc_ct_send_recv" },
> > +		{ }
> > +	};
> > +
> >   	igt_fixture {
> >   		igt_require(fail_function_injection_enabled());
> >   		fd = drm_open_driver(DRIVER_XE);
> > @@ -505,6 +559,13 @@ igt_main
> >   		igt_subtest_f("inject-fault-probe-function-%s", s->name)
> >   			inject_fault_probe(fd, pci_slot, s->name);
> >
> > +   for (const struct section *s = guc_fail_functions; s->name; s++)
> > +       igt_subtest_f("probe-fail-guc-%s", s->name) {
> > +           memcpy(&fault_params, &default_fault_params,
> > +                   sizeof(struct fault_injection_params));
> > +           probe_fail_guc(fd, pci_slot, s->name, &fault_params);
> > +       }
> > +
> >   	igt_fixture {
> >   		close(sysfs);
> >   		drm_close_driver(fd);

