[PATCH i-g-t v2 4/5] tests/intel/xe_fault_injection: Inject errors during xe_guc_ct_send_recv & xe_guc_mmio_send_recv.
K V P, Satyanarayana
satyanarayana.k.v.p at intel.com
Wed Mar 5 11:07:37 UTC 2025
Hi
> -----Original Message-----
> From: Dugast, Francois <francois.dugast at intel.com>
> Sent: Wednesday, March 5, 2025 4:21 PM
> To: K V P, Satyanarayana <satyanarayana.k.v.p at intel.com>
> Cc: igt-dev at lists.freedesktop.org; Wajdeczko, Michal
> <Michal.Wajdeczko at intel.com>
> Subject: Re: [PATCH i-g-t v2 4/5] tests/intel/xe_fault_injection: Inject errors
> during xe_guc_ct_send_recv & xe_guc_mmio_send_recv.
>
> On Wed, Feb 19, 2025 at 01:04:44PM +0530, Satyanarayana K V P wrote:
> > Use the kernel fault injection infrastructure to test error handling
> > of xe during driver probe when executing xe_guc_ct_send_recv() /
> > xe_guc_mmio_send_recv() so that more code paths are tested, such as
> > error handling and unwinding.
> >
> > Error can be injected using:
> > igt at xe_fault_injection@probe-fail-guc-xe_guc_ct_send_recv
> > igt at xe_fault_injection@probe-fail-guc-xe_guc_mmio_send_recv
> >
> > Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p at intel.com>
> > Cc: Michał Wajdeczko <michal.wajdeczko at intel.com>
> > Cc: Francois Dugast <francois.dugast at intel.com>
> > ---
> > tests/intel/xe_fault_injection.c | 41 ++++++++++++++++++++++++++++++++
> > 1 file changed, 41 insertions(+)
> >
> > diff --git a/tests/intel/xe_fault_injection.c b/tests/intel/xe_fault_injection.c
> > index 32733fec5..cf0337733 100644
> > --- a/tests/intel/xe_fault_injection.c
> > +++ b/tests/intel/xe_fault_injection.c
> > @@ -231,6 +231,34 @@ inject_fault_probe(int fd, char pci_slot[], const char function_name[])
> > injection_list_do(INJECTION_LIST_REMOVE, function_name);
> > }
> >
> > +/**
> > + * SUBTEST: probe-fail-guc-%s
> > + * Description: inject an error in the injectable function %arg[1] then reprobe driver
> > + * Functionality: fault
> > + *
> > + * arg[1]:
> > + * @xe_guc_mmio_send_recv: Inject an error when calling xe_guc_mmio_send_recv
> > + * @xe_guc_ct_send_recv: Inject an error when calling xe_guc_ct_send_recv
> > + */
> > +
> > +static void probe_fail_guc(int fd, char pci_slot[], const char function_name[],
> > + struct fault_injection_params *fault_params)
> > +{
> > + int iter_start = 0, iter_end = 0, iter = 0;
> > +
> > + igt_assert(fault_params);
> > +
> > + /* Get the iteration count from environment */
> > + iter = get_fault_inject_iter();
> > + iter_start = iter ? : 0;
> > + iter_end = iter ? iter + 1 : INJECT_ITERATIONS;
>
> On CI, IGT_FAULT_INJECT_ITERATION will not be set so iter_end will be
> 100. Probing xe 100 times would be quite time consuming. What is the
> expectation for this test, what would be the added value?
>
> Francois
>
Michal explained the use of this environment variable in the previous patch.
Please find the details below.
"Because all xe_init() kind of functions are called just once during
driver probe, so it is sufficient to fail first/all calls to them.
OTOH driver communicates with the GuC multiple times, and the real
failure can happen at different call, hence the need to inject failure
in GuC communication functions, like guc_mmio_send() or guc_ct_send(),
but it can't be just first call or all calls, but we need to be able to
select specific iteration to fail.
Only with that approach we can provide sufficient coverage of the driver
probe related to GuC."
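So, if IGT_FAULT_INJECT_ITERATION is not set, the test reprobes once per
iteration and injects the error at a different iteration each time, covering
all 100 iterations for better coverage. If it is set, only that one iteration
is run.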
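As a rough illustration of the range selection above, here is a minimal
standalone sketch. It assumes get_fault_inject_iter() reads
IGT_FAULT_INJECT_ITERATION and returns 0 when the variable is unset;
read_iter_from_env() and select_range() are hypothetical names used only
for this example, and the value 100 for INJECT_ITERATIONS is taken from the
review comment, not from the patch itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed value: the review comment mentions 100 iterations on CI. */
#define INJECT_ITERATIONS 100

struct iter_range {
	int start;	/* first iteration to run (inclusive) */
	int end;	/* end of the range (exclusive) */
};

/*
 * Hypothetical stand-in for get_fault_inject_iter(): returns the value of
 * IGT_FAULT_INJECT_ITERATION, or 0 if it is unset or not a positive number.
 */
int read_iter_from_env(void)
{
	const char *val = getenv("IGT_FAULT_INJECT_ITERATION");
	int iter = val ? atoi(val) : 0;

	return iter > 0 ? iter : 0;
}

/*
 * Mirrors the selection in probe_fail_guc(): a non-zero iter selects exactly
 * one iteration, zero selects the full sweep [0, INJECT_ITERATIONS).
 */
struct iter_range select_range(int iter)
{
	struct iter_range r;

	assert(iter >= 0);
	r.start = iter ? iter : 0;
	r.end = iter ? iter + 1 : INJECT_ITERATIONS;

	return r;
}
```

With the variable unset, select_range(read_iter_from_env()) gives the full
sweep; with IGT_FAULT_INJECT_ITERATION=7 it gives only iteration 7, which is
how a single failing iteration can be reproduced without the full run.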
> > + for (int i = iter_start; i < iter_end; i++) {
> > + fault_params->space = i;
> > + setup_injection_fault(fault_params);
> > + inject_fault_probe(fd, pci_slot, function_name);
> > + }
> > +}
> > +
> > static int
> > simple_vm_create(int fd, unsigned int flags)
> > {
> > @@ -330,6 +358,7 @@ igt_main
> > {
> > int fd;
> > char pci_slot[NAME_MAX];
> > + struct fault_injection_params fault_params;
> > const struct section {
> > const char *name;
> > unsigned int flags;
> > @@ -363,6 +392,11 @@ igt_main
> > { "xe_vma_ops_alloc" },
> > { }
> > };
> > + const struct section guc_fail_functions[] = {
> > + { "xe_guc_mmio_send_recv" },
> > + { "xe_guc_ct_send_recv" },
> > + { }
> > + };
> >
> > igt_fixture {
> > igt_require(fail_function_injection_enabled());
> > @@ -387,6 +421,13 @@ igt_main
> > igt_subtest_f("inject-fault-probe-function-%s", s->name)
> > inject_fault_probe(fd, pci_slot, s->name);
> >
> > + for (const struct section *s = guc_fail_functions; s->name; s++)
> > + igt_subtest_f("probe-fail-guc-%s", s->name) {
> > + memcpy(&fault_params, &default_fault_params,
> > + sizeof(struct fault_injection_params));
> > + probe_fail_guc(fd, pci_slot, s->name, &fault_params);
> > + }
> > +
> > igt_fixture {
> > drm_close_driver(fd);
> > xe_sysfs_driver_do(fd, pci_slot, XE_SYSFS_DRIVER_BIND);
> > --
> > 2.35.3
> >