[PATCH i-g-t v4 2/2] tests/intel/xe_fault_injection: Inject errors during xe_guc_ct_send_recv

Michal Wajdeczko michal.wajdeczko at intel.com
Wed Jan 22 15:22:26 UTC 2025


Hi,

late comments cont'd

On 22.01.2025 08:38, Satyanarayana K V P wrote:
> Use the kernel fault injection infrastructure to test error handling
> of xe at enabling of VFs stage when executing xe_guc_ct_send_recv()
> so that more code paths are tested, such as error handling and unwinding.

why wasn't xe_guc_ct_send_recv() added to probe_fail_functions[],
as was done with xe_guc_mmio_send_recv() in patch 1/2?

also, it may be worth stating clearly that 'enabling of VFs' is just a
provisioning step on the PF, without probing any VF driver

and if the goal of the test-case is to validate an 'enabling of VFs'
scenario, why don't we add some other critical functions used during
that phase, like GGTT/doorbells/LMEM allocations?

> 
> Error can be injected using:
> igt@xe_fault_injection@enable-vfs-fail-xe_guc_ct_send_recv
> 
> v2: Updated guc_fail_* to enable_vfs_*
> Added igt_skip_on(!igt_sriov_is_pf(fd)) to skip test when run without
> enabling sriov.
> 
> v3: Fixed documentation build error
> ERROR: Missing documentation for igt@xe_fault_injection@enable-vfs-fail-xe_guc_ct_send_recv
> ERROR: Unneeded documentation for igt@xe_fault_injection@guc-fail-xe_guc_ct_send_recv
> 
> v4: Fixed review comments.
> Updated igt_skip_on to igt_require.
> 
> Cc: Matthew Brost <matthew.brost at intel.com>
> Cc: Michał Wajdeczko <michal.wajdeczko at intel.com>
> Cc: Francois Dugast <francois.dugast at intel.com>
> Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p at intel.com>
> Reviewed-by: Francois Dugast <francois.dugast at intel.com>
> Reviewed-by: Marcin Bernatowicz <marcin.bernatowicz at linux.intel.com>
> ---
>  tests/intel/xe_fault_injection.c | 62 ++++++++++++++++++++++++++++++++
>  1 file changed, 62 insertions(+)
> 
> diff --git a/tests/intel/xe_fault_injection.c b/tests/intel/xe_fault_injection.c
> index 3a0e2aa29..f92f955cd 100644
> --- a/tests/intel/xe_fault_injection.c
> +++ b/tests/intel/xe_fault_injection.c
> @@ -19,12 +19,14 @@
>  #include "igt_sysfs.h"
>  #include "lib/igt_syncobj.h"
>  #include "lib/intel_pat.h"
> +#include "lib/igt_sriov_device.h"
>  #include "xe/xe_ioctl.h"
>  #include "xe/xe_query.h"
>  
>  #define INJECT_ERRNO	-ENOMEM
>  #define BO_ADDR		0x1a0000
>  #define BO_SIZE		(1024*1024)
> +#define NUM_VFS		1
>  
>  enum injection_list_action {
>  	INJECTION_LIST_ADD,
> @@ -281,6 +283,55 @@ vm_bind_fail(int fd, const char function_name[])
>  	igt_assert_eq(simple_vm_bind(fd, vm), 0);
>  }
>  
> +static int sriov_enable_vfs(int fd, int num_vfs)

this looks like a good candidate to be moved to lib/,
see below

> +{
> +	int sysfs;
> +	bool ret;
> +
> +	sysfs = igt_sysfs_open(fd);
> +	igt_assert_fd(sysfs);
> +
> +	ret = __igt_sysfs_set_u32(sysfs, "device/sriov_numvfs", num_vfs);
> +	close(sysfs);
> +
> +	return ret;
> +}
> +
> +/**
> + * SUBTEST: enable-vfs-fail-%s
> + * Description: inject an error in function %arg[1] used when xe interacts with guc to make it fail
> + * Functionality: fault
> + *
> + * arg[1]:
> + * @xe_guc_ct_send_recv:               xe_guc_ct_send_recv
> + */
> +
> +static void
> +enable_vfs_fail(int fd, int num_vfs, const char function_name[])
> +{
> +	bool autoprobe = 0;
> +
> +	ignore_faults_in_dmesg(function_name);
> +	injection_list_do(INJECTION_LIST_ADD, function_name);
> +	set_retval(function_name, INJECT_ERRNO);
> +
> +	autoprobe = igt_sriov_is_driver_autoprobe_enabled(fd);
> +
> +	if (autoprobe)
> +		igt_sriov_disable_driver_autoprobe(fd);
> +
> +	/* igt_sriov_enable_vfs can't be used here as it is causing abort on any error.
> +	 * Since error in this test is expected, we have written our own static function here.
> +	 */

hmm, if igt_sriov_enable_vfs() can't be used as-is, then instead of
adding a local static function here, a better option would be to add
something like __igt_sriov_enable_vfs() to the lib that will not abort

then we will be able to keep the 'VFs enabling logic' in one place
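
as a rough illustration of such a non-aborting variant, here is a minimal
standalone sketch in plain C. It does not use the IGT helpers:
sketch_sriov_enable_vfs() and its dirfd parameter are hypothetical names
made up for this example, and a real lib version would more likely reuse
igt_sysfs_open() and __igt_sysfs_set_u32() as the patch already does. The
only point is that the helper reports failure through its return value and
leaves the decision to assert to the caller:

```c
/* Expose openat() and friends even under strict ISO C modes. */
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not an existing IGT function): write num_vfs to the
 * "sriov_numvfs" attribute located below an already-open directory fd.
 * Returns 0 on success or a negative errno value; it never aborts.
 */
static int sketch_sriov_enable_vfs(int dirfd, unsigned int num_vfs)
{
	char buf[16];
	ssize_t written;
	int fd, len, ret = 0;

	fd = openat(dirfd, "sriov_numvfs", O_WRONLY);
	if (fd < 0)
		return -errno;

	len = snprintf(buf, sizeof(buf), "%u", num_vfs);
	written = write(fd, buf, len);
	if (written < 0)
		ret = -errno;	/* capture errno before close() can change it */
	else if (written != len)
		ret = -EIO;	/* short write, treat as an I/O error */

	close(fd);
	return ret;
}
```

with a helper shaped like this, the test could compare the return value
against INJECT_ERRNO directly instead of inspecting errno after the call.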

> +	sriov_enable_vfs(fd, num_vfs);
> +
> +	igt_assert_eq(-errno, INJECT_ERRNO);
> +	injection_list_do(INJECTION_LIST_REMOVE, function_name);
> +
> +	if (autoprobe)
> +		igt_sriov_enable_driver_autoprobe(fd);
> +}
> +
>  igt_main
>  {
>  	int fd;
> @@ -319,6 +370,10 @@ igt_main
>  		{ "xe_vma_ops_alloc" },
>  		{ }
>  	};
> +	const struct section enable_vfs_fail_functions[] = {
> +		{ "xe_guc_ct_send_recv" },

what about adding failures at:

	xe_ggtt_node_insert
	xe_bo_create_pin_map
	xe_guc_id_mgr_reserve
	xe_guc_db_mgr_reserve_range
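
i.e. the table could grow to something like the standalone sketch below.
The struct section layout (a single name member) is assumed from how the
tables in this file are initialized, count_subtests() is a hypothetical
helper added only to make the snippet checkable, and I have not verified
that each of these functions is already allowed for error injection on the
kernel side, which would be a prerequisite:

```c
#include <stddef.h>

/* Assumed layout: one entry per function name, NULL name terminates. */
struct section {
	const char *name;
};

static const struct section enable_vfs_fail_functions[] = {
	{ "xe_guc_ct_send_recv" },
	{ "xe_ggtt_node_insert" },
	{ "xe_bo_create_pin_map" },
	{ "xe_guc_id_mgr_reserve" },
	{ "xe_guc_db_mgr_reserve_range" },
	{ NULL }
};

/* Hypothetical helper: number of subtests the table would generate. */
static size_t count_subtests(const struct section *s)
{
	size_t n = 0;

	while (s->name) {
		n++;
		s++;
	}
	return n;
}
```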

> +		{ }
> +	};
>  
>  	igt_fixture {
>  		igt_require(fail_function_injection_enabled());
> @@ -335,6 +390,13 @@ igt_main
>  		igt_subtest_f("vm-bind-fail-%s", s->name)
>  			vm_bind_fail(fd, s->name);
>  
> +	for (const struct section *s = enable_vfs_fail_functions; s->name; s++)
> +		igt_subtest_f("enable-vfs-fail-%s", s->name) {
> +			/* Skip the test if not running with SRIOV */
> +			igt_require(igt_sriov_is_pf(fd));

shouldn't we just *not* define this subtest when not running on a PF?
something like:

	if (igt_sriov_is_pf(fd))
		for_each_section(s, enable_vfs_fail_functions)
			igt_subtest_f("enable-vfs-fail-%s", s->name)
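
note that for_each_section() above is not something this patch defines
(it open-codes the loop), so purely as a sketch, and assuming struct
section carries just a NULL-terminated name like the tables in this file,
such a helper could look like this. count_sections() is a hypothetical
function that only exists to show the macro in use:

```c
#include <stddef.h>

/* Assumed layout: one entry per function name, NULL name terminates. */
struct section {
	const char *name;
};

/* Hypothetical iterator: walk a table until the NULL-name sentinel. */
#define for_each_section(s, table) \
	for (const struct section *s = (table); s->name; s++)

/* Example user of the macro: count the entries before the sentinel. */
static size_t count_sections(const struct section *table)
{
	size_t n = 0;

	for_each_section(s, table)
		n++;

	return n;
}
```

the existing probe-fail and vm-bind-fail loops could use the same macro,
which would keep the iteration pattern in one place.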

> +			enable_vfs_fail(fd, NUM_VFS, s->name);
> +		}
> +
>  	igt_fixture {
>  		xe_sysfs_driver_do(fd, pci_slot, XE_SYSFS_DRIVER_UNBIND);
>  	}


