[PATCH i-g-t v4 2/2] tests/intel/xe_fault_injection: Inject errors during xe_guc_ct_send_recv
Michal Wajdeczko
michal.wajdeczko at intel.com
Wed Jan 22 15:22:26 UTC 2025
Hi,
Late comments, continued.
On 22.01.2025 08:38, Satyanarayana K V P wrote:
> Use the kernel fault injection infrastructure to test error handling
> of xe at enabling of VFs stage when executing xe_guc_ct_send_recv()
> so that more code paths are tested, such as error handling and unwinding.
Why wasn't xe_guc_ct_send_recv() added to probe_fail_functions[],
as was done with xe_guc_mmio_send_recv() in patch 1/2?
Also, it may be worth stating clearly that 'enabling of VFs' is just a
provisioning step on the PF, without probing any VF driver.
And if the goal of the test case is to validate the 'enabling of VFs'
scenario, why don't we add some other critical functions used during
that phase, such as the GGTT/doorbell/LMEM allocations?
>
> Error can be injected using:
> igt@xe_fault_injection@enable-vfs-fail-xe_guc_ct_send_recv
>
> v2: Updated guc_fail_* to enable_vfs_*
> Added igt_skip_on(!igt_sriov_is_pf(fd)) to skip test when run without
> enabling sriov.
>
> v3: Fixed documentation build error
> ERROR: Missing documentation for igt@xe_fault_injection@enable-vfs-fail-xe_guc_ct_send_recv
> ERROR: Unneeded documentation for igt@xe_fault_injection@guc-fail-xe_guc_ct_send_recv
>
> v4: Fixed review comments.
> Updated igt_skip_on to igt_require.
>
> Cc: Matthew Brost <matthew.brost at intel.com>
> Cc: Michał Wajdeczko <michal.wajdeczko at intel.com>
> Cc: Francois Dugast <francois.dugast at intel.com>
> Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p at intel.com>
> Reviewed-by: Francois Dugast <francois.dugast at intel.com>
> Reviewed-by: Marcin Bernatowicz <marcin.bernatowicz at linux.intel.com>
> ---
> tests/intel/xe_fault_injection.c | 62 ++++++++++++++++++++++++++++++++
> 1 file changed, 62 insertions(+)
>
> diff --git a/tests/intel/xe_fault_injection.c b/tests/intel/xe_fault_injection.c
> index 3a0e2aa29..f92f955cd 100644
> --- a/tests/intel/xe_fault_injection.c
> +++ b/tests/intel/xe_fault_injection.c
> @@ -19,12 +19,14 @@
> #include "igt_sysfs.h"
> #include "lib/igt_syncobj.h"
> #include "lib/intel_pat.h"
> +#include "lib/igt_sriov_device.h"
> #include "xe/xe_ioctl.h"
> #include "xe/xe_query.h"
>
> #define INJECT_ERRNO -ENOMEM
> #define BO_ADDR 0x1a0000
> #define BO_SIZE (1024*1024)
> +#define NUM_VFS 1
>
> enum injection_list_action {
> INJECTION_LIST_ADD,
> @@ -281,6 +283,55 @@ vm_bind_fail(int fd, const char function_name[])
> igt_assert_eq(simple_vm_bind(fd, vm), 0);
> }
>
> +static int sriov_enable_vfs(int fd, int num_vfs)
This looks like a good candidate to be moved to lib/;
see below.
> +{
> + int sysfs;
> + bool ret;
> +
> + sysfs = igt_sysfs_open(fd);
> + igt_assert_fd(sysfs);
> +
> + ret = __igt_sysfs_set_u32(sysfs, "device/sriov_numvfs", num_vfs);
> + close(sysfs);
> +
> + return ret;
> +}
> +
> +/**
> + * SUBTEST: enable-vfs-fail-%s
> + * Description: inject an error in function %arg[1] used when xe interacts with guc to make it fail
> + * Functionality: fault
> + *
> + * arg[1]:
> + * @xe_guc_ct_send_recv: xe_guc_ct_send_recv
> + */
> +
> +static void
> +enable_vfs_fail(int fd, int num_vfs, const char function_name[])
> +{
> + bool autoprobe = 0;
> +
> + ignore_faults_in_dmesg(function_name);
> + injection_list_do(INJECTION_LIST_ADD, function_name);
> + set_retval(function_name, INJECT_ERRNO);
> +
> + autoprobe = igt_sriov_is_driver_autoprobe_enabled(fd);
> +
> + if (autoprobe)
> + igt_sriov_disable_driver_autoprobe(fd);
> +
> + /* igt_sriov_enable_vfs can't be used here as it is causing abort on any error.
> + * Since error in this test is expected, we have written our own static function here.
> + */
Hmm, if igt_sriov_enable_vfs() can't be used as-is, then instead of
adding a local static function here, a better option would be to add
something like __igt_sriov_enable_vfs() to the lib that does not abort.
Then we will be able to keep the 'VFs enabling logic' in one place.
> + sriov_enable_vfs(fd, num_vfs);
> +
> + igt_assert_eq(-errno, INJECT_ERRNO);
> + injection_list_do(INJECTION_LIST_REMOVE, function_name);
> +
> + if (autoprobe)
> + igt_sriov_enable_driver_autoprobe(fd);
> +}
> +
> igt_main
> {
> int fd;
> @@ -319,6 +370,10 @@ igt_main
> { "xe_vma_ops_alloc" },
> { }
> };
> + const struct section enable_vfs_fail_functions[] = {
> + { "xe_guc_ct_send_recv" },
What about also injecting failures at:
	xe_ggtt_node_insert
	xe_bo_create_pin_map
	xe_guc_id_mgr_reserve
	xe_guc_db_mgr_reserve_range
> + { }
> + };
>
> igt_fixture {
> igt_require(fail_function_injection_enabled());
> @@ -335,6 +390,13 @@ igt_main
> igt_subtest_f("vm-bind-fail-%s", s->name)
> vm_bind_fail(fd, s->name);
>
> + for (const struct section *s = enable_vfs_fail_functions; s->name; s++)
> + igt_subtest_f("enable-vfs-fail-%s", s->name) {
> + /* Skip the test if not running with SRIOV */
> + igt_require(igt_sriov_is_pf(fd));
Shouldn't we just *not* define this subtest when the device is not a PF?
Something like:
	if (igt_sriov_is_pf(fd))
		for_each_section(s, enable_vfs_fail_functions)
			igt_subtest_f("enable-vfs-fail-%s", s->name)
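The control flow suggested above can be sketched in standalone C as follows. The for_each_section macro and count_subtests() are hypothetical stand-ins written for illustration; the table contains the entry from the patch plus the functions proposed earlier in this review. Whether IGT's subtest enumeration copes well with conditionally defined subtests is not something this sketch checks.

```c
#include <stdbool.h>
#include <stddef.h>

struct section {
	const char *name;
};

/* Hypothetical helper macro: walk a table terminated by a NULL name. */
#define for_each_section(s, table) \
	for (const struct section *s = (table); s->name; s++)

/* Entry from the patch, extended with the functions proposed in this review. */
const struct section enable_vfs_fail_functions[] = {
	{ "xe_guc_ct_send_recv" },
	{ "xe_ggtt_node_insert" },
	{ "xe_bo_create_pin_map" },
	{ "xe_guc_id_mgr_reserve" },
	{ "xe_guc_db_mgr_reserve_range" },
	{ NULL }
};

/*
 * Stand-in for subtest registration: counts one "subtest" per table
 * entry, but only when the device is a PF. In the real test the loop
 * body would be the igt_subtest_f() block.
 */
size_t count_subtests(bool is_pf, const struct section *table)
{
	size_t n = 0;

	if (is_pf)
		for_each_section(s, table)
			n++;

	return n;
}
```

On a non-PF device the loop is never entered, so no enable-vfs-fail-* subtest is defined, rather than each one being defined and then skipped.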
> + enable_vfs_fail(fd, NUM_VFS, s->name);
> + }
> +
> igt_fixture {
> xe_sysfs_driver_do(fd, pci_slot, XE_SYSFS_DRIVER_UNBIND);
> }