[PATCH v3] tests/intel/xe_fault_injection: Ignore all errors while injecting fault
Cavitt, Jonathan
jonathan.cavitt at intel.com
Wed Jun 4 17:09:47 UTC 2025
-----Original Message-----
From: Kamil Konieczny <kamil.konieczny at linux.intel.com>
Sent: Wednesday, June 4, 2025 10:04 AM
To: Cavitt, Jonathan <jonathan.cavitt at intel.com>
Cc: igt-dev at lists.freedesktop.org; Gupta, saurabhg <saurabhg.gupta at intel.com>; Zuo, Alex <alex.zuo at intel.com>; K V P, Satyanarayana <satyanarayana.k.v.p at intel.com>; Wajdeczko, Michal <Michal.Wajdeczko at intel.com>; Ceraolo Spurio, Daniele <daniele.ceraolospurio at intel.com>; De Marchi, Lucas <lucas.demarchi at intel.com>; Dugast, Francois <francois.dugast at intel.com>; Vivi, Rodrigo <rodrigo.vivi at intel.com>; Harrison, John C <john.c.harrison at intel.com>
Subject: Re: [PATCH v3] tests/intel/xe_fault_injection: Ignore all errors while injecting fault
>
> Hi Jonathan,
> On 2025-06-04 at 16:19:23 +0000, Jonathan Cavitt wrote:
> > From: Satyanarayana K V P <satyanarayana.k.v.p at intel.com>
> >
> > Currently, numerous fault messages have been included in the dmesg
> > ignore list, and this list continues to expand. Each time a new fault
> > injection point is introduced or a new feature is activated, additional
> > fault messages appear, making it cumbersome to manage the dmesg ignore
> > list.
> >
> > However, we can safely assert that all dmesg reports that contain
> > *ERROR* in their message can be ignored, so add them to the dmesg ignore
> > list. This unfortunately does not include the device probe error
> > itself, so that must be added separately.
> >
> > While we're here, we should also assert that any errors we see are only
> > coming from the target PCI device.
> >
> > v2:
> > - Only ignore error-level dmesg reports (or, at least, reports with
> > *ERROR* in them), and device probe failures
> > - Add PCI data to regex (Michal)
> >
> > v3: (Michal)
> > - Revert name change
> > - Add change log
> > - Remove fixes tag from commit
> > - Rename ignore_faults_in_dmesg to igt_ignore_dmesg_errors_from_dut, and
> > move to lib/igt_core.c
> > - Minor code fixes
> >
> > v4:
> > - Return ignore_faults_in_dmesg to tests/intel/xe_fault_injection.c, but
> > keep it renamed to ignore_dmesg_errors_from_dut (Kamil)
> >
> > Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p at intel.com>
> > Signed-off-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
> > Suggested-by: Michal Wajdeczko <michal.wajdeczko at intel.com>
> > Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
> > Suggested-by: Lucas De Marchi <lucas.demarchi at intel.com>
> > Cc: Francois Dugast <francois.dugast at intel.com>
> > Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
> > Cc: John Harrison <john.c.harrison at intel.com>
> > Cc: Kamil Konieczny <kamil.konieczny at linux.intel.com>
> > ---
> > lib/igt_core.c | 1 +
> > tests/intel/xe_fault_injection.c | 39 ++++++++++++--------------------
> > 2 files changed, 16 insertions(+), 24 deletions(-)
> >
> > diff --git a/lib/igt_core.c b/lib/igt_core.c
> > index b06cdfd894..ad70718b4a 100644
> > --- a/lib/igt_core.c
> > +++ b/lib/igt_core.c
> > @@ -76,6 +76,7 @@
> > #include "igt_rc.h"
> > #include "igt_list.h"
> > #include "igt_map.h"
> > +#include "igt_device.h"
>
> Please drop this change.
This was left in by accident. It has been removed now.
-Jonathan Cavitt
>
> Regards,
> Kamil
>
> > #include "igt_device_scan.h"
> > #include "igt_thread.h"
> > #include "igt_vec.h"
> > diff --git a/tests/intel/xe_fault_injection.c b/tests/intel/xe_fault_injection.c
> > index 9fe6bfe351..14aaeebf5e 100644
> > --- a/tests/intel/xe_fault_injection.c
> > +++ b/tests/intel/xe_fault_injection.c
> > @@ -64,28 +64,19 @@ static int fail_function_open(void)
> > return debugfs_fail_function_dir_fd;
> > }
> >
> > -static bool function_is_part_of_guc(const char function_name[])
> > +static void ignore_dmesg_errors_from_dut(int fd)
> > {
> > - return strstr(function_name, "_guc_") != NULL ||
> > - strstr(function_name, "_uc_") != NULL ||
> > - strstr(function_name, "_wopcm_") != NULL;
> > -}
> > -
> > -static void ignore_faults_in_dmesg(const char function_name[])
> > -{
> > - /* Driver probe is expected to fail in all cases, so ignore in igt_runner */
> > - char regex[1024] = "probe with driver xe failed with error -12";
> > -
> > /*
> > - * If GuC module fault is injected, GuC is expected to fail,
> > - * so also ignore GuC init failures in igt_runner.
> > + * Driver probe is expected to fail in all cases, so ignore in igt_runner.
> > + * Additionally, all error-level reports are expected, so ignore those as well.
> > */
> > - if (function_is_part_of_guc(function_name)) {
> > - strcat(regex, "|GT[0-9a-fA-F]*: GuC init failed with -ENOMEM");
> > - strcat(regex, "|GT[0-9a-fA-F]*: Failed to initialize uC .-ENOMEM");
> > - strcat(regex, "|GT[0-9a-fA-F]*: Failed to enable GuC CT .-ENOMEM");
> > - strcat(regex, "|GT[0-9a-fA-F]*: GuC PC query task state failed: -ENOMEM");
> > - }
> > + static const char *store = "probe with driver xe failed with error|\\*ERROR\\*";
> > + char pci_slot[NAME_MAX];
> > + char regex[1024];
> > +
> > + /* Only block dmesg reports that target the pci slot of the given fd */
> > + igt_device_get_pci_slot_name(fd, pci_slot);
> > + snprintf(regex, sizeof(regex), "%s:.*(%s)", pci_slot, store);
> >
> > igt_emit_ignore_dmesg_regex(regex);
> > }
> > @@ -234,7 +225,7 @@ inject_fault_probe(int fd, char pci_slot[], const char function_name[])
> > igt_info("Injecting error \"%s\" (%d) in function \"%s\"\n",
> > strerror(-INJECT_ERRNO), INJECT_ERRNO, function_name);
> >
> > - ignore_faults_in_dmesg(function_name);
> > + ignore_dmesg_errors_from_dut(fd);
> > injection_list_add(function_name);
> > set_retval(function_name, INJECT_ERRNO);
> >
> > @@ -299,7 +290,7 @@ exec_queue_create_fail(int fd, struct drm_xe_engine_class_instance *instance,
> > igt_assert_eq(__xe_exec_queue_create(fd, vm, 1, 1, instance, 0, &exec_queue_id), 0);
> > xe_exec_queue_destroy(fd, exec_queue_id);
> >
> > - ignore_faults_in_dmesg(function_name);
> > + ignore_dmesg_errors_from_dut(fd);
> > injection_list_add(function_name);
> > set_retval(function_name, INJECT_ERRNO);
> > igt_assert(__xe_exec_queue_create(fd, vm, 1, 1, instance, 0, &exec_queue_id) != 0);
> > @@ -334,7 +325,7 @@ vm_create_fail(int fd, const char function_name[], unsigned int flags)
> > {
> > igt_assert_eq(simple_vm_create(fd, flags), 0);
> >
> > - ignore_faults_in_dmesg(function_name);
> > + ignore_dmesg_errors_from_dut(fd);
> > injection_list_add(function_name);
> > set_retval(function_name, INJECT_ERRNO);
> > igt_assert(simple_vm_create(fd, flags) != 0);
> > @@ -397,7 +388,7 @@ vm_bind_fail(int fd, const char function_name[])
> >
> > igt_assert_eq(simple_vm_bind(fd, vm), 0);
> >
> > - ignore_faults_in_dmesg(function_name);
> > + ignore_dmesg_errors_from_dut(fd);
> > injection_list_add(function_name);
> > set_retval(function_name, INJECT_ERRNO);
> > igt_assert(simple_vm_bind(fd, vm) != 0);
> > @@ -445,7 +436,7 @@ oa_add_config_fail(int fd, int sysfs, int devid, const char function_name[])
> > igt_assert(igt_sysfs_scanf(sysfs, path, "%" PRIu64, &config_id) == 1);
> > igt_assert_eq(intel_xe_perf_ioctl(fd, DRM_XE_OBSERVATION_OP_REMOVE_CONFIG, &config_id), 0);
> >
> > - ignore_faults_in_dmesg(function_name);
> > + ignore_dmesg_errors_from_dut(fd);
> > injection_list_add(function_name);
> > set_retval(function_name, INJECT_ERRNO);
> > igt_assert_lt(intel_xe_perf_ioctl(fd, DRM_XE_OBSERVATION_OP_ADD_CONFIG, &config), 0);
> > --
> > 2.43.0
> >
>
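
For anyone who wants to check the new pattern outside of IGT, the regex
that ignore_dmesg_errors_from_dut() builds can be exercised in isolation.
The sketch below is only an illustration: it uses POSIX regcomp()/regexec()
with REG_EXTENDED, which may not behave identically to the engine
igt_runner uses, and the helper names build_regex() and line_matches() as
well as any PCI slot or dmesg text passed to them are hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>

/* Same alternation as the patch: probe failure or any *ERROR* report. */
static const char *store = "probe with driver xe failed with error|\\*ERROR\\*";

/* Hypothetical helper: build "<slot>:.*(<store>)" like the patch's snprintf(). */
static void build_regex(char *buf, size_t len, const char *pci_slot)
{
	int n = snprintf(buf, len, "%s:.*(%s)", pci_slot, store);

	/* Catch encoding errors and truncation of the pattern. */
	assert(n > 0 && (size_t)n < len);
}

/* Hypothetical helper: 1 if line matches pattern, 0 if not, -1 on bad pattern. */
static int line_matches(const char *pattern, const char *line)
{
	regex_t re;
	int ret;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB))
		return -1;

	ret = regexec(&re, line, 0, NULL, 0) == 0;
	regfree(&re);

	return ret;
}
```

With a made-up slot such as "0000:03:00.0", a line carrying that slot and
either *ERROR* or the probe failure text should match, while the same text
from a different slot should not. Note that the '.' in the slot name is
unescaped, so it matches any character; this sketch keeps that behavior
as-is.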