[PATCH v3] tests/intel/xe_fault_injection: Ignore expected errors

Cavitt, Jonathan jonathan.cavitt at intel.com
Wed Nov 20 19:28:02 UTC 2024


-----Original Message-----
From: Kamil Konieczny <kamil.konieczny at linux.intel.com> 
Sent: Wednesday, November 20, 2024 10:50 AM
To: igt-dev at lists.freedesktop.org
Cc: Cavitt, Jonathan <jonathan.cavitt at intel.com>; Gupta, saurabhg <saurabhg.gupta at intel.com>; Zuo, Alex <alex.zuo at intel.com>; Dugast, Francois <francois.dugast at intel.com>; De Marchi, Lucas <lucas.demarchi at intel.com>; Brost, Matthew <matthew.brost at intel.com>; Vivi, Rodrigo <rodrigo.vivi at intel.com>; Wajdeczko, Michal <Michal.Wajdeczko at intel.com>
Subject: Re: [PATCH v3] tests/intel/xe_fault_injection: Ignore expected errors
> 
> Hi Jonathan,
> On 2024-11-19 at 21:50:27 +0000, Jonathan Cavitt wrote:
> > The following errors can be observed when running the xe_fault_injection
> > subtests:
> > 
> > [drm] *ERROR* GT0: GuC init failed with -ENOMEM
> > [drm] *ERROR* GT0: Failed to initialize uC (-ENOMEM)
> > probe with driver xe failed with error -12
> > 
> > Add these messages to the dmesg ignore regex for the applicable tests
> > (specifically, all tests for the last error, and all tests that target
> > GuC subsystems for the first two errors).
> > 
> > v2:
> > - Fix and merge regex (Kamil)
> > 
> > v3:
> > - Rebase change to be compatible with latest revision (Kamil)
> > 
> > Closes: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3343
> > Signed-off-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
> > CC: Francois Dugast <francois.dugast at intel.com>
> > CC: Lucas De Marchi <lucas.demarchi at intel.com>
> > CC: Matthew Brost <matthew.brost at intel.com>
> > CC: Rodrigo Vivi <rodrigo.vivi at intel.com>
> > CC: Michal Wajdeczko <michal.wajdeczko at intel.com>
> > CC: Kamil Konieczny <kamil.konieczny at linux.intel.com>
> 
> My r-b holds so
> Reviewed-by: Kamil Konieczny <kamil.konieczny at linux.intel.com>
> 
> Please respond to the CI failures and also look into failures
> related to these tests. I checked a few and they seem unrelated.

The failures seen in CI.XEBAT and FI.CI.BAT are all unrelated: this change only
adds dmesg ignore regexes to the tests in the xe_fault_injection test suite, so
it should have no effect on the other tests where the new errors are occurring.

The failures in XE.CI.FULL are not fully reported yet, so I cannot respond to
them directly as of now.  However, the logs show no sign of the previously
observed error; instead, every log I've seen reports the following error
message:

<3> [377.478719] xe 0000:03:00.0: [drm] *ERROR* audio power refcount 1 after unbind

This error message was also present in prior revisions of the test, so it is
not new and is very likely unrelated to the change here.

I'll at least respond directly to the CI errors I have access to.
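
For reference, here is a minimal standalone sketch of how the ignore regex
from the patch could be checked against the messages quoted in this thread.
It rebuilds the same pattern as ignore_faults_in_dmesg() and matches it with
POSIX extended regular expressions.  Assumptions: igt_runner's matching engine
may not behave exactly like regcomp() with REG_EXTENDED, and the helper names
build_ignore_regex() and dmesg_line_ignored() are hypothetical, not part of
IGT.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Same name-based check as in the patch. */
static bool function_is_part_of_guc(const char function_name[])
{
	return strstr(function_name, "_guc_") != NULL ||
	       strstr(function_name, "_uc_") != NULL ||
	       strstr(function_name, "_wopcm_") != NULL;
}

/*
 * Hypothetical helper: builds the same pattern as ignore_faults_in_dmesg(),
 * but into a caller-supplied buffer with bounded appends.
 */
static void build_ignore_regex(const char function_name[], char *buf, size_t len)
{
	assert(buf != NULL && len > 0);

	/* Driver probe is expected to fail in all cases. */
	snprintf(buf, len, "probe with driver xe failed with error -12");

	/* GuC init failures are only expected for GuC-related injections. */
	if (function_is_part_of_guc(function_name)) {
		strncat(buf, "|GT[0-9a-fA-F]*: GuC init failed with -ENOMEM",
			len - strlen(buf) - 1);
		strncat(buf, "|GT[0-9a-fA-F]*: Failed to initialize uC .-ENOMEM",
			len - strlen(buf) - 1);
	}
}

/*
 * Hypothetical helper: true if the dmesg line contains a match for the
 * pattern.  Assumes POSIX ERE semantics and an unanchored search.
 */
static bool dmesg_line_ignored(const char *pattern, const char *line)
{
	regex_t re;
	bool hit;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return false;

	hit = regexec(&re, line, 0, NULL, 0) == 0;
	regfree(&re);

	return hit;
}
```

Under these assumptions, a pattern built for a GuC-related function name
matches all three messages from the commit message but not the "audio power
refcount" line above, and a pattern built for a non-GuC function name matches
only the probe failure.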

-Jonathan Cavitt

> 
> Regards,
> Kamil
> 
> > ---
> >  tests/intel/xe_fault_injection.c | 27 +++++++++++++++++++++++++++
> >  1 file changed, 27 insertions(+)
> > 
> > diff --git a/tests/intel/xe_fault_injection.c b/tests/intel/xe_fault_injection.c
> > index 1b29041745..7d6c902761 100644
> > --- a/tests/intel/xe_fault_injection.c
> > +++ b/tests/intel/xe_fault_injection.c
> > @@ -50,6 +50,30 @@ static int fail_function_open(void)
> >  	return debugfs_fail_function_dir_fd;
> >  }
> >  
> > +static bool function_is_part_of_guc(const char function_name[])
> > +{
> > +	return strstr(function_name, "_guc_") != NULL ||
> > +	       strstr(function_name, "_uc_") != NULL ||
> > +	       strstr(function_name, "_wopcm_") != NULL;
> > +}
> > +
> > +static void ignore_faults_in_dmesg(const char function_name[])
> > +{
> > +	/* Driver probe is expected to fail in all cases, so ignore in igt_runner */
> > +	char regex[1024] = "probe with driver xe failed with error -12";
> > +
> > +	/*
> > +	 * If GuC module fault is injected, GuC is expected to fail,
> > +	 * so also ignore GuC init failures in igt_runner.
> > +	 */
> > +	if (function_is_part_of_guc(function_name)) {
> > +		strcat(regex, "|GT[0-9a-fA-F]*: GuC init failed with -ENOMEM");
> > +		strcat(regex, "|GT[0-9a-fA-F]*: Failed to initialize uC .-ENOMEM");
> > +	}
> > +
> > +	igt_emit_ignore_dmesg_regex(regex);
> > +}
> > +
> >  /*
> >   * The injectable file requires CONFIG_FUNCTION_ERROR_INJECTION in kernel.
> >   */
> > @@ -152,6 +176,7 @@ inject_fault_probe(int fd, char pci_slot[], const char function_name[])
> >  	igt_info("Injecting error \"%s\" (%d) in function \"%s\"\n",
> >  		 strerror(-INJECT_ERRNO), INJECT_ERRNO, function_name);
> >  
> > +	ignore_faults_in_dmesg(function_name);
> >  	injection_list_do(INJECTION_LIST_ADD, function_name);
> >  	set_retval(function_name, INJECT_ERRNO);
> >  	xe_sysfs_driver_do(fd, pci_slot, XE_SYSFS_DRIVER_TRY_BIND);
> > @@ -184,6 +209,7 @@ vm_create_fail(int fd, const char function_name[], unsigned int flags)
> >  {
> >  	igt_assert_eq(simple_vm_create(fd, flags), 0);
> >  
> > +	ignore_faults_in_dmesg(function_name);
> >  	injection_list_do(INJECTION_LIST_ADD, function_name);
> >  	set_retval(function_name, INJECT_ERRNO);
> >  	igt_assert(simple_vm_create(fd, flags) != 0);
> > @@ -243,6 +269,7 @@ vm_bind_fail(int fd, const char function_name[])
> >  
> >  	igt_assert_eq(simple_vm_bind(fd, vm), 0);
> >  
> > +	ignore_faults_in_dmesg(function_name);
> >  	injection_list_do(INJECTION_LIST_ADD, function_name);
> >  	set_retval(function_name, INJECT_ERRNO);
> >  	igt_assert(simple_vm_bind(fd, vm) != 0);
> > -- 
> > 2.43.0
> > 
> 

