[PATCH i-g-t] tests/intel/xe_fault_injection: Ignore enable activity stats error
Cavitt, Jonathan
jonathan.cavitt at intel.com
Wed May 28 22:36:17 UTC 2025
-----Original Message-----
From: Wajdeczko, Michal <Michal.Wajdeczko at intel.com>
Sent: Wednesday, May 28, 2025 2:54 PM
To: K V P, Satyanarayana <satyanarayana.k.v.p at intel.com>; igt-dev at lists.freedesktop.org; De Marchi, Lucas <lucas.demarchi at intel.com>; Vivi, Rodrigo <rodrigo.vivi at intel.com>
Cc: Cavitt, Jonathan <jonathan.cavitt at intel.com>; Harrison, John C <john.c.harrison at intel.com>
Subject: Re: [PATCH i-g-t] tests/intel/xe_fault_injection: Ignore enable activity stats error
>
> On 28.05.2025 15:30, Satyanarayana K V P wrote:
> > Add some more GuC fault messages in the dmesg ignore list.
> >
> > Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p at intel.com>
> > ---
> > Cc: Jonathan Cavitt <jonathan.cavitt at intel.com>
> > Cc: John Harrison <John.C.Harrison at Intel.com>
> > ---
> > tests/intel/xe_fault_injection.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/tests/intel/xe_fault_injection.c b/tests/intel/xe_fault_injection.c
> > index f9bd5c761..4ca8a34bf 100644
> > --- a/tests/intel/xe_fault_injection.c
> > +++ b/tests/intel/xe_fault_injection.c
> > @@ -85,6 +85,7 @@ static void ignore_faults_in_dmesg(const char function_name[])
> > strcat(regex, "|GT[0-9a-fA-F]*: Failed to initialize uC .-ENOMEM");
> > strcat(regex, "|GT[0-9a-fA-F]*: Failed to enable GuC CT .-ENOMEM");
> > strcat(regex, "|GT[0-9a-fA-F]*: GuC PC query task state failed: -ENOMEM");
> > + strcat(regex, "|GT[0-9a-fA-F]*: failed to enable activity stats-[0-9]*");
>
> are we really going to add more and more specific error messages to the
> IGT filter just to make CI happy? note that this could be never ending
> story and still very fragile to any changes on the driver side in
> functions or component reordering or improvements in messages or driver
> attempts to recover from non-fatal errors
>
> can't we just ignore *ALL* error messages during fault injection tests?
>
> after all, as we are injecting errors, our only expectation should be
> that driver will not crash or hit asserts, regardless how many error
> messages will be printed in the meantime, as we don't expect error free
> run here and we can't expect specific error message either
This is pretty similar to something I was going to remark on when I was
providing my RB for this patch earlier, but that I elected to cut from my
email. Basically, yes, it feels like we're playing a constant game of
cat-and-mouse with these various GuC errors, all for the sake of making
a line on a bar slightly more green. And the endless pursuit of
suppressing these errors is just going to cause the regex list to balloon
out of control eventually.
However, we're currently "only" suppressing 5 or 6 errors, and I don't
think suppressing every dmesg error here is wise, per se. For example, if
we see an error from a module that's loaded before the GuC module during
the GuC fault injection tests, I don't think that would be expected. And if
we failed to initialize uC for non-ENOMEM reasons, that would also be cause
for concern, no?
But for the sake of argument, this is what that would look like in code (I think):
"""
static void ignore_faults_in_dmesg(void)
{
	/*
	 * Fault injection tests should only fail on hard errors and crashes.
	 * Ignore all other dmesg errors and warnings.
	 */
	igt_emit_ignore_dmesg_regex(".*");
}
"""
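To make the trade-off concrete, here is a minimal standalone sketch that compares one of the existing narrow alternatives against the catch-all. It is only an illustration under assumptions: it uses POSIX extended regexes from <regex.h> with unanchored search, which may not be exactly how the IGT runner evaluates the ignore list, and dmesg_line_ignored() is a hypothetical helper, not an IGT function.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* One alternative copied from the existing ignore list in the patch context. */
static const char narrow_regex[] =
	"GT[0-9a-fA-F]*: Failed to initialize uC .-ENOMEM";

/* The catch-all pattern from the sketch above. */
static const char catch_all_regex[] = ".*";

/*
 * Hypothetical helper (not an IGT function): returns true if `line`
 * contains a match for the POSIX extended regex `pattern`.
 */
static bool dmesg_line_ignored(const char *pattern, const char *line)
{
	regex_t re;
	int rc;
	bool matched;

	rc = regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB);
	assert(rc == 0); /* both patterns used here are known to compile */
	(void)rc;

	/* regexec() searches anywhere in the line; the pattern is not anchored. */
	matched = regexec(&re, line, 0, NULL, 0) == 0;
	regfree(&re);

	return matched;
}
```

With a made-up line such as "GT0: Failed to initialize uC (-ENOMEM)", both patterns report the line as ignored. With "GT0: Failed to initialize uC (-EIO)", only the catch-all does, which is the non-ENOMEM case that would get hidden.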
-Jonathan Cavitt
>
> > }
> >
> > igt_emit_ignore_dmesg_regex(regex);
>
>