[PATCH i-g-t v5] tests/intel/xe_fault_injection: Suppress Guc CT dumps during fault injection
Cavitt, Jonathan
jonathan.cavitt at intel.com
Thu Jun 12 16:49:31 UTC 2025
-----Original Message-----
From: K V P, Satyanarayana <satyanarayana.k.v.p at intel.com>
Sent: Thursday, June 12, 2025 9:38 AM
To: Cavitt, Jonathan <jonathan.cavitt at intel.com>; igt-dev at lists.freedesktop.org
Cc: Harrison, John C <john.c.harrison at intel.com>; Dugast, Francois <francois.dugast at intel.com>; K V P, Satyanarayana <satyanarayana.k.v.p at intel.com>
Subject: RE: [PATCH i-g-t v5] tests/intel/xe_fault_injection: Suppress Guc CT dumps during fault injection
>
> > -----Original Message-----
> > From: Cavitt, Jonathan <jonathan.cavitt at intel.com>
> > Sent: Thursday, June 12, 2025 7:50 PM
> > To: K V P, Satyanarayana <satyanarayana.k.v.p at intel.com>; igt-
> > dev at lists.freedesktop.org
> > Cc: K V P, Satyanarayana <satyanarayana.k.v.p at intel.com>; Harrison, John C
> > <john.c.harrison at intel.com>; Dugast, Francois <francois.dugast at intel.com>;
> > Cavitt, Jonathan <jonathan.cavitt at intel.com>
> > Subject: RE: [PATCH i-g-t v5] tests/intel/xe_fault_injection: Suppress Guc CT
> > dumps during fault injection
> >
> > -----Original Message-----
> > From: Satyanarayana K V P <satyanarayana.k.v.p at intel.com>
> > Sent: Thursday, June 12, 2025 1:15 AM
> > To: igt-dev at lists.freedesktop.org
> > Cc: K V P, Satyanarayana <satyanarayana.k.v.p at intel.com>; Harrison, John C
> > <john.c.harrison at intel.com>; Cavitt, Jonathan <jonathan.cavitt at intel.com>;
> > Dugast, Francois <francois.dugast at intel.com>
> > Subject: [PATCH i-g-t v5] tests/intel/xe_fault_injection: Suppress Guc CT
> > dumps during fault injection
> > >
> > > When injecting fault to xe_guc_ct_send_recv() & xe_guc_mmio_send_recv()
> > > functions, the CI test systems are going out of space and crashing. To
> > > avoid this issue, a new helper function is created and when fault is
> > > injected into this xe_should_fail_ct_dead_capture() helper function,
> > > ct dead capture is avoided which suppresses ct dumps in the log.
> > >
> > > Inject fault into xe_should_fail_ct_dead_capture() function along with
> > > xe_guc_ct_send_recv() & xe_guc_mmio_send_recv() to suppress GUC ct
> > dumps.
> > >
> > > Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p at intel.com>
> > > Suggested-by: John Harrison <John.C.Harrison at Intel.com>
> > > Cc: Jonathan Cavitt <jonathan.cavitt at intel.com>
> > > Cc: Francois Dugast <francois.dugast at intel.com>
> > > ---
> > > Same as https://patchwork.freedesktop.org/series/148416/ which was
> > > reverted due to change from XE was still in review.
> > >
> > > V4 -> V5:
> > > - Fixed review comments (Jonathan).
> > >
> > > Test-with: 20250612080402.22011-1-satyanarayana.k.v.p at intel.com
> > > ---
> > > tests/intel/xe_fault_injection.c | 30 ++++++++++++++++++++++++++++++
> > > 1 file changed, 30 insertions(+)
> > >
> > > diff --git a/tests/intel/xe_fault_injection.c b/tests/intel/xe_fault_injection.c
> > > index aa3a3a7c2..8245c558c 100644
> > > --- a/tests/intel/xe_fault_injection.c
> > > +++ b/tests/intel/xe_fault_injection.c
> > > @@ -113,6 +113,22 @@ static void injection_list_add(const char
> > function_name[])
> > > close(dir);
> > > }
> > >
> > > +static void injection_list_append(const char function_name[])
> > > +{
> > > + int dir, fd, ret;
> > > +
> > > + dir = fail_function_open();
> > > + igt_assert_lte(0, dir);
> > > +
> > > + fd = openat(dir, "inject", O_WRONLY | O_APPEND);
> > > + igt_assert_lte(0, fd);
> > > + ret = write(fd, function_name, strlen(function_name));
> > > + igt_assert_lte(0, ret);
> > > +
> > > + close(fd);
> > > + close(dir);
> > > +}
> > > +
> > > static void injection_list_remove(const char function_name[])
> > > {
> > > int dir;
> > > @@ -192,6 +208,18 @@ static void set_retval(const char function_name[],
> > long long retval)
> > > close(dir);
> > > }
> > >
> > > +static void ignore_fail_dump_in_dmesg(const char function_name[], bool
> > enable)
> > > +{
> > > + if (strstr(function_name, "send_recv")) {
> > > + if (enable) {
> > > + injection_list_append("xe_is_injection_active");
> > > + set_retval("xe_is_injection_active", INJECT_ERRNO);
> > > + } else {
> > > + injection_list_remove("xe_is_injection_active");
> > > + }
> > > + }
> > > +}
> > > +
> > > /**
> > > * SUBTEST: inject-fault-probe-function-%s
> > > * Description: inject an error in the injectable function %arg[1] then
> > > @@ -227,11 +255,13 @@ inject_fault_probe(int fd, const char pci_slot[],
> > const char function_name[])
> > > ignore_dmesg_errors_from_dut(pci_slot);
> > > injection_list_add(function_name);
> > > set_retval(function_name, INJECT_ERRNO);
> > > + ignore_fail_dump_in_dmesg(function_name, true);
> > >
> > > igt_kmod_bind("xe", pci_slot);
> > >
> > > err = -errno;
> > > injection_list_remove(function_name);
> > > + ignore_fail_dump_in_dmesg(function_name, false);
> >
> > This is better, though I think there's a stigma against giving a single function a
> > boolean
> > mode switch like this. I don't know where that stigma came from, but it might
> > be
> > preferrable for you to just break the "false" case out and run it here directly.
> >
> > Perhaps something like:
> >
> > """
> > static bool ignore_fail_dump_in_dmesg(const char function_name[])
> > {
> > bool ret = !!strstr(function_name, "send_recv");
> >
> > if (ret) {
> > injection_list_append("xe_is_injection_active");
> > set_retval("xe_is_injection_active", INJECT_ERRNO);
> > }
> > return ret;
> > }
> > ...
> > ignore_dmesg_errors_from_dut(pci_slot);
> > injection_list_add(function_name);
> > set_retval(function_name, INJECT_ERRNO);
> > ignore_dump = ignore_fail_dump_in_dmesg(function_name);
> >
> > igt_kmod_bind("xe", pci_slot);
> >
> > err = -errno;
> > injection_list_remove(function_name);
> > if (ignore_dump)
> > injection_list_remove("xe_is_injection_active");
> > """
> >
> > If you do that, you can have my
> > Reviewed-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
> > -Jonathan Cavitt
> >
> I do not think, I am going to implement above method as it increases LOC count
> and not seeing any benefit with implementing this.
> Let me know if you see any benefit with suggested approach.
This isn't so much about the lines of change or the performance. It's about the
expected style guidelines used for IGT. I'll admit that I might not be remembering
the style guidelines quite correctly and that this function could be acceptably within
the given guidelines, though, so I'll give my reviewed-by for now and request you
wait for someone else to also give this a look.
Reviewed-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
-Jonathan Cavitt
> -Satya.
> > > return err;
> > > }
> > > --
> > > 2.43.0
> > >
> > >
>
More information about the igt-dev
mailing list