[PATCH i-g-t v5] tests/intel/xe_fault_injection: Suppress Guc CT dumps during fault injection
K V P, Satyanarayana
satyanarayana.k.v.p at intel.com
Thu Jun 12 16:38:00 UTC 2025
> -----Original Message-----
> From: Cavitt, Jonathan <jonathan.cavitt at intel.com>
> Sent: Thursday, June 12, 2025 7:50 PM
> To: K V P, Satyanarayana <satyanarayana.k.v.p at intel.com>; igt-
> dev at lists.freedesktop.org
> Cc: K V P, Satyanarayana <satyanarayana.k.v.p at intel.com>; Harrison, John C
> <john.c.harrison at intel.com>; Dugast, Francois <francois.dugast at intel.com>;
> Cavitt, Jonathan <jonathan.cavitt at intel.com>
> Subject: RE: [PATCH i-g-t v5] tests/intel/xe_fault_injection: Suppress Guc CT
> dumps during fault injection
>
> -----Original Message-----
> From: Satyanarayana K V P <satyanarayana.k.v.p at intel.com>
> Sent: Thursday, June 12, 2025 1:15 AM
> To: igt-dev at lists.freedesktop.org
> Cc: K V P, Satyanarayana <satyanarayana.k.v.p at intel.com>; Harrison, John C
> <john.c.harrison at intel.com>; Cavitt, Jonathan <jonathan.cavitt at intel.com>;
> Dugast, Francois <francois.dugast at intel.com>
> Subject: [PATCH i-g-t v5] tests/intel/xe_fault_injection: Suppress Guc CT
> dumps during fault injection
> >
> > When injecting fault to xe_guc_ct_send_recv() & xe_guc_mmio_send_recv()
> > functions, the CI test systems are going out of space and crashing. To
> > avoid this issue, a new helper function is created and when fault is
> > injected into this xe_should_fail_ct_dead_capture() helper function,
> > ct dead capture is avoided which suppresses ct dumps in the log.
> >
> > Inject fault into xe_should_fail_ct_dead_capture() function along with
> > xe_guc_ct_send_recv() & xe_guc_mmio_send_recv() to suppress GUC ct
> dumps.
> >
> > Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p at intel.com>
> > Suggested-by: John Harrison <John.C.Harrison at Intel.com>
> > Cc: Jonathan Cavitt <jonathan.cavitt at intel.com>
> > Cc: Francois Dugast <francois.dugast at intel.com>
> > ---
> > Same as https://patchwork.freedesktop.org/series/148416/ which was
> > reverted due to change from XE was still in review.
> >
> > V4 -> V5:
> > - Fixed review comments (Jonathan).
> >
> > Test-with: 20250612080402.22011-1-satyanarayana.k.v.p at intel.com
> > ---
> > tests/intel/xe_fault_injection.c | 30 ++++++++++++++++++++++++++++++
> > 1 file changed, 30 insertions(+)
> >
> > diff --git a/tests/intel/xe_fault_injection.c b/tests/intel/xe_fault_injection.c
> > index aa3a3a7c2..8245c558c 100644
> > --- a/tests/intel/xe_fault_injection.c
> > +++ b/tests/intel/xe_fault_injection.c
> > @@ -113,6 +113,22 @@ static void injection_list_add(const char
> function_name[])
> > close(dir);
> > }
> >
> > +static void injection_list_append(const char function_name[])
> > +{
> > + int dir, fd, ret;
> > +
> > + dir = fail_function_open();
> > + igt_assert_lte(0, dir);
> > +
> > + fd = openat(dir, "inject", O_WRONLY | O_APPEND);
> > + igt_assert_lte(0, fd);
> > + ret = write(fd, function_name, strlen(function_name));
> > + igt_assert_lte(0, ret);
> > +
> > + close(fd);
> > + close(dir);
> > +}
> > +
> > static void injection_list_remove(const char function_name[])
> > {
> > int dir;
> > @@ -192,6 +208,18 @@ static void set_retval(const char function_name[],
> long long retval)
> > close(dir);
> > }
> >
> > +static void ignore_fail_dump_in_dmesg(const char function_name[], bool
> enable)
> > +{
> > + if (strstr(function_name, "send_recv")) {
> > + if (enable) {
> > + injection_list_append("xe_is_injection_active");
> > + set_retval("xe_is_injection_active", INJECT_ERRNO);
> > + } else {
> > + injection_list_remove("xe_is_injection_active");
> > + }
> > + }
> > +}
> > +
> > /**
> > * SUBTEST: inject-fault-probe-function-%s
> > * Description: inject an error in the injectable function %arg[1] then
> > @@ -227,11 +255,13 @@ inject_fault_probe(int fd, const char pci_slot[],
> const char function_name[])
> > ignore_dmesg_errors_from_dut(pci_slot);
> > injection_list_add(function_name);
> > set_retval(function_name, INJECT_ERRNO);
> > + ignore_fail_dump_in_dmesg(function_name, true);
> >
> > igt_kmod_bind("xe", pci_slot);
> >
> > err = -errno;
> > injection_list_remove(function_name);
> > + ignore_fail_dump_in_dmesg(function_name, false);
>
> This is better, though I think there's a stigma against giving a single function a
> boolean
> mode switch like this. I don't know where that stigma came from, but it might
> be
> preferrable for you to just break the "false" case out and run it here directly.
>
> Perhaps something like:
>
> """
> static bool ignore_fail_dump_in_dmesg(const char function_name[])
> {
> bool ret = !!strstr(function_name, "send_recv");
>
> if (ret) {
> injection_list_append("xe_is_injection_active");
> set_retval("xe_is_injection_active", INJECT_ERRNO);
> }
> return ret;
> }
> ...
> ignore_dmesg_errors_from_dut(pci_slot);
> injection_list_add(function_name);
> set_retval(function_name, INJECT_ERRNO);
> ignore_dump = ignore_fail_dump_in_dmesg(function_name);
>
> igt_kmod_bind("xe", pci_slot);
>
> err = -errno;
> injection_list_remove(function_name);
> if (ignore_dump)
> injection_list_remove("xe_is_injection_active");
> """
>
> If you do that, you can have my
> Reviewed-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
> -Jonathan Cavitt
>
I do not think, I am going to implement above method as it increases LOC count
and not seeing any benefit with implementing this.
Let me know if you see any benefit with suggested approach.
-Satya.
> > return err;
> > }
> > --
> > 2.43.0
> >
> >
More information about the igt-dev
mailing list