[PATCH v2] drm/xe: Add helper function to inject fault into ct_dead_capture()
K V P, Satyanarayana
satyanarayana.k.v.p at intel.com
Wed May 7 05:13:55 UTC 2025
> -----Original Message-----
> From: Harrison, John C <john.c.harrison at intel.com>
> Sent: Wednesday, May 7, 2025 4:50 AM
> To: K V P, Satyanarayana <satyanarayana.k.v.p at intel.com>; intel-xe at lists.freedesktop.org
> Cc: Chauhan, Aditya <aditya.chauhan at intel.com>; Nikula, Jani
> <jani.nikula at intel.com>
> Subject: Re: [PATCH v2] drm/xe: Add helper function to inject fault into
> ct_dead_capture()
>
> On 4/30/2025 6:17 AM, Satyanarayana K V P wrote:
> > When injecting fault to xe_guc_ct_send_recv() & xe_guc_mmio_send_recv()
> > functions, the CI test systems are going out of space and crashing. To
> > avoid this issue, a new helper function is created and when fault is
> > injected into this xe_should_fail_ct_dead_capture() helper function,
> > ct dead capture is avoided which suppresses ct dumps in the log.
> >
> > Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p at intel.com>
> > Suggested-by: John Harrison <John.C.Harrison at Intel.com>
> > Tested-by: Aditya Chauhan <aditya.chauhan at intel.com>
> >
> > ---
> > Cc: Jani Nikula <jani.nikula at intel.com>
> >
> > V1 -> V2:
> > - Fixed review comments.
> > ---
> > drivers/gpu/drm/xe/xe_guc_ct.c | 21 +++++++++++++++++++++
> > 1 file changed, 21 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
> > index 2447de0ebedf..d6e7a8b80d8c 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_ct.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_ct.c
> > @@ -1770,6 +1770,20 @@ void xe_guc_ct_print(struct xe_guc_ct *ct, struct drm_printer *p, bool want_ctb)
> > }
> >
> > #if IS_ENABLED(CONFIG_DRM_XE_DEBUG)
> > +/**
> > + * xe_should_fail_ct_dead_capture - Helper function to inject fault.
> > + *
> > + * This is a helper function to inject fault into ct_dead_capture().
> > + * As fault is injected using this function, need to make sure that
> > + * the compiler does not optimize and make it as a inline function.
> > + * To prevent compile optimization, "noinline" is added.
> > + */
> > +static noinline int xe_should_fail_ct_dead_capture(void)
> > +{
> > + return 0;
> > +}
> > +ALLOW_ERROR_INJECTION(xe_should_fail_ct_dead_capture, ERRNO);
> > +
> > static void ct_dead_capture(struct xe_guc_ct *ct, struct guc_ctb *ctb, u32 reason_code)
> > {
> > struct xe_guc_log_snapshot *snapshot_log;
> > @@ -1778,6 +1792,13 @@ static void ct_dead_capture(struct xe_guc_ct *ct, struct guc_ctb *ctb, u32 reaso
> > unsigned long flags;
> > bool have_capture;
> >
> > + /*
> > + * Huge dump is getting generated when injecting error for guc CT/MMIO
> > + * functions. So, let us suppress the dump when fault is injected.
> > + */
> > + if (xe_should_fail_ct_dead_capture())
> Is it worth making this a more generic 'is_error_fault_injected()'? Then
> it can be used by random other bits of code if/when necessary. And maybe
I do not think we can have a generic error injection function. If the same generic function is called from multiple places
(possibly in the future), the error may not be injected at the point where we intend it, because the first call to reach the function will take the injected error.
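
The concern above can be sketched in plain user-space C. This is only a simulation under stated assumptions, not kernel code: `injected_failures_left` is a hypothetical stand-in for a fault-injection setup that is armed for a single failure, `is_error_fault_injected()` is the generic helper name suggested in the review, and `site_a()` / `site_b()` are made-up call sites.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the injection state: how many calls
 * should still be forced to fail. Not a real kernel interface.
 */
static int injected_failures_left;

/* Generic helper shared by every call site (name taken from the review). */
static int is_error_fault_injected(void)
{
	if (injected_failures_left > 0) {
		injected_failures_left--;
		return -1;	/* the injected error is consumed here */
	}
	return 0;
}

/* Two unrelated, hypothetical call sites that both use the shared helper. */
static bool site_a_hit;
static bool site_b_hit;

static void site_a(void)
{
	site_a_hit = is_error_fault_injected() != 0;
}

static void site_b(void)
{
	site_b_hit = is_error_fault_injected() != 0;
}

/* Arm one failure intended for site_b, but site_a happens to run first. */
static void run_scenario(void)
{
	injected_failures_left = 1;
	site_a();
	site_b();
}
```

After `run_scenario()`, the single armed failure has been consumed by `site_a()` and `site_b()` sees no error, which is the situation a dedicated helper per injection point avoids.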
> also have an inline/#define version for when
> CONFIG_FUNCTION_ERROR_INJECTION is not defined?
>
Will do, and I will send a new patch.
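
A rough sketch of what such a fallback could look like, written as stand-alone C so it compiles outside the kernel. Assumptions: `__attribute__((noinline))` (GCC/Clang) stands in for the kernel's `noinline`, the `ALLOW_ERROR_INJECTION()` annotation is only mentioned in a comment, and `ct_dead_capture_sketch()` and `capture_ran` are hypothetical names used to show the call site. The actual next revision may look different.

```c
#include <stdbool.h>

#ifdef CONFIG_FUNCTION_ERROR_INJECTION
/*
 * Kept out of line so that an injection hook has a real symbol to
 * target. In the kernel this would also carry
 * ALLOW_ERROR_INJECTION(xe_should_fail_ct_dead_capture, ERRNO).
 */
static __attribute__((noinline)) int xe_should_fail_ct_dead_capture(void)
{
	return 0;
}
#else
/* No injection support: a constant stub the compiler can fold away. */
static inline int xe_should_fail_ct_dead_capture(void)
{
	return 0;
}
#endif

/* Hypothetical flag recording whether the capture body was reached. */
static bool capture_ran;

/* Simplified stand-in for ct_dead_capture(): bail out early on injection. */
static void ct_dead_capture_sketch(void)
{
	capture_ran = false;

	if (xe_should_fail_ct_dead_capture())
		return;

	capture_ran = true;
}
```

With the config symbol undefined, the stub always returns 0, so the capture path runs unchanged and the check costs nothing at run time.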
> John.
>
> > + return;
> > +
> > if (ctb)
> > ctb->info.broken = true;
> >