[Intel-gfx] [PATCH 1/2] drm/i915/guc: Add error-capture init warnings when needed

Teres Alexis, Alan Previn alan.previn.teres.alexis at intel.com
Mon Oct 17 17:46:24 UTC 2022



On Mon, 2022-10-17 at 09:42 +0100, Tvrtko Ursulin wrote:
> On 15/10/2022 04:59, Alan Previn wrote:
> > If GuC is being used and we initialized GuC-error-capture,
> > we need to be warning if we don't provide an error-capture
> > register list in the firmware ADS, for valid GT engines.
> > A warning makes sense as this would impact debugability
> > without realizing why a reglist wasn't retrieved and reported
> > by GuC.> 
> > However, depending on the platform, we might have certain
> > engines that have a register list for engine instance error state
> > but not for engine class. Thus, add a check only to warn if the
> > register list was non existent vs an empty list (use the
> > empty lists to skip the warning).
> > 
> > Signed-off-by: Alan Previn <alan.previn.teres.alexis at intel.com>
> > ---
> >   .../gpu/drm/i915/gt/uc/intel_guc_capture.c    | 55 ++++++++++++++++++-
> >   1 file changed, 53 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> > index 8f1165146013..290c1e1343dd 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> > @@ -419,6 +419,44 @@ guc_capture_get_device_reglist(struct intel_guc *guc)
> >   	return default_lists;
> >   }
> >   
> > +static const char *
> > +__stringify_type(u32 type)
> > +{
> > +	switch (type) {
> > +	case GUC_CAPTURE_LIST_TYPE_GLOBAL:
> > +		return "Global";
> > +	case GUC_CAPTURE_LIST_TYPE_ENGINE_CLASS:
> > +		return "Class";
> > +	case GUC_CAPTURE_LIST_TYPE_ENGINE_INSTANCE:
> > +		return "Instance";
> > +	default:
> > +		return "unknown";
> > +	}
> > +
> > +	return "";
> 
> What's the point of these returns? Gcc complains about not returning a type from const char * return function?
> 
Sorry i am not sure I saw any complain for Gcc. If you are referring to "" then i can re-rev without the default and
just let the path outside return the unknown. Is that what you are referring to?

> > +}
> > +
> > +static const char *
> > +__stringify_engclass(u32 class)
> > +{
> > +	switch (class) {
> > +	case GUC_RENDER_CLASS:
> > +		return "Render";
> > +	case GUC_VIDEO_CLASS:
> > +		return "Video";
> > +	case GUC_VIDEOENHANCE_CLASS:
> > +		return "VideoEnhance";
> > +	case GUC_BLITTER_CLASS:
> > +		return "Blitter";
> > +	case GUC_COMPUTE_CLASS:
> > +		return "Compute";
> > +	default:
> > +		return "unknown";
> > +	}
> > +
> > +	return "";
> > +}
> > +
> >   static int
> >   guc_capture_list_init(struct intel_guc *guc, u32 owner, u32 type, u32 classid,
> >   		      struct guc_mmio_reg *ptr, u16 num_entries)
> > @@ -487,19 +525,32 @@ intel_guc_capture_getlistsize(struct intel_guc *guc, u32 owner, u32 type, u32 cl
> >   			      size_t *size)
> >   {
> >   	struct intel_guc_state_capture *gc = guc->capture;
> > +	struct drm_i915_private *i915 = guc_to_gt(guc)->i915;
> >   	struct __guc_capture_ads_cache *cache = &gc->ads_cache[owner][type][classid];
> >   	int num_regs;
> >   
> > -	if (!gc->reglists)
> > +	if (!gc->reglists) {
> > +		drm_warn(&i915->drm, "GuC-capture: No reglist on this device\n");
> >   		return -ENODEV;
> > +	}
> >   
> >   	if (cache->is_valid) {
> >   		*size = cache->size;
> >   		return cache->status;
> >   	}
> >   
> > +	if (!guc_capture_get_one_list(gc->reglists, owner, type, classid)) {
> > +		if (owner == GUC_CAPTURE_LIST_INDEX_PF && type == GUC_CAPTURE_LIST_TYPE_GLOBAL)
> > +			drm_warn(&i915->drm, "GuC-capture: missing reglist type-Global\n");
> > +		if (owner == GUC_CAPTURE_LIST_INDEX_PF)
> 
> GUC_CAPTURE_LIST_INDEX_PF could be made once on the enclosing if statement?
Sure - will do.
> 
> Btw what's with the PF and VF (cover letter) references while SRIOV does not exists upstream?
To maintain a scalable code flow across both the ADS code and guc-error-capture code, we do have to skip over this enum
else we'll encounter lots of warnings about missing VF-reglist support (which we cant check for since we dont even have
support - i.e we dont even have a "is not supported" check) but the GuC firmware ADS buffer allocation includes an entry
for VFs so we have to skip over it. This is the cleanest way i can think of without impacting other code areas and also
while adding the ability to warn when its important.
> > +			drm_warn(&i915->drm, "GuC-capture: missing regiist type(%d)-%s : "
> 
> reglist
thanks - will fix
> 
> > +				 "%s(%d)-Engine\n", type, __stringify_type(type),
> > +				 __stringify_engclass(classid), classid);
> 
> One details to consider from Documentation/process/coding-style.rst
> """
> However, never break user-visible strings such as printk messages because that breaks the ability to grep for them.
> """
> 
I totally agree with you but i cant find a way to keep totally grep-able way without creating a whole set of error
strings for the various list-types, list-owners and class-types. However i did ensure the first part of the message
is grep-able : "GuC-capture: missing reglist type". Do you an alternative proposal?

> Also commit message you can aim to wrap at 75 chars as per submitting-patches.rst.
> 
> > +		return -ENODATA;
> 
> Is this a new exit condition or the thing would exit on the !num_regs check below anyway? Just wondering if the series is only about logging changes or there is more to it.
Its no different from previous behavior - and yes its about logging the missing reg lists for hw that needs it as we are
missing this for DG2 we we didn't notice (we did a previous revert to remove these warnings but that time the warnings
was too verbose - even complaining for the intentional empty lists and for VF cases that isnt supported).
> 
> > +	}
> > +
> >   	num_regs = guc_cap_list_num_regs(gc, owner, type, classid);
> > -	if (!num_regs)
> > +	if (!num_regs) /* intentional empty lists can exist depending on hw config */
> >   		return -ENODATA;
> >   
> >   	*size = PAGE_ALIGN((sizeof(struct guc_debug_capture_list)) +
> 
> Regards,
> 
> Tvrtko



More information about the Intel-gfx mailing list