[Intel-xe] [PATCH 7/7] drm/xe/guc: Port workarounds to OOB infra

Lucas De Marchi lucas.demarchi at intel.com
Thu May 4 21:56:39 UTC 2023


On Thu, May 04, 2023 at 12:37:05PM -0700, Jose Souza wrote:
>On Thu, 2023-05-04 at 12:30 -0700, Lucas De Marchi wrote:
>> On Thu, May 04, 2023 at 03:30:48PM +0000, Jose Souza wrote:
>> > On Thu, 2023-05-04 at 00:32 -0700, Lucas De Marchi wrote:
>> > > Let xe_guc.c start using XE_WA() for workarounds. One workaround
>> > > couldn't be converted due to no support for function hook checks.
>> > >
>> > > Signed-off-by: Lucas De Marchi <lucas.demarchi at intel.com>
>> > > ---
>> > >  drivers/gpu/drm/xe/xe_guc.c | 40 +++++++++++++------------------------
>> > >  1 file changed, 14 insertions(+), 26 deletions(-)
>> > >
>> > > diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
>> > > index f7d32b744247..eea5ed015282 100644
>> > > --- a/drivers/gpu/drm/xe/xe_guc.c
>> > > +++ b/drivers/gpu/drm/xe/xe_guc.c
>> > > @@ -19,7 +19,9 @@
>> > >  #include "xe_guc_submit.h"
>> > >  #include "xe_mmio.h"
>> > >  #include "xe_platform_types.h"
>> > > +#include "xe_rtp.h"
>> > >  #include "xe_uc_fw.h"
>> > > +#include "xe_wa.h"
>> > >  #include "xe_wopcm.h"
>> > >
>> > >  #define MEDIA_GUC_HOST_INTERRUPT        XE_REG(0x190304)
>> > > @@ -136,29 +138,22 @@ static u32 guc_ctl_wa_flags(struct xe_guc *guc)
>> > >  	struct xe_gt *gt = guc_to_gt(guc);
>> > >  	u32 flags = 0;
>> > >
>> > > -	/* Wa_22012773006:gen11,gen12 < XeHP */
>> > > -	if (GRAPHICS_VER(xe) >= 11 &&
>> > > -	    GRAPHICS_VERx100(xe) < 1250)
>> > > +	if (XE_WA(xe, "22012773006", GRAPHICS_VERSION_RANGE(1100, 1250)))
>> > >  		flags |= GUC_WA_POLLCS;
>> >
>> > Just a suggestion but would be better to have the workaround rule defined somewhere else and here only have 'if (XE_WA(xe, "22012773006"))'.
>> >
>> >
>> > Mesa now has something like that.
>> > We have a Python script that generates a JSON file with workaround data from the workaround database API.
>> > During build time, another Python script generates the intel_wa.c file using the information in JSON, setting a bit for each workaround that applies
>> > to each platform.
>> > In runtime it get the bit field with the workarounds that applies to running platform on given stepping.
>>
>> that is one of the alternatives I shared in the RFC. There are
>> some issues with such approach though:
>>
>> 1) The condition is in a totally different place tha have to be checked
>>     to follow the code flow.
>>
>>     The upside is that if the same condition needs to be checked again in
>>     multiple places for these OOB workarounds, the condition doesn't
>>     need to be repeated
>>
>> 2) Not possible to encode conditions that are not platform-centric or
>> that needs to be decided in runtime (e.g. if an engine is fused off or
>> not)
>>
>> > In code Mesa we have 'if (intel_needs_workaround(devinfo, 14015055625))', Xe kmd would probably need additional checks like 'if (engine == BLIT &&
>> > XE_WA(xe, "xxxxxx"))'.
>>
>> yeah... that maybe solves the issue (2) above. We'd need to leave the
>> non-platform checks to each callers. And maybe follow the same pattern
>> for the gt/engine/lrc workarounds.
>>
>> >
>> > This solve the problem we have were workarounds are left enabled in platforms that already have a HW fix for it.
>>
>> not sure how that solves this issue though. We mark the stepping in
>> which the workaround are supposedly fixed.
>
>Mesa problems was with permanent workarounds, a example:
>
>Some workaround is required by all gen12.0 and dg2 and was implemented with: 'if (gt_ver >= 12)' but then it was fixed in MTL.
>It is pretty easy to forget to update it.


but then the solution is not about where the check is, i.e. in the
caller vs a centralized place. The solution is to generate the
version range or platform from the spec.  If we were to move to a
centralized place without generating that automatically we'd still have
the same issue.... which goes back to the issue of having code
generation in the kernel.

 From a quick look at the mesa impl, what I don't quite understand after
grepping for intel_needs_workaround() and looking a the .cpp/.h
generation:  how do you check if the workaround is implemented/active?
Because 1) you generated a .json from the database, 2) you generated
the condition to enable the workaround for *all* platforms; 3) Now you
can't rely anymore on having that definition to decide it's
implemented/active.  Do you track callers of intel_needs_workaround?
Something else? This seems a little short based on the size
of mesa_defs.json. Or is it just because the past platforms weren't
converted to this new infra?

git grep -h intel_needs_workaround | sed -n 's/.*intel_needs_workaround(.*,\s*\([0-9]*\)).*/\1/p' | sort -u
14010017096
14012437816
14012688258
14014063774
14015297576
14015360517
14015590813
14017341140
16013063087
22012575642
22012785325
22013689345

Also Cc'ing here some other  people involved with the mesa impl to check
if we can improve kernel side.

thanks
Lucas De Marchi

>
>>
>> >
>> > Not sure if such think would be acceptable in Linux, just sharing what Mesa have done.
>>
>> Generating code to be built is done in a few places in the kernel
>> I already suggested doing that for the register definitions, but
>> there are always push back.  I will take a look at this again.
>>
>> Thanks
>> Lucas De Marchi
>>
>> >
>> > >
>> > > -	/* Wa_16011759253 */
>> > > -	/* Wa_22011383443 */
>> > > -	if (IS_SUBPLATFORM_STEP(xe, XE_DG2, XE_SUBPLATFORM_DG2_G10, STEP_A0, STEP_B0))
>> > > +	if (XE_WA(xe, "16011759253, 22011383443",
>> > > +		  SUBPLATFORM(DG2, G10), STEP(A0, B0)))
>> > >  		flags |= GUC_WA_GAM_CREDITS;
>> > >
>> > > -	/* Wa_14014475959 */
>> > > -	if (IS_PLATFORM_STEP(xe, XE_METEORLAKE, STEP_A0, STEP_B0) ||
>> > > -	    xe->info.platform == XE_DG2)
>> > > +	if (XE_WA(xe, "14014475959[mtl]", PLATFORM(METEORLAKE), STEP(A0, B0)) ||
>> > > +	    XE_WA(xe, "14014475959[dg2]", PLATFORM(DG2)))
>> > >  		flags |= GUC_WA_HOLD_CCS_SWITCHOUT;
>> > >
>> > >  	/*
>> > > -	 * Wa_14012197797
>> > > -	 * Wa_22011391025
>> > > -	 *
>> > >  	 * The same WA bit is used for both and 22011391025 is applicable to
>> > >  	 * all DG2.
>> > >  	 */
>> > > -	if (xe->info.platform == XE_DG2)
>> > > +	if (XE_WA(xe, "14012197797, 22011391025", PLATFORM(DG2)))
>> > >  		flags |= GUC_WA_DUAL_QUEUE;
>> > >
>> > >  	/*
>> > > @@ -169,27 +164,20 @@ static u32 guc_ctl_wa_flags(struct xe_guc *guc)
>> > >  	if (GRAPHICS_VERx100(xe) < 1270)
>> > >  		flags |= GUC_WA_PRE_PARSER;
>> > >
>> > > -	/* Wa_16011777198 */
>> > > -	if (IS_SUBPLATFORM_STEP(xe, XE_DG2, XE_SUBPLATFORM_DG2_G10, STEP_A0, STEP_C0) ||
>> > > -	    IS_SUBPLATFORM_STEP(xe, XE_DG2, XE_SUBPLATFORM_DG2_G11, STEP_A0,
>> > > -				STEP_B0))
>> > > +	if (XE_WA(xe, "16011777198[dg2_g10]", SUBPLATFORM(DG2, G10), STEP(A0, C0)) ||
>> > > +	    XE_WA(xe, "16011777198[dg2_g11]", SUBPLATFORM(DG2, G11), STEP(A0, B0)))
>> > >  		flags |= GUC_WA_RCS_RESET_BEFORE_RC6;
>> > >
>> > > -	/*
>> > > -	 * Wa_22012727170
>> > > -	 * Wa_22012727685
>> > > -	 */
>> > > -	if (IS_SUBPLATFORM_STEP(xe, XE_DG2, XE_SUBPLATFORM_DG2_G10, STEP_A0, STEP_C0) ||
>> > > -	    IS_SUBPLATFORM_STEP(xe, XE_DG2, XE_SUBPLATFORM_DG2_G11, STEP_A0,
>> > > -				STEP_FOREVER))
>> > > +	if (XE_WA(xe, "22012727170, 22012727685 [dg2_g10]", SUBPLATFORM(DG2, G10), STEP(A0, C0)) ||
>> > > +	    XE_WA(xe, "22012727170, 22012727685 [dg2_g11]", SUBPLATFORM(DG2, G11), STEP(A0, FOREVER)))
>> > >  		flags |= GUC_WA_CONTEXT_ISOLATION;
>> > >
>> > >  	/* Wa_16015675438, Wa_18020744125 */
>> > > +	/* FIXME: currently there is no support for gt checks in XE_WA() */
>> > >  	if (!xe_hw_engine_mask_per_class(gt, XE_ENGINE_CLASS_RENDER))
>> > >  		flags |= GUC_WA_RCS_REGS_IN_CCS_REGS_LIST;
>> > >
>> > > -	/* Wa_1509372804 */
>> > > -	if (IS_PLATFORM_STEP(xe, XE_PVC, STEP_B0, STEP_C0))
>> > > +	if (XE_WA(xe, "1509372804", PLATFORM(PVC), STEP(B0, C0)))
>> > >  		flags |= GUC_WA_RENDER_RST_RC6_EXIT;
>> > >
>> > >  	return flags;
>> >
>


More information about the Intel-xe mailing list