[Intel-gfx] [PATCH 06/11] drm/i915: Save all MMIO WAs and apply them at a later time

Joonas Lahtinen joonas.lahtinen at linux.intel.com
Wed Nov 8 11:47:10 UTC 2017


On Fri, 2017-11-03 at 15:56 -0700, Oscar Mateo wrote:
> 
> On 10/16/2017 03:34 AM, Joonas Lahtinen wrote:
> > On Fri, 2017-10-13 at 13:49 -0700, Oscar Mateo wrote:
> > > On 10/12/2017 03:35 AM, Joonas Lahtinen wrote:
> > > > On Wed, 2017-10-11 at 11:15 -0700, Oscar Mateo wrote:
> > > > > By doing this, we can dump these workarounds in debugfs for
> > > > > validation (which,
> > > > > at the moment, we are only able to do for the contexts WAs).
> > > > > 
> > > > > v2:
> > > > >     - Wrong macro used for MMIO set bit masked
> > > > >     - Improved naming
> > > > >     - Rebased
> > > > > 
> > > > > Signed-off-by: Oscar Mateo <oscar.mateo at intel.com>
> > > > > Cc: Chris Wilson <chris at chris-wilson.co.uk>
> > > > > Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> > > > 
> > > > <SNIP>
> > > > 
> > > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > > @@ -1960,12 +1960,16 @@ struct i915_wa_reg {
> > > > >    	u32 mask;
> > > > >    };
> > > > >    
> > > > > -#define I915_MAX_WA_REGS 16
> > > > > +#define I915_MAX_CTX_WA_REGS 16
> > > > > +#define I915_MAX_MMIO_WA_REGS 32
> > > > >    
> > > > >    struct i915_workarounds {
> > > > > -	struct i915_wa_reg ctx_wa_reg[I915_MAX_WA_REGS];
> > > > > +	struct i915_wa_reg ctx_wa_reg[I915_MAX_CTX_WA_REGS];
> > > > >    	u32 ctx_wa_count;
> > > > >    
> > > > > +	struct i915_wa_reg mmio_wa_reg[I915_MAX_MMIO_WA_REGS];
> > > > > +	u32 mmio_wa_count;
> > > > > +
> > > > >    	u32 hw_whitelist_count[I915_NUM_ENGINES];
> > > > >    };
> > > > 
> > > > Could we instead consider a constant structure with platform bitmasks?
> > > > If that's not dynamic enough, then a bitmap which is initialized by the
> > > > platform bitmasks as a default. So instead of running code to add to
> > > > list, make it bit more declarative. Pseudo-code;
> > > > 
> > > > 
> > > > struct i915_workaround {
> > > > 	u16 gen_mask;
> > > > 	enum {
> > > > 		I915_WA_CTX,
> > > > 		I915_WA_MMIO,
> > > > 		I915_WA_WHITELIST,
> > > > 	} type;
> > > > 	u32 reg;
> > > > }
> > > > 
> > > > ... elsewhere in .c file
> > > > 
> > > > static const struct i915_workaround i915_workarounds[] = { {
> > > > 	/* WaSomethingSomewhereUMDFoo:skl */
> > > > 	.gen_mask = INTEL_GEN_MASK(X, Y),
> > > > 	.type = I915_WA_CTX,
> > > > 	.reg = ...
> > > > } };
> > > > 
> > > > Regards, Joonas
> > > 
> > > I considered it, but we have some workarounds that are even more dynamic
> > > than that. E.g.:
> > > 
> > > * WaCompressedResourceSamplerPbeMediaNewHashMode depends on
> > > HAS_LLC(dev_priv)
> > > * skl_tune_iz_hashing depends on number of subslices (although maybe
> > > this is not technically a WA?)
> > > * WaGttCachingOffByDefault needs HAS_PAGE_SIZES(dev_priv,
> > > I915_GTT_PAGE_SIZE_2M)
> > > * WaPsrDPRSUnmaskVBlankInSRD is applied for_each_pipe
> > 
> > Could be multiple Wa lines each with HAS_PIPE() condition.
> > 
> > > * Wa #1181 needs HAS_PCH_CNP(dev_priv)
> > > * ...
> > > 
> > > We even have a WA (WaTempDisableDOPClkGating) where the new design is
> > > not dynamic enough :(
> > 
> > I was thinking the array would cover the simple register writes, I
> > think most of the detection problems could be covered by a simple probe
> > function. One could consider caching the probing result to a bitmap.
> > 
> > > I guess we could add a callback pointer to the table for those WAs that
> > > need extra information. Maybe even a "pre" and a "post" callback
> > > pointers (to cover that pesky WaTempDisableDOPClkGating).
> > 
> > You'd still have to keep some state between pre and post. Maybe just
> > have I915_WA_FUNC and then a callback to apply the ones not fitting the
> > "detection" + "simple register write" scheme.
> > 
> 
> It took me some time, but I found the flaw in this design: I cannot use 
> a callback to apply the ones not fitting the"detection" + "simple 
> register write" scheme as you mention, because then debugfs will not 
> know about those (which was the point of this whole exercise).
> 
> I tried something like this:
> 
> typedef bool (* wa_pre_hook_func)(struct drm_i915_private *dev_priv,
>                    const struct i915_wa_reg *wa,
>                    u32 *mask, u32 *value, u32 *data);
> typedef void (* wa_post_hook_func)(struct drm_i915_private *dev_priv,
>                     const struct i915_wa_reg *wa,
>                     u32 data);
> 
> In case mask/value need to be recalculated (e.g. skl_tune_iz_hashing or 
> WaGttCachingOffByDefault) but then I need to call the pre_hook on 
> debugfs, which I might not want to do in all cases (see 
> WaProgramL3SqcReg1Default).
> I can special-case the problematic WAs, or even left them out of the 
> array, but somehow that seems worse than what we already had...
> 
> Thoughts?

I'm not sure I see the issue. If looking at gen8_set_l3sqc_credits

You'd have the specialized apply() callback which sets the
GEN8_L3SQCREG1 register respecting the WaTempDisableDOPClkGating.

When you have a WA for applying a WA, I don't see how we can completely
avoid the specialized functions.

For debugfs, wouldn't you dump the value of what GEN8_L3SQCREG1 is for
a specific dev_priv? If there's more than that, I guess we can't but
store that state in dev_priv. And have an optional hook function to
fill in whatever depends on runtime and we want displayed in debugfs?

I think it might be best to start with moving all the WAs to a file,
and then continue looking at how we can best present them.

Regards, Joonas

> 
> > > If this is where we want to go, I can write a patch, but I believe it
> > > would be better to land this first (code review is critical for these
> > > kind of changes, and it is easier to first review that all WAs make it
> > > to i915_workarounds.c correctly, and then review that they are all
> > > transformed to a static table correctly).
> > 
> > First collecting the WA code to same source file is a good idea if that
> > can be made conveniently, but I would not transform them to arrays or
> > some other intermediate form like this patch proposes. Instead after
> > the initial code motion, convert straight to the agreed format.
> > 
> > This is a good clarification to the WAs, so I like the general idea.
> > 
> > Regards, Joonas
> 
> 
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation


More information about the Intel-gfx mailing list