[Intel-gfx] [RFC PATCH 04/20] drm/i915: Transform context WAs into static tables
Oscar Mateo
oscar.mateo at intel.com
Mon Nov 6 18:54:04 UTC 2017
On 11/06/2017 03:59 AM, Joonas Lahtinen wrote:
> On Fri, 2017-11-03 at 11:09 -0700, Oscar Mateo wrote:
>> This is for WAs that need to touch registers that get saved/restored
>> together with the logical context. The idea is that WAs are "pretty"
>> static, so a table is more declarative than a programmatic approach.
>> Note however that some amount of caching is needed for those things
>> that are dynamic (e.g. things that need some calculation, or have
>> criteria different than the more obvious GEN + stepping).
>>
>> Also, this makes very explicit which WAs live in the context.
>>
>> Suggested-by: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
>> Signed-off-by: Oscar Mateo <oscar.mateo at intel.com>
>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>> Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> <SNIP>
>
>> +struct i915_wa_reg;
>> +
>> +typedef bool (* wa_pre_hook_func)(struct drm_i915_private *dev_priv,
>> + struct i915_wa_reg *wa);
>> +typedef void (* wa_post_hook_func)(struct drm_i915_private *dev_priv,
>> + struct i915_wa_reg *wa);
> To avoid carrying any variables over, how about just apply() hook?
> Also, you don't have to have "_hook" going there, it's tak
>
Not all WAs are applied in the same way: ctx-style workarounds are
emitted as LRI commands to the ring. Do you treat those differently?
>> struct i915_wa_reg {
>> + const char *name;
> We may want some Kconfig option for skipping these.
Sure. But we should first decide whether we want to store this at all:
what do we expect to use it for? Is it worth it?
>> + enum wa_type {
>> + I915_WA_TYPE_CONTEXT = 0,
>> + I915_WA_TYPE_GT,
>> + I915_WA_TYPE_DISPLAY,
>> + I915_WA_TYPE_WHITELIST
>> + } type;
>> +
> Any specific reason not to have the gen here too? Then you can have one
> big table, instead of tables of tables. Then the numeric code of a WA
> (position in that table) would be equally identifying it compared to
> the WA name (which is nice to have information, so config time opt-in).
Such a "big table" would be quite big, indeed. And we know we want to
apply the workarounds from at least four different places, so looping
through the table each and every time to find the relevant WAs seems
like a waste. Also, in some places we would have to loop more than once
(to know the number of WAs to apply before we can reserve space in the
ring for ctx-style WAs, for example).
I could also go for 4 slightly smaller tables (one per type of WA) but
then there is another problem to solve: how do you record WAs that apply
for all revisions of one GEN, but a smaller number of revisions of
another? (e.g. WaDisableFenceDestinationToSLM applies to all BDW
steppings but only KBL A0).
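A minimal sketch of the per-platform layout argued for above, assuming a
hypothetical wa_entry struct (not the patch's i915_wa_reg) and
illustrative stepping numbers. Each platform carries its own
since/until range for the same WA name, and a counting pass only walks
the small table that is relevant:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative revision bounds; REV_MIN..REV_MAX plays the role of ALL_REVS. */
#define REV_MIN 0x00
#define REV_MAX 0xff

/* Hypothetical entry: only the fields needed to show stepping ranges. */
struct wa_entry {
	const char *name;
	uint8_t since;	/* first revision the WA applies to (inclusive) */
	uint8_t until;	/* last revision the WA applies to (inclusive) */
};

/* Same WA name, different range per platform. */
static const struct wa_entry bdw_ctx_was[] = {
	{ "WaDisableFenceDestinationToSLM", REV_MIN, REV_MAX },
};

static const struct wa_entry kbl_ctx_was[] = {
	{ "WaDisableFenceDestinationToSLM", 0x0, 0x0 },	/* A0 only (assumed revid 0) */
};

/* Counting pass: how many entries of one platform table match this revision. */
static size_t wa_count_for_rev(const struct wa_entry *tbl, size_t len,
			       uint8_t rev)
{
	size_t i, n = 0;

	assert(tbl != NULL);
	for (i = 0; i < len; i++)
		if (rev >= tbl[i].since && rev <= tbl[i].until)
			n++;
	return n;
}
```

With one big table, the same count would require an extra platform field
per entry and a walk over every platform's WAs on each call.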
>> + u8 since;
>> + u8 until;
> Most seem to have ALL_REVS, so this could be after the coarse-grained
> gen-check in the apply function.
So every single WA that applies to specific REVS gets an "apply"
function? That looks like a lot of functions (I count 25 WAs that only
apply to some steppings already). Or are you simply saying here that I
check the GEN before checking the stepping (which is the only order that
makes sense anyway)?
>> +
>> i915_reg_t addr;
>> - u32 value;
>> - /* bitmask representing WA bits */
>> u32 mask;
>> + u32 value;
>> + bool is_masked_reg;
> I'd hide this detail into the apply function.
I see. But if you don't store the mask: what do you output in debugfs?
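To make the question concrete, here is a small sketch of the masked
register encoding used by the macros in the patch (upper 16 bits select
which bits to write, lower 16 bits carry the value) together with a
readback comparison of the kind a debugfs dump could perform. The helper
names are hypothetical and the comparison is an assumption about what
such a dump would check, not a copy of existing i915 code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Masked write: the high half enables the bits, the low half is the value. */
static inline uint32_t masked_field(uint32_t mask, uint32_t value)
{
	assert((mask >> 16) == 0 && (value >> 16) == 0);
	return (mask << 16) | value;
}

static inline uint32_t masked_bit_enable(uint32_t bit)
{
	return masked_field(bit, bit);
}

static inline uint32_t masked_bit_disable(uint32_t bit)
{
	return masked_field(bit, 0);
}

/*
 * Readback check: only the bits covered by the WA mask are compared.
 * Without a stored mask there is nothing to restrict the comparison to.
 */
static inline bool wa_readback_ok(uint32_t read, uint32_t mask,
				  uint32_t expected)
{
	return (read & mask) == (expected & mask);
}
```

For a bit such as (1 << 14), masked_bit_enable() yields 0x40004000, while
the check against the register readback still needs the plain 0x4000
mask.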
>
>> +
>> + wa_pre_hook_func pre_hook;
>> + wa_post_hook_func post_hook;
> bool (*apply)(const struct i915_wa *wa,
> struct drm_i915_private *dev_priv);
>
>> + u32 hook_data;
>> + bool applied;
> The big point would be to make this into const, so "applied" would
> defeat that.
Yeah, I realized. Keeping a separate bitmask of which WAs have been
applied is not a big deal, but then I became aware that there are many
more things that would need to be cached. For example, some WAs require
computing the actual value you write into their register. What do you
do with those? (Remember that you still want to print the expected value
in debugfs for these.)
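One possible shape for that caching, sketched under the assumption that
the table itself stays const and WAs are identified by their index in
it. The wa_state struct, WA_MAX and both helpers are hypothetical names
introduced only for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WA_MAX 64	/* assumed upper bound on entries per table */

/* Mutable per-device state kept next to a read-only WA table. */
struct wa_state {
	uint64_t applied;		/* bit N set: table entry N was applied */
	uint32_t value[WA_MAX];		/* value actually written, for debugfs */
};

/* Record that entry idx was applied and which value was written. */
static void wa_record(struct wa_state *st, unsigned int idx, uint32_t value)
{
	assert(idx < WA_MAX);
	st->applied |= UINT64_C(1) << idx;
	st->value[idx] = value;
}

static bool wa_is_applied(const struct wa_state *st, unsigned int idx)
{
	assert(idx < WA_MAX);
	return (st->applied & (UINT64_C(1) << idx)) != 0;
}
```

The cost is one extra u32 per entry per device, which is exactly the
kind of additional cached data described above.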
> <SNIP>
>
>> +#define MASK(mask, value) ((mask) << 16 | (value))
>> +#define MASK_ENABLE(x) (MASK((x), (x)))
>> +#define MASK_DISABLE(x) (MASK((x), 0))
>>
>> -#define WA_REG(addr, mask, val) do { \
>> - const int r = wa_add(dev_priv, (addr), (mask), (val)); \
>> - if (r) \
>> - return r; \
>> - } while (0)
>> +#define SET_BIT_MASKED(m) \
>> + .mask = (m), \
>> + .value = MASK_ENABLE(m), \
>> + .is_masked_reg = true
>>
>> -#define WA_SET_BIT_MASKED(addr, mask) \
>> - WA_REG(addr, (mask), _MASKED_BIT_ENABLE(mask))
>> +#define CLEAR_BIT_MASKED(m) \
>> + .mask = (m), \
>> + .value = MASK_DISABLE(m), \
>> + .is_masked_reg = true
>>
>> -#define WA_CLR_BIT_MASKED(addr, mask) \
>> - WA_REG(addr, (mask), _MASKED_BIT_DISABLE(mask))
>> +#define SET_FIELD_MASKED(m, v) \
>> + .mask = (m), \
>> + .value = MASK(m, v), \
>> + .is_masked_reg = true
> Let's try to have the struct i915_wa as small as possible, so this could
> be calculated in the apply function.
>
> So, avoiding the macros this would indeed become rather declarative;
>
> {
> WA_NAME("WaDisableAsyncFlipPerfMode")
> .gen = ...,
> .reg = MI_MODE,
> .value = ASYNC_FLIP_PERF_DISABLE,
> .apply = set_bit_masked,
> },
> Or, we could also have;
>
> static const struct i915_wa WaDisableAsyncFlipPerfMode = {
> .gen = ...,
> .reg = MI_MODE,
> .value = ASYNC_FLIP_PERF_DISABLE,
> .apply = set_bit_masked,
> };
>
> And then one array of those.
>
> WA(WaDisableAsyncFlipPerfMode),
This is the list of problems we need to solve before we can go forward
with this design:
- What to do with WAs that don't know a priori what .value should be,
because it gets computed in places like skl_tune_iz_hashing or
use_gtt_cache? (yes, computing in the apply function is the immediate
answer, but then... how do you output that in debugfs?).
- What to do with context-style WAs, which are emitted instead of
applied, as I mentioned above?
- What to do with whitelist-style functions, where you need to access
the .reg field of i915_reg_t to know the .value? Also, the .reg depends
on the engine (although I guess you can always statically codify that in
the table and apply the whitelist WAs later, once all the engines are up).
- You are not storing .since/.until. Does that mean every WA that
applies to only some steppings gets a custom apply function?
- If you don't store the computed mask anywhere, what do you output in
debugfs? (That is the real improvement we want to achieve.)
- Something to be careful about: some WAs are named the same, but their
reg/value is different (because the register has changed in one
particular GEN or whatever). The solution could be a modifier to the
name (WaSomething_bdw_chv and WaSomething_skl) but this could be a
source of errors.
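To illustrate the "emitted instead of applied" point from the list
above: context-style WAs end up as a single MI_LOAD_REGISTER_IMM packet
in the ring, so the number of entries must be known before any space is
reserved. The sketch below writes into a plain buffer; struct ctx_wa,
emit_ctx_was() and the buffer handling are hypothetical stand-ins for
the driver's ring interface, while the command encoding follows the
i915 MI_INSTR/MI_LOAD_REGISTER_IMM definitions as I understand them:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MI_INSTR(opcode, flags) \
	(((uint32_t)(opcode) << 23) | (uint32_t)(flags))
#define MI_NOOP			MI_INSTR(0, 0)
#define MI_LOAD_REGISTER_IMM(n)	MI_INSTR(0x22, 2 * (n) - 1)

/* Hypothetical resolved entry: register offset and final value to load. */
struct ctx_wa {
	uint32_t offset;
	uint32_t value;
};

/*
 * Emit all entries as one LRI packet followed by a NOOP.
 * Returns the number of dwords written, or 0 if nothing was emitted.
 */
static size_t emit_ctx_was(uint32_t *cs, size_t cs_len,
			   const struct ctx_wa *was, size_t count)
{
	size_t need = 2 * count + 2;	/* header + (offset, value) pairs + NOOP */
	size_t i, n = 0;

	assert(cs != NULL && was != NULL);
	if (count == 0 || cs_len < need)
		return 0;

	cs[n++] = MI_LOAD_REGISTER_IMM(count);
	for (i = 0; i < count; i++) {
		cs[n++] = was[i].offset;
		cs[n++] = was[i].value;
	}
	cs[n++] = MI_NOOP;
	return n;
}
```

A generic apply() hook that writes one register at a time does not map
onto this directly, because the packet header depends on the total
count.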
> Then you could at compile time decide if you stringify and store the
> name. But that'd be more const data than necessary (pointers to
> structs, instead of an array of structs).
>
> Regards, Joonas
One more thing: I still urge you to reconsider merging what we already
have, and doing these improvements (once we agree on a design) later on.
The reason is that the sooner we get a list of all WAs in debugfs, the
better (it can be used later on to verify any further improvements we
make).
Thanks for the review,
Oscar