[Intel-gfx] [PATCH 5/6] drm/i915: Don't use scratch in WA batch.

Thu Sep 26 13:31:36 UTC 2019

Quoting Michał Winiarski (2019-09-26 11:06:34)
> We're currently doing one workaround where we're using scratch as a
> temporary storage place, while we're overwriting the value of one
> register with some known constant value in order to perform a
> workaround.
> While we could just do similar thing with CS_GPR register
> and MI_LOAD_REGISTER_REG instead of scratch, since we would use CS_GPR
> anyways, let's just drop the constant values and do the bitops using
> MI_MATH.
> 
> Signed-off-by: Michał Winiarski <michal.winiarski at intel.com>
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_engine.h   | 53 ++++++++++++++++++++++++
>  drivers/gpu/drm/i915/gt/intel_gt_types.h |  3 --
>  drivers/gpu/drm/i915/gt/intel_lrc.c      | 27 +++---------
>  3 files changed, 58 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
> index d3c6993f4f46..2dfe0b23af1d 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine.h
> @@ -400,6 +400,59 @@ gen8_emit_ggtt_write(u32 *cs, u32 value, u32 gtt_offset, u32 flags)
>         return cs;
>  }
>  
> +/*
> + * We're using CS_GPR registers for the MI_MATH ops.
> + * Note that user contexts are free to use those registers, which means that we
> + * should only use this this function during context initialization, before
> + * context restore (WA batch) or inside i915-owned contexts.
> + */
> +static inline u32 *
> +gen8_emit_bitop_reg_mask(struct intel_engine_cs *engine,
> +                        u32 *cs, u32 op, i915_reg_t reg, u32 mask)

Useful, we had discussed using MI_MATH to perform rmw for our
workarounds.

> +{
> +       const u32 base = engine->mmio_base;
> +
> +       GEM_BUG_ON(op != MI_MATH_OR && op != MI_MATH_AND);
> +
> +       *cs++ = MI_LOAD_REGISTER_REG;
> +       *cs++ = i915_mmio_reg_offset(reg);
> +       *cs++ = i915_mmio_reg_offset(GEN8_RING_CS_GPR(base, 0));
> +       *cs++ = MI_NOOP;
> +
> +       *cs++ = MI_LOAD_REGISTER_IMM(1);
> +       *cs++ = i915_mmio_reg_offset(GEN8_RING_CS_GPR(base, 1));
> +       *cs++ = mask;
> +       *cs++ = MI_NOOP;
> +
> +       *cs++ = MI_MATH(4);
> +       *cs++ = MI_MATH_LOAD(MI_MATH_REG_SRCA, MI_MATH_REG(0));
> +       *cs++ = MI_MATH_LOAD(MI_MATH_REG_SRCB, MI_MATH_REG(1));
> +       *cs++ = op;
> +       *cs++ = MI_MATH_STORE(MI_MATH_REG(0), MI_MATH_REG_ACCU);
> +       *cs++ = MI_NOOP;
> +
> +       *cs++ = MI_LOAD_REGISTER_REG;
> +       *cs++ = i915_mmio_reg_offset(GEN8_RING_CS_GPR(base, 0));
> +       *cs++ = i915_mmio_reg_offset(reg);
> +       *cs++ = MI_NOOP;

4 nicely paired redundant MI_NOOP. Can be removed, leaving the packet
still oword aligned.
-Chris