[Intel-gfx] [PATCH v2] drm/i915/cnl: WaPipeControlBefore3DStateSamplePattern
Chris Wilson
chris at chris-wilson.co.uk
Fri Jan 26 23:17:01 UTC 2018
Quoting Rafael Antognolli (2018-01-26 23:05:24)
> This workaround should prevent a bug that can be hit on a context
> restore. To avoid the issue, we must emit a PIPE_CONTROL with CS stall
> (0x7a000004 0x00100000 0x00000000 0x00000000) followed by 12DW's of
> NOOP(0x0) in the indirect context batch buffer, to ensure the engine is
> idle prior to programming 3DSTATE_SAMPLE_PATTERN.
>
> References: HSD#1939868
>
> v2:
> - More descriptive changelog and comments.
> - Fix math when counting dwords.
>
> Signed-off-by: Rafael Antognolli <rafael.antognolli at intel.com>
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> ---
> drivers/gpu/drm/i915/intel_lrc.c | 34 +++++++++++++++++++++++++++++++++-
> 1 file changed, 33 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 51e61b04a555..46c130d106d7 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1366,6 +1366,36 @@ static u32 *gen9_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
> return batch;
> }
>
> +static u32 *gen10_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
> +{
> + int i;
> +
> + /*
> + * WaPipeControlBefore3DStateSamplePattern: cnl
> + *
> + * Ensure the engine is idle prior to programming a
> + * 3DSTATE_SAMPLE_PATTERN during a context restore.
> + */
> + batch = gen8_emit_pipe_control(batch,
> + PIPE_CONTROL_CS_STALL,
> + 0);
> + /*
> + * WaPipeControlBefore3DStateSamplePattern says we need 4 dwords for
> + * the PIPE_CONTROL followed by 12 dwords of 0x0, so 16 dwords in
> + * total. Since gen8_emit_pipe_control() already advances the batch by
> + * 6 dwords, we advance the other 10 here.
> + */
> + for (i = 0; i < 10; i++)
> + *batch++ = MI_NOOP;
If the spec says emit 12 nops, emit 12 nops. Otherwise if we add another
w/a here it may break this magic.
I am not going to ask what the magic nops do, it just gives a 72 cycle
delay in the CS parser. My guess is that they are trying to isolate a
cacheline, in which case they are expecting the pipecontrol to be
cacheline aligned. (I would get that clarified and add an assert to
document such a requirement.) Or they just mean that the indirect ctx is
cacheline aligned and the nops are just the normal padding and nothing
to do with the w/a.
s/10/12/ here and tweak the comment to remove the assumed cacheline
tail, but please see if someone can clarify if the nops are indeed part
of the w/a or just the normal filler.
With s/10/12/,
Acked-by: Chris Wilson <chris at chris-wilson.co.uk>
-Chris
More information about the Intel-gfx
mailing list