[Mesa-dev] [PATCH] i965: Implement WaCsStallAtEveryFourthPipecontrol on IVB/BYT.
Daniel Vetter
daniel at ffwll.ch
Wed Nov 12 02:39:28 PST 2014
On Wed, Nov 12, 2014 at 01:33:01AM -0800, Kenneth Graunke wrote:
> According to the documentation, we need to do a CS stall on every fourth
> PIPE_CONTROL command to avoid GPU hangs. The kernel does a CS stall
> between batches, so we only need to count the PIPE_CONTROLs in our batches.
>
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
Yeah, kernel adds the CS stall bit both to the flush right before/after
the batch so this works. The kernel also has a comment so people hopefully
check userspace assumptions when testing this.
Reviewed-by: Daniel Vetter <daniel.vetter at ffwll.ch>
Some useless bikesheds for you to ignore below ;-)
> ---
> src/mesa/drivers/dri/i965/brw_context.h | 2 ++
> src/mesa/drivers/dri/i965/intel_batchbuffer.c | 35 +++++++++++++++++++++++++++
> 2 files changed, 37 insertions(+)
>
> This may help
> https://code.google.com/p/chromium/issues/detail?id=333130
> My theory is that marcheu's patch removes PIPE_CONTROLs that don't have
> CS stalls, which may bring it under 4 in a row, or at least bring ones
> with CS stalls closer together.
>
> It also may help the couple of users who've reported IVB GPU hangs recently.
>
> Or it may be totally useless...
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
> index 656cbe8..27cf92c 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -854,6 +854,8 @@ struct intel_batchbuffer {
> enum brw_gpu_ring ring;
> bool needs_sol_reset;
>
> + uint8_t pipe_controls_since_last_cs_stall;
Since the compile aligns this anyway I tend to not bother with smaller
types. And the fixed-width generally used for abi and hw registers, which
this isn't.
> +
> struct {
> uint16_t used;
> int reloc_count;
> diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> index cd45af6..2255cee 100644
> --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> @@ -81,6 +81,7 @@ intel_batchbuffer_reset(struct brw_context *brw)
> brw->batch.state_batch_offset = brw->batch.bo->size;
> brw->batch.used = 0;
> brw->batch.needs_sol_reset = false;
> + brw->batch.pipe_controls_since_last_cs_stall = 0;
>
> /* We don't know what ring the new batch will be sent to until we see the
> * first BEGIN_BATCH or BEGIN_BATCH_BLT. Mark it as unknown.
> @@ -433,6 +434,36 @@ gen8_add_cs_stall_workaround_bits(uint32_t *flags)
> *flags |= PIPE_CONTROL_STALL_AT_SCOREBOARD;
> }
>
> +/* Implement the WaCsStallAtEveryFourthPipecontrol workaround on IVB, BYT:
> + *
> + * "Every 4th PIPE_CONTROL command, not counting the PIPE_CONTROL with
> + * only read-cache-invalidate bit(s) set, must have a CS_STALL bit set."
> + *
> + * Note that the kernel does CS stalls between batches, so we only need
> + * to count them within a batch.
> + */
> +static uint32_t
> +gen7_cs_stall_every_four_pipe_controls(struct brw_context *brw, uint32_t flags)
> +{
> + if (brw->gen == 7 && brw->is_haswell) {
> + if (flags & PIPE_CONTROL_CS_STALL) {
> + /* If we're doing a CS stall, reset the counter and carry on. */
> + brw->batch.pipe_controls_since_last_cs_stall = 0;
> + return 0;
> + } else {
> + /* Otherwise, we need to count this PIPE_CONTROL. */
> + ++brw->batch.pipe_controls_since_last_cs_stall;
> + }
You can flatten the control flow here a bit by dropping the else and
moving the ++ into the check. Imo that's a fairly common patter to not
obfuscate the code.
> +
> + /* If this is the fourth pipe control without a CS stall, do one now. */
> + if (brw->batch.pipe_controls_since_last_cs_stall == 4) {
> + brw->batch.pipe_controls_since_last_cs_stall = 0;
> + return PIPE_CONTROL_CS_STALL;
> + }
> + }
> + return 0;
> +}
> +
> /**
> * Emit a PIPE_CONTROL with various flushing flags.
> *
> @@ -454,6 +485,8 @@ brw_emit_pipe_control_flush(struct brw_context *brw, uint32_t flags)
> OUT_BATCH(0);
> ADVANCE_BATCH();
> } else if (brw->gen >= 6) {
> + flags |= gen7_cs_stall_every_four_pipe_controls(brw, flags);
> +
> BEGIN_BATCH(5);
> OUT_BATCH(_3DSTATE_PIPE_CONTROL | (5 - 2));
> OUT_BATCH(flags);
> @@ -496,6 +529,8 @@ brw_emit_pipe_control_write(struct brw_context *brw, uint32_t flags,
> OUT_BATCH(imm_upper);
> ADVANCE_BATCH();
> } else if (brw->gen >= 6) {
> + flags |= gen7_cs_stall_every_four_pipe_controls(brw, flags);
> +
> /* PPGTT/GGTT is selected by DW2 bit 2 on Sandybridge, but DW1 bit 24
> * on later platforms. We always use PPGTT on Gen7+.
> */
> --
> 2.1.3
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
More information about the mesa-dev
mailing list