[Mesa-dev] [PATCH] i965: Implement WaCsStallAtEveryFourthPipecontrol on IVB/BYT.

Daniel Vetter daniel at ffwll.ch
Wed Nov 12 02:39:28 PST 2014


On Wed, Nov 12, 2014 at 01:33:01AM -0800, Kenneth Graunke wrote:
> According to the documentation, we need to do a CS stall on every fourth
> PIPE_CONTROL command to avoid GPU hangs.  The kernel does a CS stall
> between batches, so we only need to count the PIPE_CONTROLs in our batches.
> 
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>

Yeah, kernel adds the CS stall bit both to the flush right before/after
the batch so this works. The kernel also has a comment so people hopefully
check userspace assumptions when testing this.

Reviewed-by: Daniel Vetter <daniel.vetter at ffwll.ch>

Some useless bikesheds for you to ignore below ;-)

> ---
>  src/mesa/drivers/dri/i965/brw_context.h       |  2 ++
>  src/mesa/drivers/dri/i965/intel_batchbuffer.c | 35 +++++++++++++++++++++++++++
>  2 files changed, 37 insertions(+)
> 
> This may help
> https://code.google.com/p/chromium/issues/detail?id=333130
> My theory is that marcheu's patch removes PIPE_CONTROLs that don't have
> CS stalls, which may bring it under 4 in a row, or at least bring ones
> with CS stalls closer together.
> 
> It also may help the couple of users who've reported IVB GPU hangs recently.
> 
> Or it may be totally useless...
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
> index 656cbe8..27cf92c 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -854,6 +854,8 @@ struct intel_batchbuffer {
>     enum brw_gpu_ring ring;
>     bool needs_sol_reset;
>  
> +   uint8_t pipe_controls_since_last_cs_stall;

Since the compile aligns this anyway I tend to not bother with smaller
types. And the fixed-width generally used for abi and hw registers, which
this isn't.

> +
>     struct {
>        uint16_t used;
>        int reloc_count;
> diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> index cd45af6..2255cee 100644
> --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> @@ -81,6 +81,7 @@ intel_batchbuffer_reset(struct brw_context *brw)
>     brw->batch.state_batch_offset = brw->batch.bo->size;
>     brw->batch.used = 0;
>     brw->batch.needs_sol_reset = false;
> +   brw->batch.pipe_controls_since_last_cs_stall = 0;
>  
>     /* We don't know what ring the new batch will be sent to until we see the
>      * first BEGIN_BATCH or BEGIN_BATCH_BLT.  Mark it as unknown.
> @@ -433,6 +434,36 @@ gen8_add_cs_stall_workaround_bits(uint32_t *flags)
>        *flags |= PIPE_CONTROL_STALL_AT_SCOREBOARD;
>  }
>  
> +/* Implement the WaCsStallAtEveryFourthPipecontrol workaround on IVB, BYT:
> + *
> + * "Every 4th PIPE_CONTROL command, not counting the PIPE_CONTROL with
> + *  only read-cache-invalidate bit(s) set, must have a CS_STALL bit set."
> + *
> + * Note that the kernel does CS stalls between batches, so we only need
> + * to count them within a batch.
> + */
> +static uint32_t
> +gen7_cs_stall_every_four_pipe_controls(struct brw_context *brw, uint32_t flags)
> +{
> +   if (brw->gen == 7 && brw->is_haswell) {
> +      if (flags & PIPE_CONTROL_CS_STALL) {
> +         /* If we're doing a CS stall, reset the counter and carry on. */
> +         brw->batch.pipe_controls_since_last_cs_stall = 0;
> +         return 0;
> +      } else {
> +         /* Otherwise, we need to count this PIPE_CONTROL. */
> +         ++brw->batch.pipe_controls_since_last_cs_stall;
> +      }

You can flatten the control flow here a bit by dropping the else and
moving the ++ into the check. Imo that's a fairly common patter to not
obfuscate the code.

> +
> +      /* If this is the fourth pipe control without a CS stall, do one now. */
> +      if (brw->batch.pipe_controls_since_last_cs_stall == 4) {
> +         brw->batch.pipe_controls_since_last_cs_stall = 0;
> +         return PIPE_CONTROL_CS_STALL;
> +      }
> +   }
> +   return 0;
> +}
> +
>  /**
>   * Emit a PIPE_CONTROL with various flushing flags.
>   *
> @@ -454,6 +485,8 @@ brw_emit_pipe_control_flush(struct brw_context *brw, uint32_t flags)
>        OUT_BATCH(0);
>        ADVANCE_BATCH();
>     } else if (brw->gen >= 6) {
> +      flags |= gen7_cs_stall_every_four_pipe_controls(brw, flags);
> +
>        BEGIN_BATCH(5);
>        OUT_BATCH(_3DSTATE_PIPE_CONTROL | (5 - 2));
>        OUT_BATCH(flags);
> @@ -496,6 +529,8 @@ brw_emit_pipe_control_write(struct brw_context *brw, uint32_t flags,
>        OUT_BATCH(imm_upper);
>        ADVANCE_BATCH();
>     } else if (brw->gen >= 6) {
> +      flags |= gen7_cs_stall_every_four_pipe_controls(brw, flags);
> +
>        /* PPGTT/GGTT is selected by DW2 bit 2 on Sandybridge, but DW1 bit 24
>         * on later platforms.  We always use PPGTT on Gen7+.
>         */
> -- 
> 2.1.3
> 
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


More information about the mesa-dev mailing list