[Mesa-dev] [PATCH] i965: Implement WaCsStallAtEveryFourthPipecontrol on IVB/BYT.

Ian Romanick idr at freedesktop.org
Wed Nov 12 03:34:12 PST 2014


On 11/12/2014 10:39 AM, Daniel Vetter wrote:
> On Wed, Nov 12, 2014 at 01:33:01AM -0800, Kenneth Graunke wrote:
>> According to the documentation, we need to do a CS stall on every fourth
>> PIPE_CONTROL command to avoid GPU hangs.  The kernel does a CS stall
>> between batches, so we only need to count the PIPE_CONTROLs in our batches.
>>
>> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> 
> Yeah, kernel adds the CS stall bit both to the flush right before/after
> the batch so this works. The kernel also has a comment so people hopefully
> check userspace assumptions when testing this.
> 
> Reviewed-by: Daniel Vetter <daniel.vetter at ffwll.ch>
> 
> Some useless bikesheds for you to ignore below ;-)
> 
>> ---
>>  src/mesa/drivers/dri/i965/brw_context.h       |  2 ++
>>  src/mesa/drivers/dri/i965/intel_batchbuffer.c | 35 +++++++++++++++++++++++++++
>>  2 files changed, 37 insertions(+)
>>
>> This may help
>> https://code.google.com/p/chromium/issues/detail?id=333130
>> My theory is that marcheu's patch removes PIPE_CONTROLs that don't have
>> CS stalls, which may bring it under 4 in a row, or at least bring ones
>> with CS stalls closer together.
>>
>> It also may help the couple of users who've reported IVB GPU hangs recently.
>>
>> Or it may be totally useless...
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
>> index 656cbe8..27cf92c 100644
>> --- a/src/mesa/drivers/dri/i965/brw_context.h
>> +++ b/src/mesa/drivers/dri/i965/brw_context.h
>> @@ -854,6 +854,8 @@ struct intel_batchbuffer {
>>     enum brw_gpu_ring ring;
>>     bool needs_sol_reset;
>>  
>> +   uint8_t pipe_controls_since_last_cs_stall;
> 
> Since the compile aligns this anyway I tend to not bother with smaller
> types. And the fixed-width generally used for abi and hw registers, which
> this isn't.

I think this will get stored in the padding after the bool.  Making
this an int will cause extra padding.  I think that's why Ken chose to
put it here.

>> +
>>     struct {
>>        uint16_t used;
>>        int reloc_count;
>> diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
>> index cd45af6..2255cee 100644
>> --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
>> +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
>> @@ -81,6 +81,7 @@ intel_batchbuffer_reset(struct brw_context *brw)
>>     brw->batch.state_batch_offset = brw->batch.bo->size;
>>     brw->batch.used = 0;
>>     brw->batch.needs_sol_reset = false;
>> +   brw->batch.pipe_controls_since_last_cs_stall = 0;
>>  
>>     /* We don't know what ring the new batch will be sent to until we see the
>>      * first BEGIN_BATCH or BEGIN_BATCH_BLT.  Mark it as unknown.
>> @@ -433,6 +434,36 @@ gen8_add_cs_stall_workaround_bits(uint32_t *flags)
>>        *flags |= PIPE_CONTROL_STALL_AT_SCOREBOARD;
>>  }
>>  
>> +/* Implement the WaCsStallAtEveryFourthPipecontrol workaround on IVB, BYT:
>> + *
>> + * "Every 4th PIPE_CONTROL command, not counting the PIPE_CONTROL with
>> + *  only read-cache-invalidate bit(s) set, must have a CS_STALL bit set."
>> + *
>> + * Note that the kernel does CS stalls between batches, so we only need
>> + * to count them within a batch.
>> + */
>> +static uint32_t
>> +gen7_cs_stall_every_four_pipe_controls(struct brw_context *brw, uint32_t flags)
>> +{
>> +   if (brw->gen == 7 && brw->is_haswell) {
>> +      if (flags & PIPE_CONTROL_CS_STALL) {
>> +         /* If we're doing a CS stall, reset the counter and carry on. */
>> +         brw->batch.pipe_controls_since_last_cs_stall = 0;
>> +         return 0;
>> +      } else {
>> +         /* Otherwise, we need to count this PIPE_CONTROL. */
>> +         ++brw->batch.pipe_controls_since_last_cs_stall;
>> +      }
> 
> You can flatten the control flow here a bit by dropping the else and
> moving the ++ into the check. Imo that's a fairly common patter to not
> obfuscate the code.

Yeah, I think that is slightly better... at least making it

    /* Count this PIPE_CONTROL. */
    brw->batch.pipe_controls_since_last_cs_stall++;

    /* If this is the fourth pipe control without a CS stall, do one now. */
    if (brw->batch.pipe_controls_since_last_cs_stall == 4) {
        ...
    }

>> +
>> +      /* If this is the fourth pipe control without a CS stall, do one now. */
>> +      if (brw->batch.pipe_controls_since_last_cs_stall == 4) {
>> +         brw->batch.pipe_controls_since_last_cs_stall = 0;
>> +         return PIPE_CONTROL_CS_STALL;
>> +      }
>> +   }
>> +   return 0;
>> +}
>> +
>>  /**
>>   * Emit a PIPE_CONTROL with various flushing flags.
>>   *
>> @@ -454,6 +485,8 @@ brw_emit_pipe_control_flush(struct brw_context *brw, uint32_t flags)
>>        OUT_BATCH(0);
>>        ADVANCE_BATCH();
>>     } else if (brw->gen >= 6) {
>> +      flags |= gen7_cs_stall_every_four_pipe_controls(brw, flags);
>> +
>>        BEGIN_BATCH(5);
>>        OUT_BATCH(_3DSTATE_PIPE_CONTROL | (5 - 2));
>>        OUT_BATCH(flags);
>> @@ -496,6 +529,8 @@ brw_emit_pipe_control_write(struct brw_context *brw, uint32_t flags,
>>        OUT_BATCH(imm_upper);
>>        ADVANCE_BATCH();
>>     } else if (brw->gen >= 6) {
>> +      flags |= gen7_cs_stall_every_four_pipe_controls(brw, flags);
>> +
>>        /* PPGTT/GGTT is selected by DW2 bit 2 on Sandybridge, but DW1 bit 24
>>         * on later platforms.  We always use PPGTT on Gen7+.
>>         */
>> -- 
>> 2.1.3
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev



More information about the mesa-dev mailing list