[Mesa-dev] [PATCH] i965: Implement WaCsStallAtEveryFourthPipecontrol on IVB/BYT.
Ian Romanick
idr at freedesktop.org
Wed Nov 12 03:34:12 PST 2014
On 11/12/2014 10:39 AM, Daniel Vetter wrote:
> On Wed, Nov 12, 2014 at 01:33:01AM -0800, Kenneth Graunke wrote:
>> According to the documentation, we need to do a CS stall on every fourth
>> PIPE_CONTROL command to avoid GPU hangs. The kernel does a CS stall
>> between batches, so we only need to count the PIPE_CONTROLs in our batches.
>>
>> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
>
> Yeah, kernel adds the CS stall bit both to the flush right before/after
> the batch so this works. The kernel also has a comment so people hopefully
> check userspace assumptions when testing this.
>
> Reviewed-by: Daniel Vetter <daniel.vetter at ffwll.ch>
>
> Some useless bikesheds for you to ignore below ;-)
>
>> ---
>> src/mesa/drivers/dri/i965/brw_context.h | 2 ++
>> src/mesa/drivers/dri/i965/intel_batchbuffer.c | 35 +++++++++++++++++++++++++++
>> 2 files changed, 37 insertions(+)
>>
>> This may help
>> https://code.google.com/p/chromium/issues/detail?id=333130
>> My theory is that marcheu's patch removes PIPE_CONTROLs that don't have
>> CS stalls, which may bring it under 4 in a row, or at least bring ones
>> with CS stalls closer together.
>>
>> It also may help the couple of users who've reported IVB GPU hangs recently.
>>
>> Or it may be totally useless...
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
>> index 656cbe8..27cf92c 100644
>> --- a/src/mesa/drivers/dri/i965/brw_context.h
>> +++ b/src/mesa/drivers/dri/i965/brw_context.h
>> @@ -854,6 +854,8 @@ struct intel_batchbuffer {
>> enum brw_gpu_ring ring;
>> bool needs_sol_reset;
>>
>> + uint8_t pipe_controls_since_last_cs_stall;
>
> Since the compile aligns this anyway I tend to not bother with smaller
> types. And the fixed-width generally used for abi and hw registers, which
> this isn't.
I think this will get stored in the padding after the bool. Making
this an int will cause extra padding. I think that's why Ken chose to
put it here.
>> +
>> struct {
>> uint16_t used;
>> int reloc_count;
>> diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
>> index cd45af6..2255cee 100644
>> --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
>> +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
>> @@ -81,6 +81,7 @@ intel_batchbuffer_reset(struct brw_context *brw)
>> brw->batch.state_batch_offset = brw->batch.bo->size;
>> brw->batch.used = 0;
>> brw->batch.needs_sol_reset = false;
>> + brw->batch.pipe_controls_since_last_cs_stall = 0;
>>
>> /* We don't know what ring the new batch will be sent to until we see the
>> * first BEGIN_BATCH or BEGIN_BATCH_BLT. Mark it as unknown.
>> @@ -433,6 +434,36 @@ gen8_add_cs_stall_workaround_bits(uint32_t *flags)
>> *flags |= PIPE_CONTROL_STALL_AT_SCOREBOARD;
>> }
>>
>> +/* Implement the WaCsStallAtEveryFourthPipecontrol workaround on IVB, BYT:
>> + *
>> + * "Every 4th PIPE_CONTROL command, not counting the PIPE_CONTROL with
>> + * only read-cache-invalidate bit(s) set, must have a CS_STALL bit set."
>> + *
>> + * Note that the kernel does CS stalls between batches, so we only need
>> + * to count them within a batch.
>> + */
>> +static uint32_t
>> +gen7_cs_stall_every_four_pipe_controls(struct brw_context *brw, uint32_t flags)
>> +{
>> + if (brw->gen == 7 && brw->is_haswell) {
>> + if (flags & PIPE_CONTROL_CS_STALL) {
>> + /* If we're doing a CS stall, reset the counter and carry on. */
>> + brw->batch.pipe_controls_since_last_cs_stall = 0;
>> + return 0;
>> + } else {
>> + /* Otherwise, we need to count this PIPE_CONTROL. */
>> + ++brw->batch.pipe_controls_since_last_cs_stall;
>> + }
>
> You can flatten the control flow here a bit by dropping the else and
> moving the ++ into the check. Imo that's a fairly common patter to not
> obfuscate the code.
Yeah, I think that is slightly better... at least making it
/* Count this PIPE_CONTROL. */
brw->batch.pipe_controls_since_last_cs_stall++;
/* If this is the fourth pipe control without a CS stall, do one now. */
if (brw->batch.pipe_controls_since_last_cs_stall == 4) {
...
}
>> +
>> + /* If this is the fourth pipe control without a CS stall, do one now. */
>> + if (brw->batch.pipe_controls_since_last_cs_stall == 4) {
>> + brw->batch.pipe_controls_since_last_cs_stall = 0;
>> + return PIPE_CONTROL_CS_STALL;
>> + }
>> + }
>> + return 0;
>> +}
>> +
>> /**
>> * Emit a PIPE_CONTROL with various flushing flags.
>> *
>> @@ -454,6 +485,8 @@ brw_emit_pipe_control_flush(struct brw_context *brw, uint32_t flags)
>> OUT_BATCH(0);
>> ADVANCE_BATCH();
>> } else if (brw->gen >= 6) {
>> + flags |= gen7_cs_stall_every_four_pipe_controls(brw, flags);
>> +
>> BEGIN_BATCH(5);
>> OUT_BATCH(_3DSTATE_PIPE_CONTROL | (5 - 2));
>> OUT_BATCH(flags);
>> @@ -496,6 +529,8 @@ brw_emit_pipe_control_write(struct brw_context *brw, uint32_t flags,
>> OUT_BATCH(imm_upper);
>> ADVANCE_BATCH();
>> } else if (brw->gen >= 6) {
>> + flags |= gen7_cs_stall_every_four_pipe_controls(brw, flags);
>> +
>> /* PPGTT/GGTT is selected by DW2 bit 2 on Sandybridge, but DW1 bit 24
>> * on later platforms. We always use PPGTT on Gen7+.
>> */
>> --
>> 2.1.3
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
More information about the mesa-dev
mailing list