[Mesa-stable] [Mesa-dev] [PATCH V2 1/4] i965/gen10: Implement WaSampleOffsetIZ workaround
Nanley Chery
nanleychery at gmail.com
Tue Oct 31 20:17:48 UTC 2017
On Mon, Oct 30, 2017 at 05:24:10PM -0700, Nanley Chery wrote:
> On Fri, Oct 06, 2017 at 04:30:47PM -0700, Anuj Phogat wrote:
> > There are few other (duplicate) workarounds which have similar recommendations:
> > WaFlushHangWhenNonPipelineStateAndMarkerStalled
> > WaCSStallBefore3DSamplePattern
> > WaPipeControlBefore3DStateSamplePattern
> >
> > WaPipeControlBefore3DStateSamplePattern has some extra recommendations if
> > driver is using mid batch context restore. Ignoring it for now because We're
> > not doing mid-batch context restore in Mesa.
> >
>
> How do we know we've implemented this correctly? Does this fix or
> improve any hangs we've been seeing? It would be helpful to have this
> information in the commit message.
>
> > Cc: mesa-stable at lists.freedesktop.org
> > Cc: Jason Ekstrand <jason at jlekstrand.net>
> > Cc: Rafael Antognolli <rafael.antognolli at intel.com>
> > Signed-off-by: Anuj Phogat <anuj.phogat at gmail.com>
> > ---
> > src/mesa/drivers/dri/i965/brw_context.h | 2 +
> > src/mesa/drivers/dri/i965/brw_defines.h | 1 +
> > src/mesa/drivers/dri/i965/brw_pipe_control.c | 50 ++++++++++++++++++++++
> > src/mesa/drivers/dri/i965/gen8_multisample_state.c | 8 ++++
> > 4 files changed, 61 insertions(+)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
> > index 92fc16de13..f0e8d562e9 100644
> > --- a/src/mesa/drivers/dri/i965/brw_context.h
> > +++ b/src/mesa/drivers/dri/i965/brw_context.h
> > @@ -1647,6 +1647,8 @@ void brw_emit_post_sync_nonzero_flush(struct brw_context *brw);
> > void brw_emit_depth_stall_flushes(struct brw_context *brw);
> > void gen7_emit_vs_workaround_flush(struct brw_context *brw);
> > void gen7_emit_cs_stall_flush(struct brw_context *brw);
> > +void gen10_emit_wa_cs_stall_flush(struct brw_context *brw);
> > +void gen10_emit_wa_lri_to_cache_mode_zero(struct brw_context *brw);
> >
> > /* brw_queryformat.c */
> > void brw_query_internal_format(struct gl_context *ctx, GLenum target,
> > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
> > index 4abb790612..270cdf29db 100644
> > --- a/src/mesa/drivers/dri/i965/brw_defines.h
> > +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> > @@ -1609,6 +1609,7 @@ enum brw_pixel_shader_coverage_mask_mode {
> > #define GEN7_GPGPU_DISPATCHDIMY 0x2504
> > #define GEN7_GPGPU_DISPATCHDIMZ 0x2508
> >
> > +#define GEN7_CACHE_MODE_0 0x7000
> > #define GEN7_CACHE_MODE_1 0x7004
>
> > # define GEN9_FLOAT_BLEND_OPTIMIZATION_ENABLE (1 << 4)
> > # define GEN8_HIZ_NP_PMA_FIX_ENABLE (1 << 11)
> > diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> > index 460b8f73b6..156f5c25ec 100644
> > --- a/src/mesa/drivers/dri/i965/brw_pipe_control.c
> > +++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> > @@ -278,6 +278,56 @@ gen7_emit_cs_stall_flush(struct brw_context *brw)
> > brw->workaround_bo, 0, 0);
> > }
> >
> > +static void
> > +brw_flush_gpu_caches(struct brw_context *brw) {
> > + brw_emit_pipe_control_flush(brw,
> > + PIPE_CONTROL_CACHE_FLUSH_BITS |
> > + PIPE_CONTROL_CACHE_INVALIDATE_BITS);
> > +}
> > +
> > +/**
> > + * From Gen10 Workarounds page in h/w specs:
> > + * WaSampleOffsetIZ:
> > + * Prior to the 3DSTATE_SAMPLE_PATTERN driver must ensure there are no
> > + * markers in the pipeline by programming a PIPE_CONTROL with stall.
> > + */
> > +void
> > +gen10_emit_wa_cs_stall_flush(struct brw_context *brw)
> > +{
> > + const struct gen_device_info *devinfo = &brw->screen->devinfo;
> > + assert(devinfo->gen == 10);
> > + brw_emit_pipe_control_flush(brw,
> > + PIPE_CONTROL_CS_STALL |
> > + PIPE_CONTROL_STALL_AT_SCOREBOARD);
> > +}
> > +
> > +/**
> > + * From Gen10 Workarounds page in h/w specs:
> > + * WaSampleOffsetIZ:
> > + * When 3DSTATE_SAMPLE_PATTERN is programmed, driver must then issue an
> > + * MI_LOAD_REGISTER_IMM command to an offset between 0x7000 and 0x7FFF(SVL)
> > + * after the command to ensure the state has been delivered prior to any
> > + * command causing a marker in the pipeline.
> > + */
> > +void
> > +gen10_emit_wa_lri_to_cache_mode_zero(struct brw_context *brw)
> > +{
> > + const struct gen_device_info *devinfo = &brw->screen->devinfo;
> > + assert(devinfo->gen == 10);
> > +
> > + /* Before changing the value of CACHE_MODE_0 register, GFX pipeline must
> > + * be idle; i.e., full flush is required.
> > + */
> > + brw_flush_gpu_caches(brw);
> > +
>
> I don't know how the flushing mechanism works completely, but I'm
> guessing that doing the following would be better:
>
> brw_emit_mi_flush();
> brw_emit_pipe_control_flush(brw, PIPE_CONTROL_CS_STALL |
> PIPE_CONTROL_STATE_CACHE_INVALIDATE);
>
> The first command seems to be the standard flush/invalidate/stall
> everything command. It doesn't invalidate the state cache however, hence
> the second command. Since the GPU must be idle, perhaps we want a CS
> stall as well?
>
>
Sorry for not reading the discussion on this earlier. Rafael pointed out
to me that what you have already does cause perform a CS stall.
-Nanley
> > + /* Write to CACHE_MODE_0 (0x7000) */
> > + BEGIN_BATCH(3);
> > + OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2));
> > + OUT_BATCH(GEN7_CACHE_MODE_0);
> > + OUT_BATCH(0);
> > + ADVANCE_BATCH();
>
> You could use brw_load_register_imm32() instead.
>
> > +}
> > +
> > /**
> > * Emits a PIPE_CONTROL with a non-zero post-sync operation, for
> > * implementing two workarounds on gen6. From section 1.4.7.1
> > diff --git a/src/mesa/drivers/dri/i965/gen8_multisample_state.c b/src/mesa/drivers/dri/i965/gen8_multisample_state.c
> > index 7a31a5df4a..14043025b6 100644
> > --- a/src/mesa/drivers/dri/i965/gen8_multisample_state.c
> > +++ b/src/mesa/drivers/dri/i965/gen8_multisample_state.c
> > @@ -49,6 +49,11 @@ gen8_emit_3dstate_multisample(struct brw_context *brw, unsigned num_samples)
> > void
> > gen8_emit_3dstate_sample_pattern(struct brw_context *brw)
> > {
> > + const struct gen_device_info *devinfo = &brw->screen->devinfo;
> > +
> > + if (devinfo->gen == 10)
> > + gen10_emit_wa_cs_stall_flush(brw);
> > +
> > BEGIN_BATCH(9);
> > OUT_BATCH(_3DSTATE_SAMPLE_PATTERN << 16 | (9 - 2));
> >
> > @@ -68,4 +73,7 @@ gen8_emit_3dstate_sample_pattern(struct brw_context *brw)
> > /* 1x and 2x MSAA */
> > OUT_BATCH(brw_multisample_positions_1x_2x);
> > ADVANCE_BATCH();
> > +
> > + if (devinfo->gen == 10)
> > + gen10_emit_wa_lri_to_cache_mode_zero(brw);
> > }
> > --
> > 2.13.5
> >
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
More information about the mesa-stable
mailing list