[Mesa-stable] [Mesa-dev] [PATCH V2 1/4] i965/gen10: Implement WaSampleOffsetIZ workaround

Nanley Chery nanleychery at gmail.com
Tue Oct 31 20:17:48 UTC 2017


On Mon, Oct 30, 2017 at 05:24:10PM -0700, Nanley Chery wrote:
> On Fri, Oct 06, 2017 at 04:30:47PM -0700, Anuj Phogat wrote:
> > There are few other (duplicate) workarounds which have similar recommendations:
> > WaFlushHangWhenNonPipelineStateAndMarkerStalled
> > WaCSStallBefore3DSamplePattern
> > WaPipeControlBefore3DStateSamplePattern
> > 
> > WaPipeControlBefore3DStateSamplePattern has some extra recommendations if
> > driver is using mid batch context restore. Ignoring it for now because We're
> > not doing mid-batch context restore in Mesa.
> > 
> 
> How do we know we've implemented this correctly? Does this fix or
> improve any hangs we've been seeing? It would be helpful to have this
> information in the commit message.
> 
> > Cc: mesa-stable at lists.freedesktop.org
> > Cc: Jason Ekstrand <jason at jlekstrand.net>
> > Cc: Rafael Antognolli <rafael.antognolli at intel.com>
> > Signed-off-by: Anuj Phogat <anuj.phogat at gmail.com>
> > ---
> >  src/mesa/drivers/dri/i965/brw_context.h            |  2 +
> >  src/mesa/drivers/dri/i965/brw_defines.h            |  1 +
> >  src/mesa/drivers/dri/i965/brw_pipe_control.c       | 50 ++++++++++++++++++++++
> >  src/mesa/drivers/dri/i965/gen8_multisample_state.c |  8 ++++
> >  4 files changed, 61 insertions(+)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
> > index 92fc16de13..f0e8d562e9 100644
> > --- a/src/mesa/drivers/dri/i965/brw_context.h
> > +++ b/src/mesa/drivers/dri/i965/brw_context.h
> > @@ -1647,6 +1647,8 @@ void brw_emit_post_sync_nonzero_flush(struct brw_context *brw);
> >  void brw_emit_depth_stall_flushes(struct brw_context *brw);
> >  void gen7_emit_vs_workaround_flush(struct brw_context *brw);
> >  void gen7_emit_cs_stall_flush(struct brw_context *brw);
> > +void gen10_emit_wa_cs_stall_flush(struct brw_context *brw);
> > +void gen10_emit_wa_lri_to_cache_mode_zero(struct brw_context *brw);
> >  
> >  /* brw_queryformat.c */
> >  void brw_query_internal_format(struct gl_context *ctx, GLenum target,
> > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
> > index 4abb790612..270cdf29db 100644
> > --- a/src/mesa/drivers/dri/i965/brw_defines.h
> > +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> > @@ -1609,6 +1609,7 @@ enum brw_pixel_shader_coverage_mask_mode {
> >  #define GEN7_GPGPU_DISPATCHDIMY         0x2504
> >  #define GEN7_GPGPU_DISPATCHDIMZ         0x2508
> >  
> > +#define GEN7_CACHE_MODE_0               0x7000
> >  #define GEN7_CACHE_MODE_1               0x7004
> 
> >  # define GEN9_FLOAT_BLEND_OPTIMIZATION_ENABLE (1 << 4)
> >  # define GEN8_HIZ_NP_PMA_FIX_ENABLE        (1 << 11)
> > diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> > index 460b8f73b6..156f5c25ec 100644
> > --- a/src/mesa/drivers/dri/i965/brw_pipe_control.c
> > +++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> > @@ -278,6 +278,56 @@ gen7_emit_cs_stall_flush(struct brw_context *brw)
> >                                 brw->workaround_bo, 0, 0);
> >  }
> >  
> > +static void
> > +brw_flush_gpu_caches(struct brw_context *brw) {
> > +   brw_emit_pipe_control_flush(brw,
> > +                               PIPE_CONTROL_CACHE_FLUSH_BITS |
> > +                               PIPE_CONTROL_CACHE_INVALIDATE_BITS);
> > +}
> > +
> > +/**
> > + * From Gen10 Workarounds page in h/w specs:
> > + * WaSampleOffsetIZ:
> > + * Prior to the 3DSTATE_SAMPLE_PATTERN driver must ensure there are no
> > + * markers in the pipeline by programming a PIPE_CONTROL with stall.
> > + */
> > +void
> > +gen10_emit_wa_cs_stall_flush(struct brw_context *brw)
> > +{
> > +   const struct gen_device_info *devinfo = &brw->screen->devinfo;
> > +   assert(devinfo->gen == 10);
> > +   brw_emit_pipe_control_flush(brw,
> > +                               PIPE_CONTROL_CS_STALL |
> > +                               PIPE_CONTROL_STALL_AT_SCOREBOARD);
> > +}
> > +
> > +/**
> > + * From Gen10 Workarounds page in h/w specs:
> > + * WaSampleOffsetIZ:
> > + * When 3DSTATE_SAMPLE_PATTERN is programmed, driver must then issue an
> > + * MI_LOAD_REGISTER_IMM command to an offset between 0x7000 and 0x7FFF(SVL)
> > + * after the command to ensure the state has been delivered prior to any
> > + * command causing a marker in the pipeline.
> > + */
> > +void
> > +gen10_emit_wa_lri_to_cache_mode_zero(struct brw_context *brw)
> > +{
> > +   const struct gen_device_info *devinfo = &brw->screen->devinfo;
> > +   assert(devinfo->gen == 10);
> > +
> > +   /* Before changing the value of CACHE_MODE_0 register, GFX pipeline must
> > +    * be idle; i.e., full flush is required.
> > +    */
> > +   brw_flush_gpu_caches(brw);
> > +
> 
> I don't know how the flushing mechanism works completely, but I'm
> guessing that doing the following would be better:
> 
>    brw_emit_mi_flush();
>    brw_emit_pipe_control_flush(brw, PIPE_CONTROL_CS_STALL |
>                                PIPE_CONTROL_STATE_CACHE_INVALIDATE);
> 
> The first command seems to be the standard flush/invalidate/stall
> everything command. It doesn't invalidate the state cache however, hence
> the second command. Since the GPU must be idle, perhaps we want a CS
> stall as well?
> 
> 

Sorry for not reading the discussion on this earlier. Rafael pointed out
to me that what you have already does cause perform a CS stall.

-Nanley

> > +   /* Write to CACHE_MODE_0 (0x7000) */
> > +   BEGIN_BATCH(3);
> > +   OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2));
> > +   OUT_BATCH(GEN7_CACHE_MODE_0);
> > +   OUT_BATCH(0);
> > +   ADVANCE_BATCH();
> 
> You could use brw_load_register_imm32() instead.
> 
> > +}
> > +
> >  /**
> >   * Emits a PIPE_CONTROL with a non-zero post-sync operation, for
> >   * implementing two workarounds on gen6.  From section 1.4.7.1
> > diff --git a/src/mesa/drivers/dri/i965/gen8_multisample_state.c b/src/mesa/drivers/dri/i965/gen8_multisample_state.c
> > index 7a31a5df4a..14043025b6 100644
> > --- a/src/mesa/drivers/dri/i965/gen8_multisample_state.c
> > +++ b/src/mesa/drivers/dri/i965/gen8_multisample_state.c
> > @@ -49,6 +49,11 @@ gen8_emit_3dstate_multisample(struct brw_context *brw, unsigned num_samples)
> >  void
> >  gen8_emit_3dstate_sample_pattern(struct brw_context *brw)
> >  {
> > +   const struct gen_device_info *devinfo = &brw->screen->devinfo;
> > +
> > +   if (devinfo->gen == 10)
> > +      gen10_emit_wa_cs_stall_flush(brw);
> > +
> >     BEGIN_BATCH(9);
> >     OUT_BATCH(_3DSTATE_SAMPLE_PATTERN << 16 | (9 - 2));
> >  
> > @@ -68,4 +73,7 @@ gen8_emit_3dstate_sample_pattern(struct brw_context *brw)
> >     /* 1x and 2x MSAA */
> >     OUT_BATCH(brw_multisample_positions_1x_2x);
> >     ADVANCE_BATCH();
> > +
> > +   if (devinfo->gen == 10)
> > +      gen10_emit_wa_lri_to_cache_mode_zero(brw);
> >  }
> > -- 
> > 2.13.5
> > 
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev


More information about the mesa-stable mailing list