[Mesa-stable] [PATCH V2 1/4] i965/gen10: Implement WaSampleOffsetIZ workaround

Rafael Antognolli rafael.antognolli at intel.com
Mon Oct 30 17:39:14 UTC 2017


On Mon, Oct 23, 2017 at 08:46:26AM -0700, Anuj Phogat wrote:
> Ping. Patches 2-4 in this series are still waiting for review.
> Anyone interested?
> Thanks!
> 
> 
> 
> On Fri, Oct 13, 2017 at 3:35 PM, Rafael Antognolli
> <rafael.antognolli at intel.com> wrote:
> > Hi Anuj, sorry that I missed this patch. Please see below.
> >
> > On Fri, Oct 06, 2017 at 04:30:47PM -0700, Anuj Phogat wrote:
> >> There are few other (duplicate) workarounds which have similar recommendations:
> >> WaFlushHangWhenNonPipelineStateAndMarkerStalled
> >> WaCSStallBefore3DSamplePattern
> >> WaPipeControlBefore3DStateSamplePattern
> >>
> >> WaPipeControlBefore3DStateSamplePattern has some extra recommendations if
> >> driver is using mid batch context restore. Ignoring it for now because We're
> >> not doing mid-batch context restore in Mesa.
> >>
> >> Cc: mesa-stable at lists.freedesktop.org
> >> Cc: Jason Ekstrand <jason at jlekstrand.net>
> >> Cc: Rafael Antognolli <rafael.antognolli at intel.com>
> >> Signed-off-by: Anuj Phogat <anuj.phogat at gmail.com>
> >> ---
> >>  src/mesa/drivers/dri/i965/brw_context.h            |  2 +
> >>  src/mesa/drivers/dri/i965/brw_defines.h            |  1 +
> >>  src/mesa/drivers/dri/i965/brw_pipe_control.c       | 50 ++++++++++++++++++++++
> >>  src/mesa/drivers/dri/i965/gen8_multisample_state.c |  8 ++++
> >>  4 files changed, 61 insertions(+)
> >>
> >> diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
> >> index 92fc16de13..f0e8d562e9 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_context.h
> >> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> >> @@ -1647,6 +1647,8 @@ void brw_emit_post_sync_nonzero_flush(struct brw_context *brw);
> >>  void brw_emit_depth_stall_flushes(struct brw_context *brw);
> >>  void gen7_emit_vs_workaround_flush(struct brw_context *brw);
> >>  void gen7_emit_cs_stall_flush(struct brw_context *brw);
> >> +void gen10_emit_wa_cs_stall_flush(struct brw_context *brw);
> >> +void gen10_emit_wa_lri_to_cache_mode_zero(struct brw_context *brw);
> >>
> >>  /* brw_queryformat.c */
> >>  void brw_query_internal_format(struct gl_context *ctx, GLenum target,
> >> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
> >> index 4abb790612..270cdf29db 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> >> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> >> @@ -1609,6 +1609,7 @@ enum brw_pixel_shader_coverage_mask_mode {
> >>  #define GEN7_GPGPU_DISPATCHDIMY         0x2504
> >>  #define GEN7_GPGPU_DISPATCHDIMZ         0x2508
> >>
> >> +#define GEN7_CACHE_MODE_0               0x7000
> >>  #define GEN7_CACHE_MODE_1               0x7004
> >>  # define GEN9_FLOAT_BLEND_OPTIMIZATION_ENABLE (1 << 4)
> >>  # define GEN8_HIZ_NP_PMA_FIX_ENABLE        (1 << 11)
> >> diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> >> index 460b8f73b6..156f5c25ec 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_pipe_control.c
> >> +++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> >> @@ -278,6 +278,56 @@ gen7_emit_cs_stall_flush(struct brw_context *brw)
> >>                                 brw->workaround_bo, 0, 0);
> >>  }
> >>
> >> +static void
> >> +brw_flush_gpu_caches(struct brw_context *brw) {
> >> +   brw_emit_pipe_control_flush(brw,
> >> +                               PIPE_CONTROL_CACHE_FLUSH_BITS |
> >> +                               PIPE_CONTROL_CACHE_INVALIDATE_BITS);
> >> +}
> >
> > This function is only calling another function without any extra logic, so I
> > would just call brw_emit_pipe_control_flush() and remove this declaration. But
> > that's just cosmetic.
> >
> > With or without this change, this patch correctly implements the workaround
> > imho, so it is
> >
> > Reviewed-by: Rafael Antognolli <rafael.antognolli at intel.com>
> >
> >> +/**
> >> + * From Gen10 Workarounds page in h/w specs:
> >> + * WaSampleOffsetIZ:
> >> + * Prior to the 3DSTATE_SAMPLE_PATTERN driver must ensure there are no
> >> + * markers in the pipeline by programming a PIPE_CONTROL with stall.
> >> + */
> >> +void
> >> +gen10_emit_wa_cs_stall_flush(struct brw_context *brw)
> >> +{
> >> +   const struct gen_device_info *devinfo = &brw->screen->devinfo;
> >> +   assert(devinfo->gen == 10);
> >> +   brw_emit_pipe_control_flush(brw,
> >> +                               PIPE_CONTROL_CS_STALL |
> >> +                               PIPE_CONTROL_STALL_AT_SCOREBOARD);
> >> +}
> >> +
> >> +/**
> >> + * From Gen10 Workarounds page in h/w specs:
> >> + * WaSampleOffsetIZ:
> >> + * When 3DSTATE_SAMPLE_PATTERN is programmed, driver must then issue an
> >> + * MI_LOAD_REGISTER_IMM command to an offset between 0x7000 and 0x7FFF(SVL)
> >> + * after the command to ensure the state has been delivered prior to any
> >> + * command causing a marker in the pipeline.
> >> + */
> >> +void
> >> +gen10_emit_wa_lri_to_cache_mode_zero(struct brw_context *brw)
> >> +{
> >> +   const struct gen_device_info *devinfo = &brw->screen->devinfo;
> >> +   assert(devinfo->gen == 10);
> >> +
> >> +   /* Before changing the value of CACHE_MODE_0 register, GFX pipeline must
> >> +    * be idle; i.e., full flush is required.
> >> +    */
> >> +   brw_flush_gpu_caches(brw);

I'm still not fully sure but I think this flush should come after the
MI_LOAD_REGISTER_IMM. But I can't prove this yet, so we could change the
order now or later once we confirm that.

Regardless of that, this patch it

Reviewed-by: Rafael Antognolli <rafael.antognolli at intel.com>

> >> +   /* Write to CACHE_MODE_0 (0x7000) */
> >> +   BEGIN_BATCH(3);
> >> +   OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2));
> >> +   OUT_BATCH(GEN7_CACHE_MODE_0);
> >> +   OUT_BATCH(0);
> >> +   ADVANCE_BATCH();
> >> +}
> >> +
> >>  /**
> >>   * Emits a PIPE_CONTROL with a non-zero post-sync operation, for
> >>   * implementing two workarounds on gen6.  From section 1.4.7.1
> >> diff --git a/src/mesa/drivers/dri/i965/gen8_multisample_state.c b/src/mesa/drivers/dri/i965/gen8_multisample_state.c
> >> index 7a31a5df4a..14043025b6 100644
> >> --- a/src/mesa/drivers/dri/i965/gen8_multisample_state.c
> >> +++ b/src/mesa/drivers/dri/i965/gen8_multisample_state.c
> >> @@ -49,6 +49,11 @@ gen8_emit_3dstate_multisample(struct brw_context *brw, unsigned num_samples)
> >>  void
> >>  gen8_emit_3dstate_sample_pattern(struct brw_context *brw)
> >>  {
> >> +   const struct gen_device_info *devinfo = &brw->screen->devinfo;
> >> +
> >> +   if (devinfo->gen == 10)
> >> +      gen10_emit_wa_cs_stall_flush(brw);
> >> +
> >>     BEGIN_BATCH(9);
> >>     OUT_BATCH(_3DSTATE_SAMPLE_PATTERN << 16 | (9 - 2));
> >>
> >> @@ -68,4 +73,7 @@ gen8_emit_3dstate_sample_pattern(struct brw_context *brw)
> >>     /* 1x and 2x MSAA */
> >>     OUT_BATCH(brw_multisample_positions_1x_2x);
> >>     ADVANCE_BATCH();
> >> +
> >> +   if (devinfo->gen == 10)
> >> +      gen10_emit_wa_lri_to_cache_mode_zero(brw);
> >>  }
> >> --
> >> 2.13.5
> >>


More information about the mesa-stable mailing list