[Mesa-dev] [PATCH v2] workarounds: Update workaround names and platforms

Ben Widawsky ben at bwidawsk.net
Tue Feb 9 01:40:54 UTC 2016


On Fri, Feb 05, 2016 at 01:59:23PM -0800, Sameer Kibey wrote:
> Update the format in which workarounds are documented
> in the source code. This allows mesa to be parsed
> by the list-workarounds utility in intel-gpu-tools.
> 
> Signed-off-by: Sameer Kibey <sameer.kibey at intel.com>
> ---
> changed byt to vlv for consistency.
>  src/mesa/drivers/dri/i965/brw_binding_tables.c | 2 +-
>  src/mesa/drivers/dri/i965/brw_blorp.cpp        | 2 ++
>  src/mesa/drivers/dri/i965/brw_defines.h        | 3 ++-
>  src/mesa/drivers/dri/i965/brw_eu_emit.c        | 3 ++-
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 9 ++++++---
>  src/mesa/drivers/dri/i965/brw_pipe_control.c   | 4 +++-
>  src/mesa/drivers/dri/i965/gen6_queryobj.c      | 2 +-
>  src/mesa/drivers/dri/i965/gen8_depth_state.c   | 3 ++-
>  src/mesa/drivers/dri/i965/intel_batchbuffer.c  | 2 +-
>  9 files changed, 20 insertions(+), 10 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c b/src/mesa/drivers/dri/i965/brw_binding_tables.c
> index f3a0310..bcf6422 100644
> --- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
> +++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
> @@ -54,7 +54,7 @@ static uint32_t
>  reserve_hw_bt_space(struct brw_context *brw, unsigned bytes)
>  {
>     /* From the Broadwell PRM, Volume 16, "Workarounds",
> -    * WaStateBindingTableOverfetch:
> +    * WaStateBindingTableOverfetch:hsw,bdw,chv,bxt
>      * "HW over-fetches two cache lines of binding table indices.  When
>      *  using the resource streamer, SW needs to pad binding table pointer
>      *  updates with an additional two cache lines."
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp b/src/mesa/drivers/dri/i965/brw_blorp.cpp
> index 1bc6d15..dd01ea8 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
> @@ -304,6 +304,8 @@ brw_hiz_op_params::brw_hiz_op_params(struct intel_mipmap_tree *mt,
>      *     aligned to an 8x4 pixel block relative to the upper left corner
>      *     of the depth buffer [...]
>      *
> +    * WaHizAmbiguate8x4Aligned:hsw
> +    *
>      * For hiz resolves, the rectangle must also be 8x4 aligned. Item
>      * WaHizAmbiguate8x4Aligned from the Haswell workarounds page and the
>      * Ivybridge simulator require the alignment.
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
> index 01e0c99..5410a1d 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -1756,7 +1756,8 @@ enum brw_message_target {
>  /* Dataport special binding table indices: */
>  #define BRW_BTI_STATELESS                255
>  #define GEN7_BTI_SLM                     254
> -/* Note that on Gen8+ BTI 255 was redefined to be IA-coherent according to the
> +/* WaForceEnableNonCoherent:bdw,chv,skl,kbl
> + * Note that on Gen8+ BTI 255 was redefined to be IA-coherent according to the
>   * hardware spec, however because the DRM sets bit 4 of HDC_CHICKEN0 on BDW,
>   * CHV and at least some pre-production steppings of SKL due to
>   * WaForceEnableNonCoherent, HDC memory access may have been overridden by the
> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> index 35d8039..918d69e 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> @@ -1885,7 +1885,8 @@ void brw_CMP(struct brw_codegen *p,
>     brw_set_src0(p, insn, src0);
>     brw_set_src1(p, insn, src1);
>  
> -   /* Item WaCMPInstNullDstForcesThreadSwitch in the Haswell Bspec workarounds
> +   /* WaCMPInstNullDstForcesThreadSwitch:ivb,hsw,vlv
> +    * Item WaCMPInstNullDstForcesThreadSwitch in the Haswell Bspec workarounds
>      * page says:
>      *    "Any CMP instruction with a null destination must use a {switch}."
>      *
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> index 1916a99..24d4a9d 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> @@ -1836,7 +1836,8 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
>           brw_F16TO32(p, dst, src[0]);
>           break;
>        case BRW_OPCODE_CMP:
> -         /* The Ivybridge/BayTrail WaCMPInstFlagDepClearedEarly workaround says
> +         /* WaCMPInstFlagDepClearedEarly:ivb,hsw,vlv
> +          * The Ivybridge/BayTrail WaCMPInstFlagDepClearedEarly workaround says
>            * that when the destination is a GRF that the dependency-clear bit on
>            * the flag register is cleared early.
>            *
> @@ -1928,7 +1929,8 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
>  
>        case BRW_OPCODE_BFI1:
>           assert(devinfo->gen >= 7);
> -         /* The Haswell WaForceSIMD8ForBFIInstruction workaround says that we
> +         /* WaForceSIMD8ForBFIInstruction:hsw
> +          * The Haswell WaForceSIMD8ForBFIInstruction workaround says that we
>            * should
>            *
>            *    "Force BFI instructions to be executed always in SIMD8."
> @@ -1947,7 +1949,8 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
>        case BRW_OPCODE_BFI2:
>           assert(devinfo->gen >= 7);
>           brw_set_default_access_mode(p, BRW_ALIGN_16);
> -         /* The Haswell WaForceSIMD8ForBFIInstruction workaround says that we
> +         /* WaForceSIMD8ForBFIInstruction:hsw
> +          * The Haswell WaForceSIMD8ForBFIInstruction workaround says that we
>            * should
>            *
>            *    "Force BFI instructions to be executed always in SIMD8."
> diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> index 6c636d2..822473c 100644
> --- a/src/mesa/drivers/dri/i965/brw_pipe_control.c
> +++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> @@ -60,7 +60,9 @@ gen8_add_cs_stall_workaround_bits(uint32_t *flags)
>        *flags |= PIPE_CONTROL_STALL_AT_SCOREBOARD;
>  }
>  
> -/* Implement the WaCsStallAtEveryFourthPipecontrol workaround on IVB, BYT:
> +/* WaCsStallAtEveryFourthPipecontrol:ivb,vlv
> + *
> + * Implement the WaCsStallAtEveryFourthPipecontrol workaround on IVB, BYT:
>   *
>   * "Every 4th PIPE_CONTROL command, not counting the PIPE_CONTROL with
>   *  only read-cache-invalidate bit(s) set, must have a CS_STALL bit set."
> diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c b/src/mesa/drivers/dri/i965/gen6_queryobj.c
> index d508c4c..9171df0 100644
> --- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
> +++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
> @@ -238,7 +238,7 @@ gen6_queryobj_get_results(struct gl_context *ctx,
>  
>     case GL_FRAGMENT_SHADER_INVOCATIONS_ARB:
>        query->Base.Result = (results[1] - results[0]);
> -      /* Implement the "WaDividePSInvocationCountBy4:HSW,BDW" workaround:
> +      /* Implement the WaDividePSInvocationCountBy4:hsw,bdw,chv workaround:
>         * "Invocation counter is 4 times actual.  WA: SW to divide HW reported
>         *  PS Invocations value by 4."
>         *

As an example of rewording like I mention below, you could entirely drop the
"Implement..." sentence, and just keep the actual wa prose. Then slap the label
at the bottom.

> diff --git a/src/mesa/drivers/dri/i965/gen8_depth_state.c b/src/mesa/drivers/dri/i965/gen8_depth_state.c
> index 93100a0..c5ac7e4 100644
> --- a/src/mesa/drivers/dri/i965/gen8_depth_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_depth_state.c
> @@ -193,7 +193,8 @@ gen8_emit_depth_stencil_hiz(struct brw_context *brw,
>     case GL_TEXTURE_1D_ARRAY:
>     case GL_TEXTURE_1D:
>        if (brw->gen >= 9) {
> -         /* WaDisable1DDepthStencil. Skylake+ doesn't support 1D depth
> +         /* WaDisable1DDepthStencil:skl,bxt,kbl
> +          * Skylake+ doesn't support 1D depth
>            * textures but it does allow pretending it's a 2D texture
>            * instead.
>            */

Please rewrap this comment.

> diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> index f778074..6fc5ebd 100644
> --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> @@ -228,7 +228,7 @@ brw_finish_batch(struct brw_context *brw)
>            * From the example in the docs, it seems to expect a regular pipe control
>            * flush here as well. We may have done it already, but meh.
>            *
> -          * See also WaAvoidRCZCounterRollover.
> +          * See also WaAvoidRCZCounterRollover:hsw

Just get rid of "See also"

>            */
>           brw_emit_mi_flush(brw);
>           BEGIN_BATCH(2);

Okay, I had a chat with Ken to make sure he's on board with this, since he's
been one of the primary developers who looks over it. The change itself is
somewhat trivial, but I wanted to have the conversation about whether this will
actually be useful. I think we're of similar minds in that we hope it's useful,
and it doesn't make much new clutter (and hopefully it doesn't get stale).

One thing they do in the kernel, which is like is that have the workaround label
listed last in the commit message (as an example, you can see
intel_fbc_work_fn()). This makes it easy to spot without the tool. I think it's
weird to have the workaround label with platforms in the prose of a commit
message.  It's fine if you do some rewording of the comments to make things work
a little better with this scheme, but that's up to you.

/* blahblahblah
 * workarounds suck
 * and we hate to do them.
 *
 * WaWTFMate:ilk,snb
 */

I can modify this for you on merge, if you don't want to resubmit. It's up to
you. Just let me know.

-- 
Ben Widawsky, Intel Open Source Technology Center


More information about the mesa-dev mailing list