[Mesa-dev] [PATCH 4/6] i965/fs: Optimize IF/MOV/ELSE/MOV/ENDIF to SEL when possible.

Ian Romanick idr at freedesktop.org
Wed Aug 7 15:45:05 PDT 2013


On 08/05/2013 06:28 PM, Kenneth Graunke wrote:
> Many GLSL shaders contain code of the form:
>
>     x = condition ? foo : bar
>
> The compiler emits an ir_if tree for this, since each subexpression
> might be a complex tree that could have side-effects and short-circuit
> logic operations.
>
> However, the common case is to simply pick one of two constants or
> variable's values---which is exactly what SEL is for.  Replacing IF/ELSE
> with SEL also simplifies the control flow graph, making optimization
> passes which work on basic blocks more effective.
>
> The shader-db statistics:
>
>     total instructions in shared programs: 1655247 -> 1503234 (-9.18%)
>     instructions in affected programs:     949188 -> 797175 (-16.02%)
>
>     2,970 shaders were helped, none hurt.  Gained 181 SIMD16 programs.
>
> This helps Valve's Source Engine games (max -41.33%), The Cave
> (max -33.33%), Serious Sam 3 (max -18.64%), Yo Frankie! (max -30.19%),
> Zen Bound (max -22.22%), GStreamer (max -6.12%), and GLBenchmark 2.7
> (max -1.94%).
>
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> ---
>   src/mesa/drivers/dri/i965/brw_fs.h           |  1 +
>   src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 78 ++++++++++++++++++++++++++++
>   2 files changed, 79 insertions(+)
>
> The pattern matching stuff here might be useful to abstract for reuse in
> other peephole type optimizations; ensuring that the right opcodes exist
> without accidentally walking the list is tricky to get right.
>
> Then again, I'm not sure how many useful peephole optimizations we'll have;
> it may be more useful in many cases to walk a UD-chain rather than looking
> at consecutive instructions.
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
> index 370ab6c..7feb2b6 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -369,6 +369,7 @@ public:
>                       fs_reg src0, fs_reg src1);
>      bool try_emit_saturate(ir_expression *ir);
>      bool try_emit_mad(ir_expression *ir, int mul_arg);
> +   void try_replace_with_sel();
>      void emit_bool_to_cond_code(ir_rvalue *condition);
>      void emit_if_gen6(ir_if *ir);
>      void emit_unspill(fs_inst *inst, fs_reg reg, uint32_t spill_offset);
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index ee7728c..a36c248 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -1842,6 +1842,82 @@ fs_visitor::emit_if_gen6(ir_if *ir)
>      inst->predicate = BRW_PREDICATE_NORMAL;
>   }
>
> +/**
> + * Try to replace IF/MOV/ELSE/MOV/ENDIF with SEL.
> + *
> + * Many GLSL shaders contain the following pattern:
> + *
> + *    x = condition ? foo : bar
> + *
> + * The compiler emits an ir_if tree for this, since each subexpression might be
> + * a complex tree that could have side-effects or short-circuit logic.
> + *
> + * However, the common case is to simply select one of two constants or
> + * variable values---which is exactly what SEL is for.  In this case, the
> + * assembly looks like:
> + *
> + *    (+f0) IF
> + *    MOV dst src0
> + *    ELSE
> + *    MOV dst src1
> + *    ENDIF

Do we see many cases of

     foo = batman;
     if (condition)
         foo = robin;

> + *
> + * which can be easily translated into:
> + *
> + *    (+f0) SEL dst src0 src1
> + *
> + * If src0 is an immediate value, we promote it to a temporary GRF.
> + */
> +void
> +fs_visitor::try_replace_with_sel()
> +{
> +   fs_inst *endif_inst = (fs_inst *) instructions.get_tail();
> +   assert(endif_inst->opcode == BRW_OPCODE_ENDIF);
> +
> +   /* Pattern match in reverse: IF, MOV, ELSE, MOV, ENDIF. */

Just curious about the decision to match in reverse...

> +   int opcodes[] = {
> +      BRW_OPCODE_IF, BRW_OPCODE_MOV, BRW_OPCODE_ELSE, BRW_OPCODE_MOV,
> +   };
> +
> +   fs_inst *match = (fs_inst *) endif_inst->prev;
> +   for (int i = 0; i < 4; i++) {
> +      if (match->is_head_sentinel() || match->opcode != opcodes[4-i-1])
> +         return;
> +      match = (fs_inst *) match->prev;
> +   }
> +
> +   /* The opcodes match; it looks like the right sequence of instructions. */
> +   fs_inst *else_mov = (fs_inst *) endif_inst->prev;
> +   fs_inst *then_mov = (fs_inst *) else_mov->prev->prev;
> +   fs_inst *if_inst = (fs_inst *) then_mov->prev;
> +
> +   /* Check that the MOVs are the right form. */
> +   if (then_mov->dst.equals(else_mov->dst) &&
> +       !then_mov->is_partial_write() &&
> +       !else_mov->is_partial_write()) {
> +
> +      /* Remove the matched instructions; we'll emit a SEL to replace them. */
> +      while (!if_inst->next->is_tail_sentinel())
> +         if_inst->next->remove();
> +      if_inst->remove();
> +
> +      /* Only the last source register can be a constant, so if the MOV in
> +       * the "then" clause uses a constant, we need to put it in a temporary.
> +       */
> +      fs_reg src0(then_mov->src[0]);
> +      if (src0.file == IMM) {
> +         src0 = fs_reg(this, glsl_type::float_type);
> +         src0.type = then_mov->src[0].type;
> +         emit(MOV(src0, then_mov->src[0]));
> +      }
> +
> +      fs_inst *sel = emit(BRW_OPCODE_SEL, then_mov->dst, src0, else_mov->src[0]);
> +      sel->predicate = if_inst->predicate;
> +      sel->predicate_inverse = if_inst->predicate_inverse;
> +      sel->conditional_mod = if_inst->conditional_mod;
> +   }
> +}
> +
>   void
>   fs_visitor::visit(ir_if *ir)
>   {
> @@ -1881,6 +1957,8 @@ fs_visitor::visit(ir_if *ir)
>      }
>
>      emit(BRW_OPCODE_ENDIF);
> +
> +   try_replace_with_sel();
>   }
>
>   void
>



More information about the mesa-dev mailing list