[Mesa-dev] [PATCH 5/5] i965/fs: Avoid generating extra AND instructions on bool logic ops.

Kenneth Graunke kenneth at whitecape.org
Wed Apr 11 17:42:39 PDT 2012


On 03/26/2012 01:59 PM, Eric Anholt wrote:
> By making a bool fs_reg only have a defined low bit (matching CMP
> output), instead of being a full 0 or 1 value, we reduce the ANDs
> generated in logic chains like:

Ohhhhhh.  I finally figured out what's going on here.

Prior to this patch, in the i965 driver, we define bool values as full 
32-bit 0 or 1 values.  That is:

     true  = 00000000000000000000000000000001
     false = 00000000000000000000000000000000

This patch relaxes that so that only the low bit is defined...or, in 
other words, the top 31 bits can be *anything*:

     true  = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx1
     false = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx0

So, when we convert from a bool to an int, we need to AND with 1 to get 
rid of all the random values in the top-31 bits to get a proper 0 or 1.

And the logic_xor stuff looks like it's just a clean-up, since those now 
become the same code as the default case.  Still not sure what's going 
on with the generation checks.

Sorry for being so dense.  I'll give this one an:
Acked-by: Kenneth Graunke <kenneth at whitecape.org>

One comment below.

>
>     if (v_texcoord.x<  0.0 || v_texcoord.x>  texwidth ||
>         v_texcoord.y<  0.0 || v_texcoord.y>  1.0)
>        discard;
>
> My concern originally when writing this code was that we would end up
> generating unnecessary ANDs on bool uniforms, so I put the ANDs right
> at the point of doing the CMPs that otherwise set only the low bit.
> However, in order to use a bool, we're generating some instruction
> anyway (e.g. moving it so as to produce a condition code update), and
> those instructions can often be turned into an AND at that point.  It
> turns out in the shaders I have on hand, none of them regress in
> instruction count:
>
> Total instructions: 262649 ->  262545
> 39/2148 programs affected (1.8%)
> 14253 ->  14149 instructions in affected programs (0.7% reduction)
> ---
>   src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |   36 ++++++++++----------------
>   1 files changed, 14 insertions(+), 22 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index 3460d14..6315957 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -394,7 +394,6 @@ fs_visitor::visit(ir_expression *ir)
>
>         inst = emit(BRW_OPCODE_CMP, temp, op[0], op[1]);
>         inst->conditional_mod = brw_conditional_for_comparison(ir->operation);
> -      emit(BRW_OPCODE_AND, this->result, this->result, fs_reg(0x1));
>         break;
>
>      case ir_binop_logic_xor:
> @@ -440,11 +439,19 @@ fs_visitor::visit(ir_expression *ir)
>         break;
>      case ir_unop_i2f:
>      case ir_unop_u2f:
> -   case ir_unop_b2f:
> -   case ir_unop_b2i:
>      case ir_unop_f2i:
>         emit(BRW_OPCODE_MOV, this->result, op[0]);
>         break;
> +
> +   case ir_unop_b2i:
> +      inst = emit(BRW_OPCODE_AND, this->result, op[0], fs_reg(1));
> +      break;
> +   case ir_unop_b2f:
> +      temp = fs_reg(this, glsl_type::int_type);
> +      emit(BRW_OPCODE_AND, temp, op[0], fs_reg(1));
> +      emit(BRW_OPCODE_MOV, this->result, temp);

If my memory serves me...you should be able to just do this in one 
instruction: AND dst<F> op0<D> 1D.  I know you can't mix float/int 
_source_ types(*), but I thought the destination could be a separate type.

(*) ISA Reference/Execution Environment/Registers and Register 
Regions/Execution Data Type

> +      break;
> +
>      case ir_unop_f2b:
>      case ir_unop_i2b:
>         temp = this->result;
> @@ -456,7 +463,6 @@ fs_visitor::visit(ir_expression *ir)
>
>         inst = emit(BRW_OPCODE_CMP, temp, op[0], fs_reg(0.0f));
>         inst->conditional_mod = BRW_CONDITIONAL_NZ;
> -      inst = emit(BRW_OPCODE_AND, this->result, this->result, fs_reg(1));
>         break;
>
>      case ir_unop_trunc:
> @@ -1488,19 +1494,9 @@ fs_visitor::emit_bool_to_cond_code(ir_rvalue *ir)
>   	 break;
>
>         case ir_binop_logic_xor:
> -	 inst = emit(BRW_OPCODE_XOR, reg_null_d, op[0], op[1]);
> -	 inst->conditional_mod = BRW_CONDITIONAL_NZ;
> -	 break;
> -
>         case ir_binop_logic_or:
> -	 inst = emit(BRW_OPCODE_OR, reg_null_d, op[0], op[1]);
> -	 inst->conditional_mod = BRW_CONDITIONAL_NZ;
> -	 break;
> -
>         case ir_binop_logic_and:
> -	 inst = emit(BRW_OPCODE_AND, reg_null_d, op[0], op[1]);
> -	 inst->conditional_mod = BRW_CONDITIONAL_NZ;
> -	 break;
> +	 goto out;
>
>         case ir_unop_f2b:
>   	 if (intel->gen>= 6) {
> @@ -1541,15 +1537,11 @@ fs_visitor::emit_bool_to_cond_code(ir_rvalue *ir)
>         return;
>      }
>
> +out:
>      ir->accept(this);
>
> -   if (intel->gen>= 6) {
> -      fs_inst *inst = emit(BRW_OPCODE_AND, reg_null_d, this->result, fs_reg(1));
> -      inst->conditional_mod = BRW_CONDITIONAL_NZ;
> -   } else {
> -      fs_inst *inst = emit(BRW_OPCODE_MOV, reg_null_d, this->result);
> -      inst->conditional_mod = BRW_CONDITIONAL_NZ;
> -   }
> +   fs_inst *inst = emit(BRW_OPCODE_AND, reg_null_d, this->result, fs_reg(1));
> +   inst->conditional_mod = BRW_CONDITIONAL_NZ;
>   }
>
>   /**


More information about the mesa-dev mailing list