[Mesa-dev] [PATCH 5/5] i965/fs: Avoid generating extra AND instructions on bool logic ops.
Kenneth Graunke
kenneth at whitecape.org
Wed Apr 11 17:42:39 PDT 2012
On 03/26/2012 01:59 PM, Eric Anholt wrote:
> By making a bool fs_reg only have a defined low bit (matching CMP
> output), instead of being a full 0 or 1 value, we reduce the ANDs
> generated in logic chains like:
Ohhhhhh. I finally figured out what's going on here.
Prior to this patch, in the i965 driver, we define bool values as full
32-bit 0 or 1 values. That is:
true = 00000000000000000000000000000001
false = 00000000000000000000000000000000
This patch relaxes that so that only the low bit is defined...or, in
other words, the top 31 bits can be *anything*:
true = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx1
false = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx0
So, when we convert from a bool to an int, we need to AND with 1 to get
rid of all the random values in the top-31 bits to get a proper 0 or 1.
And the logic_xor stuff looks like it's just a clean-up, since those now
become the same code as the default case. Still not sure what's going
on with the generation checks.
Sorry for being so dense. I'll give this one an:
Acked-by: Kenneth Graunke <kenneth at whitecape.org>
One comment below.
>
> if (v_texcoord.x< 0.0 || v_texcoord.x> texwidth ||
> v_texcoord.y< 0.0 || v_texcoord.y> 1.0)
> discard;
>
> My concern originally when writing this code was that we would end up
> generating unnecessary ANDs on bool uniforms, so I put the ANDs right
> at the point of doing the CMPs that otherwise set only the low bit.
> However, in order to use a bool, we're generating some instruction
> anyway (e.g. moving it so as to produce a condition code update), and
> those instructions can often be turned into an AND at that point. It
> turns out in the shaders I have on hand, none of them regress in
> instruction count:
>
> Total instructions: 262649 -> 262545
> 39/2148 programs affected (1.8%)
> 14253 -> 14149 instructions in affected programs (0.7% reduction)
> ---
> src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 36 ++++++++++----------------
> 1 files changed, 14 insertions(+), 22 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index 3460d14..6315957 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -394,7 +394,6 @@ fs_visitor::visit(ir_expression *ir)
>
> inst = emit(BRW_OPCODE_CMP, temp, op[0], op[1]);
> inst->conditional_mod = brw_conditional_for_comparison(ir->operation);
> - emit(BRW_OPCODE_AND, this->result, this->result, fs_reg(0x1));
> break;
>
> case ir_binop_logic_xor:
> @@ -440,11 +439,19 @@ fs_visitor::visit(ir_expression *ir)
> break;
> case ir_unop_i2f:
> case ir_unop_u2f:
> - case ir_unop_b2f:
> - case ir_unop_b2i:
> case ir_unop_f2i:
> emit(BRW_OPCODE_MOV, this->result, op[0]);
> break;
> +
> + case ir_unop_b2i:
> + inst = emit(BRW_OPCODE_AND, this->result, op[0], fs_reg(1));
> + break;
> + case ir_unop_b2f:
> + temp = fs_reg(this, glsl_type::int_type);
> + emit(BRW_OPCODE_AND, temp, op[0], fs_reg(1));
> + emit(BRW_OPCODE_MOV, this->result, temp);
If my memory serves me...you should be able to just do this in one
instruction: AND dst<F> op0<D> 1D. I know you can't mix float/int
_source_ types(*), but I thought the destination could be a separate type.
(*) ISA Reference/Execution Environment/Registers and Register
Regions/Execution Data Type
> + break;
> +
> case ir_unop_f2b:
> case ir_unop_i2b:
> temp = this->result;
> @@ -456,7 +463,6 @@ fs_visitor::visit(ir_expression *ir)
>
> inst = emit(BRW_OPCODE_CMP, temp, op[0], fs_reg(0.0f));
> inst->conditional_mod = BRW_CONDITIONAL_NZ;
> - inst = emit(BRW_OPCODE_AND, this->result, this->result, fs_reg(1));
> break;
>
> case ir_unop_trunc:
> @@ -1488,19 +1494,9 @@ fs_visitor::emit_bool_to_cond_code(ir_rvalue *ir)
> break;
>
> case ir_binop_logic_xor:
> - inst = emit(BRW_OPCODE_XOR, reg_null_d, op[0], op[1]);
> - inst->conditional_mod = BRW_CONDITIONAL_NZ;
> - break;
> -
> case ir_binop_logic_or:
> - inst = emit(BRW_OPCODE_OR, reg_null_d, op[0], op[1]);
> - inst->conditional_mod = BRW_CONDITIONAL_NZ;
> - break;
> -
> case ir_binop_logic_and:
> - inst = emit(BRW_OPCODE_AND, reg_null_d, op[0], op[1]);
> - inst->conditional_mod = BRW_CONDITIONAL_NZ;
> - break;
> + goto out;
>
> case ir_unop_f2b:
> if (intel->gen>= 6) {
> @@ -1541,15 +1537,11 @@ fs_visitor::emit_bool_to_cond_code(ir_rvalue *ir)
> return;
> }
>
> +out:
> ir->accept(this);
>
> - if (intel->gen>= 6) {
> - fs_inst *inst = emit(BRW_OPCODE_AND, reg_null_d, this->result, fs_reg(1));
> - inst->conditional_mod = BRW_CONDITIONAL_NZ;
> - } else {
> - fs_inst *inst = emit(BRW_OPCODE_MOV, reg_null_d, this->result);
> - inst->conditional_mod = BRW_CONDITIONAL_NZ;
> - }
> + fs_inst *inst = emit(BRW_OPCODE_AND, reg_null_d, this->result, fs_reg(1));
> + inst->conditional_mod = BRW_CONDITIONAL_NZ;
> }
>
> /**
More information about the mesa-dev
mailing list