[Mesa-dev] [PATCH] i965: Emit better code for ir_unop_sign.

Chris Forbes chrisf at ijw.co.nz
Mon Dec 2 18:46:09 PST 2013


Other than that,

Reviewed-by: Chris Forbes <chrisf at ijw.co.nz>

On Tue, Dec 3, 2013 at 3:45 PM, Chris Forbes <chrisf at ijw.co.nz> wrote:
> Where you have the magic 0x3f800000u, could you add a comment:
>
> /* 1.0f */
>
> The bit pattern is well-known, but only immediately obvious on a good day :)
>
> On Tue, Dec 3, 2013 at 2:03 PM, Kenneth Graunke <kenneth at whitecape.org> wrote:
>> On 12/02/2013 01:46 PM, Matt Turner wrote:
>>> total instructions in shared programs: 1519751 -> 1519442 (-0.02%)
>>> instructions in affected programs:     10154 -> 9845 (-3.04%)
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   | 23 +++++++++++++++--------
>>>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 23 ++++++++++++++++-------
>>>  2 files changed, 31 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>>> index 9eb9a9d..a0e803b 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>>> @@ -382,18 +382,25 @@ fs_visitor::visit(ir_expression *ir)
>>>        emit(MOV(this->result, op[0]));
>>>        break;
>>>     case ir_unop_sign:
>>> -      temp = fs_reg(this, ir->type);
>>> +      if (ir->type->is_float()) {
>>> +         emit(CMP(reg_null_f, op[0], fs_reg(0.0f), BRW_CONDITIONAL_NZ));
>>>
>>> -      emit(MOV(this->result, fs_reg(0.0f)));
>>> +         op[0].type = BRW_REGISTER_TYPE_UD;
>>> +         this->result.type = BRW_REGISTER_TYPE_UD;
>>> +         emit(AND(this->result, op[0], fs_reg(0x80000000u)));
>>>
>>> -      emit(CMP(reg_null_f, op[0], fs_reg(0.0f), BRW_CONDITIONAL_G));
>>> -      inst = emit(MOV(this->result, fs_reg(1.0f)));
>>> -      inst->predicate = BRW_PREDICATE_NORMAL;
>>> +         inst = emit(OR(this->result, this->result, fs_reg(0x3f800000u)));
>>> +         inst->predicate = BRW_PREDICATE_NORMAL;
>>>
>>> -      emit(CMP(reg_null_f, op[0], fs_reg(0.0f), BRW_CONDITIONAL_L));
>>> -      inst = emit(MOV(this->result, fs_reg(-1.0f)));
>>> -      inst->predicate = BRW_PREDICATE_NORMAL;
>>> +         this->result.type = BRW_REGISTER_TYPE_F;
>>> +      } else {
>>> +         emit(CMP(reg_null_d, op[0], fs_reg(0), BRW_CONDITIONAL_G));
>>>
>>> +         emit(ASR(this->result, op[0], fs_reg(31)));
>>> +
>>> +         inst = emit(OR(this->result, this->result, fs_reg(1)));
>>> +         inst->predicate = BRW_PREDICATE_NORMAL;
>>> +      }
>>>        break;
>>>     case ir_unop_rcp:
>>>        emit_math(SHADER_OPCODE_RCP, this->result, op[0]);
>>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
>>> index a13eafb..986663b 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
>>> @@ -1259,16 +1259,25 @@ vec4_visitor::visit(ir_expression *ir)
>>>        break;
>>>
>>>     case ir_unop_sign:
>>> -      emit(MOV(result_dst, src_reg(0.0f)));
>>> +      if (ir->type->is_float()) {
>>> +         emit(CMP(dst_null_f(), op[0], src_reg(0.0f), BRW_CONDITIONAL_NZ));
>>>
>>> -      emit(CMP(dst_null_d(), op[0], src_reg(0.0f), BRW_CONDITIONAL_G));
>>> -      inst = emit(MOV(result_dst, src_reg(1.0f)));
>>> -      inst->predicate = BRW_PREDICATE_NORMAL;
>>> +         op[0].type = BRW_REGISTER_TYPE_UD;
>>> +         result_dst.type = BRW_REGISTER_TYPE_UD;
>>> +         emit(AND(result_dst, op[0], src_reg(0x80000000u)));
>>>
>>> -      emit(CMP(dst_null_d(), op[0], src_reg(0.0f), BRW_CONDITIONAL_L));
>>> -      inst = emit(MOV(result_dst, src_reg(-1.0f)));
>>> -      inst->predicate = BRW_PREDICATE_NORMAL;
>>> +         inst = emit(OR(result_dst, src_reg(result_dst), src_reg(0x3f800000u)));
>>> +         inst->predicate = BRW_PREDICATE_NORMAL;
>>> +
>>> +         this->result.type = BRW_REGISTER_TYPE_F;
>>> +      } else {
>>> +         emit(CMP(dst_null_d(), op[0], src_reg(0), BRW_CONDITIONAL_G));
>>> +
>>> +         emit(ASR(result_dst, op[0], src_reg(31)));
>>>
>>> +         inst = emit(OR(result_dst, src_reg(result_dst), src_reg(1)));
>>> +         inst->predicate = BRW_PREDICATE_NORMAL;
>>> +      }
>>>        break;
>>>
>>>     case ir_unop_rcp:
>>
>> Some comments in both cases would be great.  But this looks correct, and
>> definitely more efficient.
>>
>> Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


More information about the mesa-dev mailing list