[Mesa-dev] [PATCH 3/3] i965/fs: Optimize float conversions of byte/word extract.

Matt Turner mattst88 at gmail.com
Thu Mar 3 16:10:25 UTC 2016


On Thu, Mar 3, 2016 at 7:21 AM, Iago Toral <itoral at igalia.com> wrote:
> On Wed, 2016-03-02 at 15:45 -0800, Matt Turner wrote:
>>    instructions in affected programs: 31535 -> 29966 (-4.98%)
>>    helped: 23
>>
>>    cycles in affected programs: 272648 -> 266022 (-2.43%)
>>    helped: 14
>>    HURT: 1
>>
>> The patch decreases the number of instructions in the two Unigine
>> programs by:
>>
>>  #1721: 4374 -> 4155 instructions (-5.01%)
>>  #1706: 3582 -> 3363 instructions (-6.11%)
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.h       |  2 ++
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 46 ++++++++++++++++++++++++++++++++
>>  2 files changed, 48 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
>> index 7446ca1..21c7813 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.h
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
>> @@ -272,6 +272,8 @@ public:
>>     void emit_percomp(const brw::fs_builder &bld, const fs_inst &inst,
>>                       unsigned wr_mask);
>>
>> +   bool optimize_extract_to_float(nir_alu_instr *instr,
>> +                                  const fs_reg &result);
>>     bool optimize_frontfacing_ternary(nir_alu_instr *instr,
>>                                       const fs_reg &result);
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> index db20c71..04e9b8f 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> @@ -500,6 +500,49 @@ fs_visitor::nir_emit_instr(nir_instr *instr)
>>     }
>>  }
>>
>> +/**
>> + * Recognizes a parent instruction of nir_op_extract_* and changes the type to
>> + * match instr.
>> + */
>> +bool
>> +fs_visitor::optimize_extract_to_float(nir_alu_instr *instr,
>> +                                      const fs_reg &result)
>> +{
>> +   if (!instr->src[0].src.is_ssa ||
>> +       !instr->src[0].src.ssa->parent_instr)
>> +      return false;
>> +
>> +   if (instr->src[0].src.ssa->parent_instr->type != nir_instr_type_alu)
>> +      return false;
>> +
>> +   nir_alu_instr *src0 =
>> +      nir_instr_as_alu(instr->src[0].src.ssa->parent_instr);
>> +
>> +   if (src0->op != nir_op_extract_u8 && src0->op != nir_op_extract_u16 &&
>> +       src0->op != nir_op_extract_i8 && src0->op != nir_op_extract_i16)
>> +      return false;
>> +
>> +   nir_const_value *element = nir_src_as_const_value(src0->src[1].src);
>> +   assert(element != NULL);
>> +
>> +   enum opcode extract_op;
>> +   if (src0->op == nir_op_extract_u16 || src0->op == nir_op_extract_i16) {
>> +      assert(element->u[0] <= 1);
>> +      extract_op = SHADER_OPCODE_EXTRACT_WORD;
>> +   } else {
>> +      assert(element->u[0] <= 3);
>> +      extract_op = SHADER_OPCODE_EXTRACT_BYTE;
>> +   }
>> +
>> +   fs_reg op0 = get_nir_src(src0->src[0].src);
>> +   op0.type = brw_type_for_nir_type(nir_op_infos[src0->op].input_types[0]);
>> +   op0 = offset(op0, bld, src0->src[0].swizzle[0]);
>> +
>> +   set_saturate(instr->dest.saturate,
>> +                bld.emit(extract_op, result, op0, brw_imm_ud(element->u[0])));
>
> So this relies on dead code elimination to remove the original extract
> opcode, right?

Exactly right.

> Series is:
> Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

Thanks!


More information about the mesa-dev mailing list