[Mesa-dev] [PATCH 3/3] i965/nir/opt_peephole_ffma: Bypass fusion if any operand of fadd and fmul is a const

Jason Ekstrand jason at jlekstrand.net
Tue Nov 10 09:47:00 PST 2015


On Tue, Nov 10, 2015 at 4:03 AM, Eduardo Lima Mitev <elima at igalia.com> wrote:
> I realized that patch 1/2 hasn't been reviewed, and this one didn't get
> R-b. Any objection to these two?

Go ahead

Reviewed-by: Jason Ekstrand <jason.ekstrand at intel.com>


> thanks,
> Eduardo
>
> On 10/23/2015 05:55 PM, Eduardo Lima Mitev wrote:
>> When both the fadd and the fmul have at least one operand that is a
>> constant used only once, the total number of instructions can be reduced
>> from 3 (1 ffma + 2 load_const) to 2 (1 fmul + 1 fadd), because the
>> constants will be propagated as immediate operands of fmul and fadd.
>>
>> This patch detects these situations and prevents fusing fmul+fadd into ffma.
>>
>> Shader-db results on i965 Haswell:
>>
>> total instructions in shared programs: 6235835 -> 6225895 (-0.16%)
>> instructions in affected programs:     1124094 -> 1114154 (-0.88%)
>> total loops in shared programs:        1979 -> 1979 (0.00%)
>> helped:                                7612
>> HURT:                                  843
>> GAINED:                                4
>> LOST:                                  0
>> ---
>>  .../drivers/dri/i965/brw_nir_opt_peephole_ffma.c   | 31 ++++++++++++++++++++++
>>  1 file changed, 31 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_nir_opt_peephole_ffma.c b/src/mesa/drivers/dri/i965/brw_nir_opt_peephole_ffma.c
>> index a8448e7..c7fc15a 100644
>> --- a/src/mesa/drivers/dri/i965/brw_nir_opt_peephole_ffma.c
>> +++ b/src/mesa/drivers/dri/i965/brw_nir_opt_peephole_ffma.c
>> @@ -133,6 +133,28 @@ get_mul_for_src(nir_alu_src *src, int num_components,
>>     return alu;
>>  }
>>
>> +/**
>> + * Given a list of (at least two) nir_alu_src's, tells if any of them is a
>> + * constant value and is used only once.
>> + */
>> +static bool
>> +any_alu_src_is_a_constant(nir_alu_src srcs[])
>> +{
>> +   for (unsigned i = 0; i < 2; i++) {
>> +      if (srcs[i].src.ssa->parent_instr->type == nir_instr_type_load_const) {
>> +         nir_load_const_instr *load_const =
>> +            nir_instr_as_load_const (srcs[i].src.ssa->parent_instr);
>> +
>> +         if (list_is_single(&load_const->def.uses) &&
>> +             list_empty(&load_const->def.if_uses)) {
>> +            return true;
>> +         }
>> +      }
>> +   }
>> +
>> +   return false;
>> +}
>> +
>>  static bool
>>  brw_nir_opt_peephole_ffma_block(nir_block *block, void *void_state)
>>  {
>> @@ -183,6 +205,15 @@ brw_nir_opt_peephole_ffma_block(nir_block *block, void *void_state)
>>        mul_src[0] = mul->src[0].src.ssa;
>>        mul_src[1] = mul->src[1].src.ssa;
>>
>> +      /* If at least one operand of the fmul and at least one operand of the
>> +       * fadd is a single-use constant, bypass the fusion: the constants will
>> +       * be propagated as immediate operands, potentially saving two
>> +       * load_const instructions. */
>> +      if (any_alu_src_is_a_constant(mul->src) &&
>> +          any_alu_src_is_a_constant(add->src)) {
>> +         continue;
>> +      }
>> +
>>        if (abs) {
>>           for (unsigned i = 0; i < 2; i++) {
>>              nir_alu_instr *abs = nir_alu_instr_create(state->mem_ctx,
>>
>
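
For readers without the NIR sources at hand, the bypass condition in the
quoted patch can be sketched as a standalone toy model. This is only an
illustration: struct toy_value and the toy_* functions are hypothetical
stand-ins, not NIR types or API, and the use count is a plain integer
rather than NIR's use lists.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an SSA value; NOT the real NIR types. */
struct toy_value {
   bool is_load_const;   /* produced by a load_const-like instruction */
   unsigned num_uses;    /* number of instructions reading this value */
};

/* True if either of the two sources is a constant that is read exactly once. */
static bool
toy_any_src_is_single_use_const(const struct toy_value *const srcs[2])
{
   for (size_t i = 0; i < 2; i++) {
      if (srcs[i]->is_load_const && srcs[i]->num_uses == 1)
         return true;
   }

   return false;
}

/* Mirrors the patch's condition: skip the fmul+fadd fusion only when both
 * the fmul and the fadd have such a single-use constant operand. */
static bool
toy_should_bypass_ffma(const struct toy_value *const mul_srcs[2],
                       const struct toy_value *const add_srcs[2])
{
   return toy_any_src_is_single_use_const(mul_srcs) &&
          toy_any_src_is_single_use_const(add_srcs);
}
```

In this model, an expression such as a * 2.0 + 1.0, where each constant has a
single use, would bypass the fusion, while the same expression with one of
the constants shared by another instruction would still be fused.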

