[Mesa-dev] [PATCH 7/9] nir/peephole_ffma: Be less agressive about fusing multiply-adds

Jason Ekstrand jason at jlekstrand.net
Tue Mar 24 15:45:38 PDT 2015


First off, these are all heuristics and we need to keep that in mind.

On Tue, Mar 24, 2015 at 3:39 PM, Connor Abbott <cwabbott0 at gmail.com> wrote:
> I think this is being both too aggressive and not aggressive enough.
> First, the too aggressive part. Imagine if we have something like:
>
> a = mul b, c;
> d = add a, 1.0;
> e = add a, 2.0;
> f = add a, 3.0;
>
> In this case, we don't want to turn this into a series of 3 mad's
> since the 3 load immediate instructions are going to hurt us more than
> the elimination of the mul. When the only thing using the mul is a
> bunch of add's, it's only profitable when there's at most one distinct
> immediate that isn't used by other MAD's -- any more and we gain
> instructions.

Yes and no.  Thanks to Matt's constant-combine stuff, if the constants
are used elsewhere we can consider them to be pretty-much free.  Also,
once we get VF constants working, things will get even more free.
It's never going to be 100% free but we are probably ok fusing them.
Of course, since there's only one mul, maybe not.  Heursistics!

> Also, I think that we might still want to fuse the multiply and the
> add even if the mul is used somewhere else, since at least it gives
> the scheduler some extra freedom to hide latency -- as long as we
> don't have to emit any load immediate instructions. Of course, this is
> all very backend-specific so at this point it would make sense to move
> this to an i965 pass and leave in the general ffma peephole for
> everyone else without these weird restrictions.

Yes, there are times where we probably want to fuse anyway.  Again,
it's not 100% clear when that is, and this is certainly better than we
were doing before.
--Jason

> On Mon, Mar 23, 2015 at 11:13 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>> shader-db results for fragment shaders on Haswell:
>> total instructions in shared programs: 4395688 -> 4389623 (-0.14%)
>> instructions in affected programs:     355876 -> 349811 (-1.70%)
>> helped:                                1455
>> HURT:                                  14
>> GAINED:                                5
>> LOST:                                  0
>> ---
>>  src/glsl/nir/nir_opt_peephole_ffma.c | 41 ++++++++++++++++++++++++++++++++++++
>>  1 file changed, 41 insertions(+)
>>
>> diff --git a/src/glsl/nir/nir_opt_peephole_ffma.c b/src/glsl/nir/nir_opt_peephole_ffma.c
>> index 4674505..b875bb6 100644
>> --- a/src/glsl/nir/nir_opt_peephole_ffma.c
>> +++ b/src/glsl/nir/nir_opt_peephole_ffma.c
>> @@ -38,6 +38,41 @@ struct peephole_ffma_state {
>>     bool progress;
>>  };
>>
>> +static inline bool
>> +are_all_uses_fadd(nir_ssa_def *def)
>> +{
>> +   if (def->if_uses->entries > 0)
>> +      return false;
>> +
>> +   struct set_entry *use_iter;
>> +   set_foreach(def->uses, use_iter) {
>> +      nir_instr *use_instr = (nir_instr *)use_iter->key;
>> +
>> +      if (use_instr->type != nir_instr_type_alu)
>> +         return false;
>> +
>> +      nir_alu_instr *use_alu = nir_instr_as_alu(use_instr);
>> +      switch (use_alu->op) {
>> +      case nir_op_fadd:
>> +         break; /* This one's ok */
>> +
>> +      case nir_op_imov:
>> +      case nir_op_fmov:
>> +      case nir_op_fneg:
>> +      case nir_op_fabs:
>> +         assert(use_alu->dest.dest.is_ssa);
>> +         if (!are_all_uses_fadd(&use_alu->dest.dest.ssa))
>> +            return false;
>> +         break;
>> +
>> +      default:
>> +         return false;
>> +      }
>> +   }
>> +
>> +   return true;
>> +}
>> +
>>  static inline nir_alu_instr *
>>  get_mul_for_src(nir_alu_src *src, uint8_t swizzle[4], bool *negate, bool *abs)
>>  {
>> @@ -66,6 +101,12 @@ get_mul_for_src(nir_alu_src *src, uint8_t swizzle[4], bool *negate, bool *abs)
>>        break;
>>
>>     case nir_op_fmul:
>> +      /* Only absorbe a fmul into a ffma if the fmul is is only used in fadd
>> +       * operations.  This prevents us from being too agressive with our
>> +       * fusing which can actually lead to more instructions.
>> +       */
>> +      if (!are_all_uses_fadd(&alu->dest.dest.ssa))
>> +         return NULL;
>>        break;
>>
>>     default:
>> --
>> 2.3.3
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


More information about the mesa-dev mailing list