[Mesa-dev] [PATCH 4/9] nir: Move the compare-with-zero optimizations to the late section

Mon Mar 23 20:43:42 PDT 2015

On Mon, Mar 23, 2015 at 8:34 PM, Matt Turner <mattst88 at gmail.com> wrote:
> On Mon, Mar 23, 2015 at 8:13 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>> total instructions in shared programs: 4422307 -> 4422363 (0.00%)
>> instructions in affected programs:     4230 -> 4286 (1.32%)
>> helped:                                0
>> HURT:                                  12
>>
>> While this does hurt some things, the losses are minor and it prevents the
>> compare-with-zero optimization from fighting with ffma, which is much more
>> important.
>
> Is it actually "fighting" (i.e., undoing the other pass' work) or just
> preventing some ffmas from being generated?
>
> If we did have something that would be recognized by both these and
> the ffma pattern, it'd look like
>
> fge(fadd(a, fmul(b, c)), 0.0)
>
> which we could turn into
>
> fge(ffma(a, b, c), 0.0) if ffma runs first; or
> fge(a, fneg(fmul(b, c))) otherwise
>
> I guess the first one is better for i965, since we can do that in one
> instruction. In fact, maybe we don't want to do these optimizations at
> all? I'm kind of surprised that it hurts.

Right.  In one sense it doesn't help anything because we can do a
compare with zero for free in i965.  However, losing it does hurt
quite a bit in the case where the optimization allows us to remove the
add instruction.  The problem is when the add is part of a potential
ffma, in which case pulling the add's operands into the comparison keeps
the more optimized ffma peephole from actually converting to an ffma.
In this case we keep both the add and the multiply even though we could
have done it with an ffma and a compare with zero.
--Jason
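
To make the ordering problem concrete, here is a rough Python sketch.  It
is a toy model, not Mesa code: the helpers rewrite, cmp_zero and fuse_ffma
are hypothetical names, expressions are nested tuples in the style of
nir_opt_algebraic.py, and it assumes ffma(x, y, z) means x * y + z.

```python
# Toy model only; none of these helpers exist in Mesa.

def rewrite(expr, rule):
    """Apply `rule` bottom-up to every node of a tuple expression."""
    if isinstance(expr, tuple):
        expr = (expr[0],) + tuple(rewrite(e, rule) for e in expr[1:])
    return rule(expr)

def cmp_zero(expr):
    """cmp(fadd(a, b), 0.0) -> cmp(a, fneg(b)) for flt/fge/feq/fne."""
    if (isinstance(expr, tuple) and len(expr) == 3
            and expr[0] in ('flt', 'fge', 'feq', 'fne')
            and isinstance(expr[1], tuple) and expr[1][0] == 'fadd'
            and expr[2] == 0.0):
        _, a, b = expr[1]
        return (expr[0], a, ('fneg', b))
    return expr

def fuse_ffma(expr):
    """fadd with an fmul operand -> ffma (assumed to mean x * y + z)."""
    if isinstance(expr, tuple) and expr[0] == 'fadd':
        _, x, y = expr
        if isinstance(y, tuple) and y[0] == 'fmul':
            return ('ffma', y[1], y[2], x)
        if isinstance(x, tuple) and x[0] == 'fmul':
            return ('ffma', x[1], x[2], y)
    return expr

expr = ('fge', ('fadd', 'a', ('fmul', 'b', 'c')), 0.0)

# Compare-with-zero first: the fadd is consumed, so fuse_ffma finds nothing.
cmp_first = rewrite(rewrite(expr, cmp_zero), fuse_ffma)

# ffma first: the fadd is gone, so cmp_zero no longer matches.
ffma_first = rewrite(rewrite(expr, fuse_ffma), cmp_zero)

print(cmp_first)
print(ffma_first)
```

In this sketch, whichever rule runs first removes the fadd that the other
rule needs, so only one of the two rewrites can ever apply to the
expression.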

>> ---
>>  src/glsl/nir/nir_opt_algebraic.py | 11 ++++-------
>>  1 file changed, 4 insertions(+), 7 deletions(-)
>>
>> diff --git a/src/glsl/nir/nir_opt_algebraic.py b/src/glsl/nir/nir_opt_algebraic.py
>> index 84d4c72..cfb1429 100644
>> --- a/src/glsl/nir/nir_opt_algebraic.py
>> +++ b/src/glsl/nir/nir_opt_algebraic.py
>> @@ -82,10 +82,6 @@ optimizations = [
>>     (('inot', ('fge', a, b)), ('flt', a, b)),
>>     (('inot', ('ilt', a, b)), ('ige', a, b)),
>>     (('inot', ('ige', a, b)), ('ilt', a, b)),
>> -   (('flt', ('fadd', a, b), 0.0), ('flt', a, ('fneg', b))),
>> -   (('fge', ('fadd', a, b), 0.0), ('fge', a, ('fneg', b))),
>> -   (('feq', ('fadd', a, b), 0.0), ('feq', a, ('fneg', b))),
>> -   (('fne', ('fadd', a, b), 0.0), ('fne', a, ('fneg', b))),
>>     (('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
>>     (('bcsel', ('flt', a, b), a, b), ('fmin', a, b)),
>>     (('bcsel', ('flt', a, b), b, a), ('fmax', a, b)),
>> @@ -163,9 +159,6 @@ optimizations = [
>>     (('iadd', a, ('isub', 0, b)), ('isub', a, b)),
>>     (('fabs', ('fsub', 0.0, a)), ('fabs', a)),
>>     (('iabs', ('isub', 0, a)), ('iabs', a)),
>> -
>> -# This one may not be exact
>> -   (('feq', ('fadd', a, b), 0.0), ('feq', a, ('fneg', b))),
>
> Separate commit?
>
>>  ]
>>
>>  # Add optimizations to handle the case where the result of a ternary is
>> @@ -194,6 +187,10 @@ for op in ['flt', 'fge', 'feq', 'fne',
>>  # they help code generation but do not necessarily produce code that is
>>  # more easily optimizable.
>>  late_optimizations = [
>> +   (('flt', ('fadd', a, b), 0.0), ('flt', a, ('fneg', b))),
>> +   (('fge', ('fadd', a, b), 0.0), ('fge', a, ('fneg', b))),
>> +   (('feq', ('fadd', a, b), 0.0), ('feq', a, ('fneg', b))),
>> +   (('fne', ('fadd', a, b), 0.0), ('fne', a, ('fneg', b))),
>>  ]
>>
>>  print nir_algebraic.AlgebraicPass("nir_opt_algebraic", optimizations).render()
>> --
>> 2.3.3