[Mesa-dev] [PATCH 05/12] nir/algebraic: Flag inexact optimizations

Francisco Jerez currojerez at riseup.net
Wed Mar 23 19:22:21 UTC 2016


Jason Ekstrand <jason at jlekstrand.net> writes:

> On Mar 23, 2016 2:07 AM, "Francisco Jerez" <currojerez at riseup.net> wrote:
>>
>> Jason Ekstrand <jason at jlekstrand.net> writes:
>>
>> > Many of our optimizations, while great for cutting shaders down to size,
>> > aren't really precision-safe.  This commit tries to flag all of the
>> > inexact floating-point optimizations so they don't get run on values
> that
>> > are flagged "exact".  It's a bit conservative and maybe flags some safe
>> > optimizations as unsafe but that's better than missing one.
>> > ---
>> >  src/compiler/nir/nir_opt_algebraic.py | 156
> +++++++++++++++++-----------------
>> >  1 file changed, 78 insertions(+), 78 deletions(-)
>> >
>> > diff --git a/src/compiler/nir/nir_opt_algebraic.py
> b/src/compiler/nir/nir_opt_algebraic.py
>> > index 3e7ea06..b229f03 100644
>> > --- a/src/compiler/nir/nir_opt_algebraic.py
>> > +++ b/src/compiler/nir/nir_opt_algebraic.py
>> > @@ -61,19 +61,19 @@ optimizations = [
>> >     (('fabs', ('fneg', a)), ('fabs', a)),
>> >     (('iabs', ('iabs', a)), ('iabs', a)),
>> >     (('iabs', ('ineg', a)), ('iabs', a)),
>> > -   (('fadd', a, 0.0), a),
>> > +   (('~fadd', a, 0.0), a),
>> >     (('iadd', a, 0), a),
>> >     (('usadd_4x8', a, 0), a),
>> >     (('usadd_4x8', a, ~0), ~0),
>> > -   (('fadd', ('fmul', a, b), ('fmul', a, c)), ('fmul', a, ('fadd', b,
> c))),
>> > +   (('~fadd', ('fmul', a, b), ('fmul', a, c)), ('fmul', a, ('fadd', b,
> c))),
>> >     (('iadd', ('imul', a, b), ('imul', a, c)), ('imul', a, ('iadd', b,
> c))),
>> > -   (('fadd', ('fneg', a), a), 0.0),
>> > +   (('~fadd', ('fneg', a), a), 0.0),
>> >     (('iadd', ('ineg', a), a), 0),
>> >     (('iadd', ('ineg', a), ('iadd', a, b)), b),
>> >     (('iadd', a, ('iadd', ('ineg', a), b)), b),
>> > -   (('fadd', ('fneg', a), ('fadd', a, b)), b),
>> > -   (('fadd', a, ('fadd', ('fneg', a), b)), b),
>> > -   (('fmul', a, 0.0), 0.0),
>> > +   (('~fadd', ('fneg', a), ('fadd', a, b)), b),
>> > +   (('~fadd', a, ('fadd', ('fneg', a), b)), b),
>> > +   (('~fmul', a, 0.0), 0.0),
>> >     (('imul', a, 0), 0),
>> >     (('umul_unorm_4x8', a, 0), 0),
>> >     (('umul_unorm_4x8', a, ~0), a),
>> > @@ -81,33 +81,33 @@ optimizations = [
>> >     (('imul', a, 1), a),
>> >     (('fmul', a, -1.0), ('fneg', a)),
>> >     (('imul', a, -1), ('ineg', a)),
>> > -   (('ffma', 0.0, a, b), b),
>> > -   (('ffma', a, 0.0, b), b),
>> > -   (('ffma', a, b, 0.0), ('fmul', a, b)),
>> > -   (('ffma', a, 1.0, b), ('fadd', a, b)),
>> > -   (('ffma', 1.0, a, b), ('fadd', a, b)),
>>
>> I think 'ffma a 1 b'/'ffma 1 a b' -> 'fadd a b' could be made safe.
>
> I think that should be OK.
>
>> > -   (('flrp', a, b, 0.0), a),
>> > -   (('flrp', a, b, 1.0), b),
>> > -   (('flrp', a, a, b), a),
>> > -   (('flrp', 0.0, a, b), ('fmul', a, b)),
>> > +   (('~ffma', 0.0, a, b), b),
>> > +   (('~ffma', a, 0.0, b), b),
>> > +   (('~ffma', a, b, 0.0), ('fmul', a, b)),
>> > +   (('~ffma', a, 1.0, b), ('fadd', a, b)),
>> > +   (('~ffma', 1.0, a, b), ('fadd', a, b)),
>> > +   (('~flrp', a, b, 0.0), a),
>> > +   (('~flrp', a, b, 1.0), b),
>> > +   (('~flrp', a, a, b), a),
>> > +   (('~flrp', 0.0, a, b), ('fmul', a, b)),
>> >     (('flrp', a, b, c), ('fadd', ('fmul', c, ('fsub', b, a)), a),
> 'options->lower_flrp'),
>> >     (('ffract', a), ('fsub', a, ('ffloor', a)),
> 'options->lower_ffract'),
>> > -   (('fadd', ('fmul', a, ('fadd', 1.0, ('fneg', c))), ('fmul', b, c)),
> ('flrp', a, b, c), '!options->lower_flrp'),
>> > -   (('fadd', a, ('fmul', c, ('fadd', b, ('fneg', a)))), ('flrp', a, b,
> c), '!options->lower_flrp'),
>> > +   (('~fadd', ('fmul', a, ('fadd', 1.0, ('fneg', c))), ('fmul', b,
> c)), ('flrp', a, b, c), '!options->lower_flrp'),
>> > +   (('~fadd', a, ('fmul', c, ('fadd', b, ('fneg', a)))), ('flrp', a,
> b, c), '!options->lower_flrp'),
>> >     (('ffma', a, b, c), ('fadd', ('fmul', a, b), c),
> 'options->lower_ffma'),
>> > -   (('fadd', ('fmul', a, b), c), ('ffma', a, b, c),
> '!options->lower_ffma'),
>> > +   (('~fadd', ('fmul', a, b), c), ('ffma', a, b, c),
> '!options->lower_ffma'),
>> >     # Comparison simplifications
>> > -   (('inot', ('flt', a, b)), ('fge', a, b)),
>> > -   (('inot', ('fge', a, b)), ('flt', a, b)),
>> > -   (('inot', ('feq', a, b)), ('fne', a, b)),
>> > -   (('inot', ('fne', a, b)), ('feq', a, b)),
>> > -   (('inot', ('ilt', a, b)), ('ige', a, b)),
>> > -   (('inot', ('ige', a, b)), ('ilt', a, b)),
>> > -   (('inot', ('ieq', a, b)), ('ine', a, b)),
>> > -   (('inot', ('ine', a, b)), ('ieq', a, b)),
>>
>> What's unsafe about the four integer comparison simplifications above?
>
> Right.  I got too excited.  I'll leave those as safe.
>
>> > -   (('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
>>
>> This one seems mostly safe too.
>>
>> > -   (('bcsel', ('flt', a, b), a, b), ('fmin', a, b)),
>> > -   (('bcsel', ('flt', a, b), b, a), ('fmax', a, b)),
>>
>> What are the semantics of fmin/fmax?  Are they defined as in GLSL with
>> min(NaN, x) = NaN, min(x, NaN) = x?  Or the other way around?  Or is it
>> defined to be commutative?  In the first case the code above would
>> indeed be unsafe *but* the related simplification:
>
> I dug up the GLSL spec and it says the following for min:
>
> Returns y if y < x, otherwise it returns x
>
> so min(NaN, y) = y but min(x, NaN) = NaN
>
> If we just change the optimizations to match the spec better, they would be
> safe.  Then we could add the non-safe ones as non-safe.
>
They wouldn't, you still need the two-character change I pointed out
earlier (marked with '^^^^^' now) to make it safe.

> Sadly, I think our hardware implements the DX behaviour of "give me the
> non-NaN value"
>
Our hardware can do either, CMPN and SEL.l/ge gives you the non-NaN
value while CMP and SEL.le/g gives you GLSL-like semantics.  The
back-end uses the wrong ones almost universally since
3b7f683f3bbbd93e417a6f42ec7c609465be49de except on Gen4-5 which use CMP.

                                                   vvvv
>> |     (('bcsel', ('flt', a, b), a, b), ('fmin', b, a)),
                                                   ^^^^^
>> |     (('bcsel', ('flt', a, b), b, a), ('fmax', a, b)),
>>
>> would be completely safe.
>>
>> > +   (('~inot', ('flt', a, b)), ('fge', a, b)),
>> > +   (('~inot', ('fge', a, b)), ('flt', a, b)),
>> > +   (('~inot', ('feq', a, b)), ('fne', a, b)),
>> > +   (('~inot', ('fne', a, b)), ('feq', a, b)),
>> > +   (('~inot', ('ilt', a, b)), ('ige', a, b)),
>> > +   (('~inot', ('ige', a, b)), ('ilt', a, b)),
>> > +   (('~inot', ('ieq', a, b)), ('ine', a, b)),
>> > +   (('~inot', ('ine', a, b)), ('ieq', a, b)),
>> > +   (('~fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
>> > +   (('~bcsel', ('flt', a, b), a, b), ('fmin', a, b)),
>> > +   (('~bcsel', ('flt', a, b), b, a), ('fmax', a, b)),
>> >     (('bcsel', ('inot', 'a at bool'), b, c), ('bcsel', a, c, b)),
>> >     (('bcsel', a, ('bcsel', a, b, c), d), ('bcsel', a, b, d)),
>> >     (('fmin', a, a), a),
>> > @@ -116,17 +116,17 @@ optimizations = [
>> >     (('imax', a, a), a),
>> >     (('umin', a, a), a),
>> >     (('umax', a, a), a),
>> > -   (('fmin', ('fmax', a, 0.0), 1.0), ('fsat', a),
> '!options->lower_fsat'),
>> > -   (('fmax', ('fmin', a, 1.0), 0.0), ('fsat', a),
> '!options->lower_fsat'),
>> > +   (('~fmin', ('fmax', a, 0.0), 1.0), ('fsat', a),
> '!options->lower_fsat'),
>> > +   (('~fmax', ('fmin', a, 1.0), 0.0), ('fsat', a),
> '!options->lower_fsat'),
>> >     (('fsat', a), ('fmin', ('fmax', a, 0.0), 1.0),
> 'options->lower_fsat'),
>> >     (('fsat', ('fsat', a)), ('fsat', a)),
>> > -   (('fsat', ('fmin', a, 1.0)), ('fsat', a)),
>> > -   (('fsat', ('fmax', a, 0.0)), ('fsat', a)),
>> > +   (('~fsat', ('fmin', a, 1.0)), ('fsat', a)),
>> > +   (('~fsat', ('fmax', a, 0.0)), ('fsat', a)),
>> >     (('fmin', ('fmax', ('fmin', ('fmax', a, 0.0), 1.0), 0.0), 1.0),
> ('fmin', ('fmax', a, 0.0), 1.0)),
>> > -   (('ior', ('flt', a, b), ('flt', a, c)), ('flt', a, ('fmax', b, c))),
>> > -   (('ior', ('flt', a, c), ('flt', b, c)), ('flt', ('fmin', a, b), c)),
>> > -   (('ior', ('fge', a, b), ('fge', a, c)), ('fge', a, ('fmin', b, c))),
>> > -   (('ior', ('fge', a, c), ('fge', b, c)), ('fge', ('fmax', a, b), c)),
>>
>> These would be safe if fmin/fmax were commutative, which I don't think
>> is the case?
>
> They are not commutative :-(
>
>> > +   (('~ior', ('flt', a, b), ('flt', a, c)), ('flt', a, ('fmax', b,
> c))),
>> > +   (('~ior', ('flt', a, c), ('flt', b, c)), ('flt', ('fmin', a, b),
> c)),
>> > +   (('~ior', ('fge', a, b), ('fge', a, c)), ('fge', a, ('fmin', b,
> c))),
>> > +   (('~ior', ('fge', a, c), ('fge', b, c)), ('fge', ('fmax', a, b),
> c)),
>> >     (('slt', a, b), ('b2f', ('flt', a, b)), 'options->lower_scmp'),
>> >     (('sge', a, b), ('b2f', ('fge', a, b)), 'options->lower_scmp'),
>> >     (('seq', a, b), ('b2f', ('feq', a, b)), 'options->lower_scmp'),
>> > @@ -150,15 +150,15 @@ optimizations = [
>> >     (('ult', a, a), False),
>> >     (('uge', a, a), True),
>> >     # Logical and bit operations
>> > -   (('fand', a, 0.0), 0.0),
>> > +   (('~fand', a, 0.0), 0.0),
>>
>> This is safe if fand(x, y) is defined as (x != 0 && y != 0 ? 1 : 0).
>> Could you remind me how it is defined?
>
> Yes, that is the definition.
>
>> >     (('iand', a, a), a),
>> >     (('iand', a, ~0), a),
>> >     (('iand', a, 0), 0),
>> >     (('ior', a, a), a),
>> >     (('ior', a, 0), a),
>> > -   (('fxor', a, a), 0.0),
>> > +   (('~fxor', a, a), 0.0),
>>
>> Same goes here.
>
> Right.
>
>> >     (('ixor', a, a), 0),
>> > -   (('fxor', a, 0.0), a),
>> > +   (('~fxor', a, 0.0), a),
>> >     (('ixor', a, 0), a),
>> >     (('inot', ('inot', a)), a),
>> >     # DeMorgan's Laws
>> > @@ -174,35 +174,35 @@ optimizations = [
>> >     (('iand', 0xff, ('ushr', a, 24)), ('ushr', a, 24)),
>> >     (('iand', 0xffff, ('ushr', a, 16)), ('ushr', a, 16)),
>> >     # Exponential/logarithmic identities
>> > -   (('fexp2', ('flog2', a)), a), # 2^lg2(a) = a
>> > -   (('flog2', ('fexp2', a)), a), # lg2(2^a) = a
>> > +   (('~fexp2', ('flog2', a)), a), # 2^lg2(a) = a
>> > +   (('~flog2', ('fexp2', a)), a), # lg2(2^a) = a
>> >     (('fpow', a, b), ('fexp2', ('fmul', ('flog2', a), b)),
> 'options->lower_fpow'), # a^b = 2^(lg2(a)*b)
>> > -   (('fexp2', ('fmul', ('flog2', a), b)), ('fpow', a, b),
> '!options->lower_fpow'), # 2^(lg2(a)*b) = a^b
>> > -   (('fexp2', ('fadd', ('fmul', ('flog2', a), b), ('fmul', ('flog2',
> c), d))),
>> > -    ('fmul', ('fpow', a, b), ('fpow', c, d)), '!options->lower_fpow'),
> # 2^(lg2(a) * b + lg2(c) + d) = a^b * c^d
>> > -   (('fpow', a, 1.0), a),
>> > -   (('fpow', a, 2.0), ('fmul', a, a)),
>> > -   (('fpow', a, 4.0), ('fmul', ('fmul', a, a), ('fmul', a, a))),
>> > -   (('fpow', 2.0, a), ('fexp2', a)),
>> > -   (('fpow', ('fpow', a, 2.2), 0.454545), a),
>> > -   (('fpow', ('fabs', ('fpow', a, 2.2)), 0.454545), ('fabs', a)),
>> > -   (('fsqrt', ('fexp2', a)), ('fexp2', ('fmul', 0.5, a))),
>> > -   (('frcp', ('fexp2', a)), ('fexp2', ('fneg', a))),
>> > -   (('frsq', ('fexp2', a)), ('fexp2', ('fmul', -0.5, a))),
>> > -   (('flog2', ('fsqrt', a)), ('fmul', 0.5, ('flog2', a))),
>> > -   (('flog2', ('frcp', a)), ('fneg', ('flog2', a))),
>> > -   (('flog2', ('frsq', a)), ('fmul', -0.5, ('flog2', a))),
>> > -   (('flog2', ('fpow', a, b)), ('fmul', b, ('flog2', a))),
>> > -   (('fadd', ('flog2', a), ('flog2', b)), ('flog2', ('fmul', a, b))),
>> > -   (('fadd', ('flog2', a), ('fneg', ('flog2', b))), ('flog2', ('fdiv',
> a, b))),
>> > -   (('fmul', ('fexp2', a), ('fexp2', b)), ('fexp2', ('fadd', a, b))),
>> > +   (('~fexp2', ('fmul', ('flog2', a), b)), ('fpow', a, b),
> '!options->lower_fpow'), # 2^(lg2(a)*b) = a^b
>> > +   (('~fexp2', ('fadd', ('fmul', ('flog2', a), b), ('fmul', ('flog2',
> c), d))),
>> > +    ('~fmul', ('fpow', a, b), ('fpow', c, d)),
> '!options->lower_fpow'), # 2^(lg2(a) * b + lg2(c) + d) = a^b * c^d
>> > +   (('~fpow', a, 1.0), a),
>> > +   (('~fpow', a, 2.0), ('fmul', a, a)),
>> > +   (('~fpow', a, 4.0), ('fmul', ('fmul', a, a), ('fmul', a, a))),
>> > +   (('~fpow', 2.0, a), ('fexp2', a)),
>> > +   (('~fpow', ('fpow', a, 2.2), 0.454545), a),
>> > +   (('~fpow', ('fabs', ('fpow', a, 2.2)), 0.454545), ('fabs', a)),
>> > +   (('~fsqrt', ('fexp2', a)), ('fexp2', ('fmul', 0.5, a))),
>> > +   (('~frcp', ('fexp2', a)), ('fexp2', ('fneg', a))),
>> > +   (('~frsq', ('fexp2', a)), ('fexp2', ('fmul', -0.5, a))),
>> > +   (('~flog2', ('fsqrt', a)), ('fmul', 0.5, ('flog2', a))),
>> > +   (('~flog2', ('frcp', a)), ('fneg', ('flog2', a))),
>> > +   (('~flog2', ('frsq', a)), ('fmul', -0.5, ('flog2', a))),
>> > +   (('~flog2', ('fpow', a, b)), ('fmul', b, ('flog2', a))),
>> > +   (('~fadd', ('flog2', a), ('flog2', b)), ('flog2', ('fmul', a, b))),
>> > +   (('~fadd', ('flog2', a), ('fneg', ('flog2', b))), ('flog2',
> ('fdiv', a, b))),
>> > +   (('~fmul', ('fexp2', a), ('fexp2', b)), ('fexp2', ('fadd', a, b))),
>> >     # Division and reciprocal
>> > -   (('fdiv', 1.0, a), ('frcp', a)),
>> > -   (('fdiv', a, b), ('fmul', a, ('frcp', b)), 'options->lower_fdiv'),
>>
>> This one is safe and I guess necessary if the back-end requires fdiv to
>> be lowered?
>
> Thanks. I must have gotten too eager to flag things.
>
>> > -   (('frcp', ('frcp', a)), a),
>> > -   (('frcp', ('fsqrt', a)), ('frsq', a)),
>> > -   (('fsqrt', a), ('frcp', ('frsq', a)), 'options->lower_fsqrt'),
>>
>> Same here.
>
> Yup
>
>> > -   (('frcp', ('frsq', a)), ('fsqrt', a), '!options->lower_fsqrt'),
>> > +   (('~fdiv', 1.0, a), ('frcp', a)),
>> > +   (('~fdiv', a, b), ('fmul', a, ('frcp', b)), 'options->lower_fdiv'),
>> > +   (('~frcp', ('frcp', a)), a),
>> > +   (('~frcp', ('fsqrt', a)), ('frsq', a)),
>> > +   (('~fsqrt', a), ('frcp', ('frsq', a)), 'options->lower_fsqrt'),
>> > +   (('~frcp', ('frsq', a)), ('fsqrt', a), '!options->lower_fsqrt'),
>> >     # Boolean simplifications
>> >     (('ieq', 'a at bool', True), a),
>> >     (('ine', 'a at bool', True), ('inot', a)),
>> > @@ -221,8 +221,8 @@ optimizations = [
>> >
>> >     # Conversions
>> >     (('i2b', ('b2i', a)), a),
>> > -   (('f2i', ('ftrunc', a)), ('f2i', a)),
>> > -   (('f2u', ('ftrunc', a)), ('f2u', a)),
>> > +   (('~f2i', ('ftrunc', a)), ('f2i', a)),
>> > +   (('~f2u', ('ftrunc', a)), ('f2u', a)),
>> >
>> These seem safe to me.
> I think they probably are.  I think there may be a problem when the number
> of mantissa bits is bigger than the number of integer bits.  In that case,
> the out-of-bounds behaviour may get interesting.  However, we're not
> handling that case yet so I think it's safe.
>
Hm, I don't think I understand the problem, if the integer part of 'a'
is outside the representable range of the integer type, is that not
going to be the case still after 'ftrunc()'?

>> >     # Byte extraction
>> >     (('ushr', a, 24), ('extract_u8', a, 3),
> '!options->lower_extract_byte'),
>> > @@ -235,17 +235,17 @@ optimizations = [
>> >     (('iand', 0xffff, a), ('extract_u16', a, 0),
> '!options->lower_extract_word'),
>> >
>> >     # Subtracts
>> > -   (('fsub', a, ('fsub', 0.0, b)), ('fadd', a, b)),
>> > +   (('~fsub', a, ('fsub', 0.0, b)), ('fadd', a, b)),
>> >     (('isub', a, ('isub', 0, b)), ('iadd', a, b)),
>> >     (('ussub_4x8', a, 0), a),
>> >     (('ussub_4x8', a, ~0), 0),
>> > -   (('fsub', a, b), ('fadd', a, ('fneg', b)), 'options->lower_sub'),
>> > +   (('~fsub', a, b), ('fadd', a, ('fneg', b)), 'options->lower_sub'),
>>
>> Required for lowering.
>
> Yup
>
>> >     (('isub', a, b), ('iadd', a, ('ineg', b)), 'options->lower_sub'),
>> > -   (('fneg', a), ('fsub', 0.0, a), 'options->lower_negate'),
>> > +   (('~fneg', a), ('fsub', 0.0, a), 'options->lower_negate'),
>>
>> Same here.
>
> Yup
>
>> >     (('ineg', a), ('isub', 0, a), 'options->lower_negate'),
>> > -   (('fadd', a, ('fsub', 0.0, b)), ('fsub', a, b)),
>> > +   (('~fadd', a, ('fsub', 0.0, b)), ('fsub', a, b)),
>> >     (('iadd', a, ('isub', 0, b)), ('isub', a, b)),
>> > -   (('fabs', ('fsub', 0.0, a)), ('fabs', a)),
>> > +   (('~fabs', ('fsub', 0.0, a)), ('fabs', a)),
>>
>> This one seems safe.
>
> Depends on whether or not a + 0.0 == a.  I think it should but glennk
> seemed to think it wasn't.  I'd rather be on the safe side for now.

Heh, the reason I pointed that out is that because of the fabs() around
the subtraction it doesn't actually matter whether '0 - a == -a' or not.
The only case where the equality may not hold is for a=+0, because
'(+0) - (+0)' might be +0 or -0 depending on the rounding behaviour of the
implementation.  However the end result is the same because fabs(-0) ==
fabs(+0).

>
>> >     (('iabs', ('isub', 0, a)), ('iabs', a)),
>> >
>> >     # Misc. lowering
>> > @@ -372,10 +372,10 @@ for op in ['flt', 'fge', 'feq', 'fne',
>> >  # they help code generation but do not necessarily produce code that is
>> >  # more easily optimizable.
>> >  late_optimizations = [
>> > -   (('flt', ('fadd', a, b), 0.0), ('flt', a, ('fneg', b))),
>>
>> This one also seems safe,
> I've come to the conclusion that it isn't.  If you have a -inf and inf,
> then a+b is going to be one of NaN, inf, or 0.  In any case the expression
> on the left is commutative while the one on the right isn't.
>
Yeah, well, they may not be equivalent but only if the implementation
doesn't do IEEE-compliant NaN handling and it doesn't flush them to zero
either -- In either of those cases the LHS evaluates to false while the
RHS does too because the two infs have opposite signs.  It might be
useful to add some sort of compiler option for the back-end to specify
its preferred NaN handling rules?

>> > -   (('fge', ('fadd', a, b), 0.0), ('fge', a, ('fneg', b))),
>> > -   (('feq', ('fadd', a, b), 0.0), ('feq', a, ('fneg', b))),
>> > -   (('fne', ('fadd', a, b), 0.0), ('fne', a, ('fneg', b))),
>>
>> ...but these other ones are indeed unsafe for some sign combinations of
>> a and b when they are not finite.
>>
>> > +   (('~flt', ('fadd', a, b), 0.0), ('flt', a, ('fneg', b))),
>> > +   (('~fge', ('fadd', a, b), 0.0), ('fge', a, ('fneg', b))),
>> > +   (('~feq', ('fadd', a, b), 0.0), ('feq', a, ('fneg', b))),
>> > +   (('~fne', ('fadd', a, b), 0.0), ('fne', a, ('fneg', b))),
>> >     (('fdot2', a, b), ('fdot_replicated2', a, b),
> 'options->fdot_replicates'),
>> >     (('fdot3', a, b), ('fdot_replicated3', a, b),
> 'options->fdot_replicates'),
>> >     (('fdot4', a, b), ('fdot_replicated4', a, b),
> 'options->fdot_replicates'),
>>
>> I've also looked through the remaining algebraic optimizations in this
>> pass, and things look safe except for the ones you have already marked
>> unsafe in this patch -- I'll wait until you answer the questions I
>> pointed out before I send you a R-b for it though.
>>
>> > --
>> > 2.5.0.400.gff86faf
>> >
>> > _______________________________________________
>> > mesa-dev mailing list
>> > mesa-dev at lists.freedesktop.org
>> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160323/096af589/attachment-0001.sig>


More information about the mesa-dev mailing list