[Mesa-dev] [PATCH 4/9] nir: Move the compare-with-zero optimizations to the late section

Matt Turner mattst88 at gmail.com
Tue Mar 31 11:04:26 PDT 2015


On Mon, Mar 23, 2015 at 8:43 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
> On Mon, Mar 23, 2015 at 8:34 PM, Matt Turner <mattst88 at gmail.com> wrote:
>> On Mon, Mar 23, 2015 at 8:13 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>>> total instructions in shared programs: 4422307 -> 4422363 (0.00%)
>>> instructions in affected programs:     4230 -> 4286 (1.32%)
>>> helped:                                0
>>> HURT:                                  12
>>>
>>> While this does hurt some things, the losses are minor and it prevents the
>>> compare-with-zero optimization from fighting with ffma which is much more
>>> important.
>>
>> Is it actually "fighting" (i.e., undoing the other pass' work) or just
>> preventing some ffmas from being generated?
>>
>> If we did have something that would be recognized by both these and
>> the ffma pattern, it'd look like
>>
>> fge(fadd(a, fmul(b, c)), 0.0)
>>
>> which we could turn into
>>
>> fge(ffma(b, c, a), 0.0) if ffma runs first; or
>> fge(a, fneg(fmul(b, c))) otherwise
>>
>> I guess the first one is better for i965, since we can do that in one
>> instruction. In fact, maybe we don't want to do these optimizations at
>> all? I'm kind of surprised that it hurts.
>
> Right.  In one sense it doesn't help anything because we can do a
> compare with zero for free in i965.  However, losing it does hurt
> quite a bit in the case where the optimization allows us to remove the
> add instruction.  The problem is when the add is part of a potential
> ffma in which case pulling things into the comparison keeps the more
> optimized ffma peephole from actually converting to an ffma.  In this
> case we keep both the add and the multiply even though we could have
> done it with a ffma and a compare with zero.

So to confirm, in the case of

> (('flt', ('fadd', a, b), 0.0), ('flt', a, ('fneg', b))),

you want to keep the a+b around so that if a or b is a multiplication,
the ffma peephole can recognize it?
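
To make the ordering question concrete, here is a toy sketch of the two rewrites applied in either order. It is only an illustration on nested tuples, not Mesa's nir_algebraic machinery: the helper names rewrite_cmp_zero and rewrite_ffma are made up for this example, and it assumes ffma(x, y, z) means x*y + z.

```python
# Toy expression rewriter. Expressions are nested tuples such as
# ('fadd', 'a', 'b'); leaves are variable names or float constants.
# Hypothetical helpers for illustration only, not Mesa code.

def rewrite_cmp_zero(e):
    """('flt', ('fadd', a, b), 0.0) -> ('flt', a, ('fneg', b)), top level only."""
    if (isinstance(e, tuple) and e[0] == 'flt' and e[2] == 0.0
            and isinstance(e[1], tuple) and e[1][0] == 'fadd'):
        _, a, b = e[1]
        return ('flt', a, ('fneg', b))
    return e

def rewrite_ffma(e):
    """Fuse fadd(x, fmul(y, z)) or fadd(fmul(y, z), x) into ffma(y, z, x)."""
    if not isinstance(e, tuple):
        return e
    # Rewrite the operands first, then look at this node.
    e = (e[0],) + tuple(rewrite_ffma(s) for s in e[1:])
    if e[0] == 'fadd':
        _, x, y = e
        if isinstance(y, tuple) and y[0] == 'fmul':
            return ('ffma', y[1], y[2], x)
        if isinstance(x, tuple) and x[0] == 'fmul':
            return ('ffma', x[1], x[2], y)
    return e

# a + b*c < 0
expr = ('flt', ('fadd', 'a', ('fmul', 'b', 'c')), 0.0)

# Compare-with-zero first: the fadd is consumed, so ffma has nothing to match.
early = rewrite_ffma(rewrite_cmp_zero(expr))

# ffma first: the fadd/fmul pair is fused, and the compare rule no longer fires.
late = rewrite_cmp_zero(rewrite_ffma(expr))

print(early)
print(late)
```

With the compare-with-zero rule run first, the fadd disappears into the comparison and the fmul is left on its own. With ffma run first, the multiply and add fuse and the comparison against zero stays as it was.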

If that's the case,

Reviewed-by: Matt Turner <mattst88 at gmail.com>

