[Mesa-dev] [PATCH 2/6] nir: Turn bcsel of +/- 1.0 and 0.0 into b2f sequences.

Kenneth Graunke kenneth at whitecape.org
Wed Aug 10 18:24:18 UTC 2016


On Wednesday, August 10, 2016 10:02:12 AM PDT Erik Faye-Lund wrote:
> On Wed, Aug 10, 2016 at 4:30 AM, Kenneth Graunke <kenneth at whitecape.org> wrote:
> > On Haswell (GL 3.3):
> >
> > total instructions in shared programs: 6208759 -> 6203860 (-0.08%)
> > instructions in affected programs: 856541 -> 851642 (-0.57%)
> > helped: 3157
> > HURT: 113
> > LOST:   7
> > GAINED: 15
> >
> > On Broadwell (GL 4.4):
> >
> > total instructions in shared programs: 11637854 -> 11632016 (-0.05%)
> > instructions in affected programs: 1055693 -> 1049855 (-0.55%)
> > helped: 3900
> > HURT: 176
> > LOST:   1
> > GAINED: 18
> >
> > Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> > ---
> >  src/compiler/nir/nir_opt_algebraic.py | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py
> > index 1cf614c..4e9896f 100644
> > --- a/src/compiler/nir/nir_opt_algebraic.py
> > +++ b/src/compiler/nir/nir_opt_algebraic.py
> > @@ -251,6 +251,10 @@ optimizations = [
> >     (('ieq', 'a@bool', False), ('inot', 'a')),
> >     (('bcsel', a, True, False), ('ine', a, 0)),
> >     (('bcsel', a, False, True), ('ieq', a, 0)),
> > +   (('bcsel@32', a, 1.0, 0.0), ('b2f', ('ine', a, 0))),
> > +   (('bcsel@32', a, 0.0, 1.0), ('b2f', ('ieq', a, 0))),
> > +   (('bcsel@32', a, -1.0, -0.0), ('fneg', ('b2f', ('ine', a, 0)))),
> > +   (('bcsel@32', a, -0.0, -1.0), ('fneg', ('b2f', ('ieq', a, 0)))),
> >     (('bcsel', True, b, c), b),
> >     (('bcsel', False, b, c), c),
> >     # The result of this should be hit by constant propagation and, in the
> 
> Same as the previous patch, this smells like intel-isms. Hardware that
> has native bcsel with support for two inline immediates will do better
> without.

It definitely feels a little strange replacing a single bcsel with
an fneg/b2f/ine sequence, as that's three operations instead of one.

I expect the ine to go away - assuming 'a' is a properly formatted
boolean (0 or 0xFFFFFFFF), "ine a 0" will just become 'a'.  ieq would
turn into inot.  If the boolean was a comparison, the inot could be
folded in - i.e. inot(flt(a,b)) -> fge(a,b).  Or, some GPUs can handle
boolean negation as a source modifier, so it might be free there too.

Floating point negation can usually be done as a source modifier.
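
To make the equivalence concrete, here is a minimal Python sketch that
models the opcodes involved and checks the four new rules for both
boolean values.  The helper functions are illustrative stand-ins, not
Mesa code, and they assume the 0 / 0xFFFFFFFF boolean encoding
described above.  The comparison distinguishes 0.0 from -0.0, which is
why the negated rules match -0.0 rather than 0.0.

```python
import math

# Assumed 32-bit boolean encoding (0 or 0xFFFFFFFF), as described above.
TRUE, FALSE = 0xFFFFFFFF, 0

def bcsel(cond, x, y):
    # Select x when the condition is non-zero, otherwise y.
    return x if cond != 0 else y

def ine(a, b):
    return TRUE if a != b else FALSE

def ieq(a, b):
    return TRUE if a == b else FALSE

def b2f(a):
    # Boolean to float: true -> 1.0, false -> 0.0.
    return 1.0 if a != 0 else 0.0

def fneg(x):
    return -x

def same_float(x, y):
    # Compare value and sign bit, so 0.0 and -0.0 count as different.
    return x == y and math.copysign(1.0, x) == math.copysign(1.0, y)

def check_rules():
    # Each pair is (original bcsel, proposed replacement).
    for a in (TRUE, FALSE):
        pairs = [
            (bcsel(a, 1.0, 0.0), b2f(ine(a, 0))),
            (bcsel(a, 0.0, 1.0), b2f(ieq(a, 0))),
            (bcsel(a, -1.0, -0.0), fneg(b2f(ine(a, 0)))),
            (bcsel(a, -0.0, -1.0), fneg(b2f(ieq(a, 0)))),
        ]
        if not all(same_float(x, y) for x, y in pairs):
            return False
    return True
```

In this model "ine a 0" returns 'a' unchanged for both boolean values,
which is the reason I expect it to fold away.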

For reference, here's a shader snippet from Goat Simulator which
prompted me to write this optimization:

const vec4 LocalConst1 = vec4(0.250000, -0.250000, 0.000000, 1.000000);

void main()
{
...
InstrHelpTemp.r = ( ( Temporary1.r >= 0.0 ) ? LocalConst1.b : LocalConst1.a );
...
}

which could be turned into

InstrHelpTemp.r = float(Temporary1.r < 0.0);

which seems arguably better, regardless of hardware.
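
As a quick sanity check, the two forms can be compared with a small
Python model.  The function names are hypothetical; the constants are
LocalConst1.b = 0.0 and LocalConst1.a = 1.0 from the snippet.  The two
agree for every ordinary input; in this model they differ only when the
input is NaN, where both comparisons are false.

```python
import math

def original(t):
    # ( t >= 0.0 ) ? LocalConst1.b : LocalConst1.a, with b = 0.0, a = 1.0
    return 0.0 if t >= 0.0 else 1.0

def rewritten(t):
    # float(t < 0.0)
    return float(t < 0.0)

def agree_on(values):
    # True when both forms give the same result for every value.
    return all(original(t) == rewritten(t) for t in values)

samples = [-2.5, -0.0, 0.0, 1.0, math.inf, -math.inf]
print(agree_on(samples))
```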