[Mesa-dev] [PATCH 2/6] nir: Turn bcsel of +/- 1.0 and 0.0 into b2f sequences.
Jason Ekstrand
jason at jlekstrand.net
Wed Aug 10 15:48:46 UTC 2016
On Aug 10, 2016 1:02 AM, "Erik Faye-Lund" <kusmabite at gmail.com> wrote:
>
> On Wed, Aug 10, 2016 at 4:30 AM, Kenneth Graunke <kenneth at whitecape.org> wrote:
> > On Haswell (GL 3.3):
> >
> > total instructions in shared programs: 6208759 -> 6203860 (-0.08%)
> > instructions in affected programs: 856541 -> 851642 (-0.57%)
> > helped: 3157
> > HURT: 113
> > LOST: 7
> > GAINED: 15
> >
> > On Broadwell (GL 4.4):
> >
> > total instructions in shared programs: 11637854 -> 11632016 (-0.05%)
> > instructions in affected programs: 1055693 -> 1049855 (-0.55%)
> > helped: 3900
> > HURT: 176
> > LOST: 1
> > GAINED: 18
> >
> > Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> > ---
> > src/compiler/nir/nir_opt_algebraic.py | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py
> > index 1cf614c..4e9896f 100644
> > --- a/src/compiler/nir/nir_opt_algebraic.py
> > +++ b/src/compiler/nir/nir_opt_algebraic.py
> > @@ -251,6 +251,10 @@ optimizations = [
> > (('ieq', 'a@bool', False), ('inot', 'a')),
> > (('bcsel', a, True, False), ('ine', a, 0)),
> > (('bcsel', a, False, True), ('ieq', a, 0)),
> > + (('bcsel@32', a, 1.0, 0.0), ('b2f', ('ine', a, 0))),
> > + (('bcsel@32', a, 0.0, 1.0), ('b2f', ('ieq', a, 0))),
> > + (('bcsel@32', a, -1.0, -0.0), ('fneg', ('b2f', ('ine', a, 0)))),
> > + (('bcsel@32', a, -0.0, -1.0), ('fneg', ('b2f', ('ieq', a, 0)))),
> > (('bcsel', True, b, c), b),
> > (('bcsel', False, b, c), c),
> > # The result of this should be hit by constant propagation and, in the
>
> Same as the previous patch, this smells like intel-isms. Hardware that
> has native bcsel with support for two inline immediates will do better
> without.
Why? If your back-end handles b2f *worse* than bcsel of two components,
then it's broken. Also, we have a lot of optimizations in NIR to help cut
through "fake booleans", where shaders use 0.0 and 1.0 and math operations
instead of actual booleans. Recognising hand-coded b2f suddenly enables
more of these optimization paths to run on the shader. That 0.57% isn't
all just immediates being removed.
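
For anyone who wants to convince themselves that the four new rules are
value-preserving, including the sign of zero in the negated cases, here is
a minimal Python sketch. It is not Mesa code. The helper functions are
hypothetical stand-ins for the NIR opcodes, and the boolean encoding (a
32-bit integer, 0 for false, all bits set for true) is an assumption made
for illustration only.

```python
import math

# Assumption for illustration: booleans are 32-bit integers,
# 0 for false and all bits set for true.
NIR_TRUE = 0xFFFFFFFF
NIR_FALSE = 0


def bcsel(cond, x, y):
    """Select x when cond is non-zero, otherwise y."""
    return x if cond != 0 else y


def ine(a, b):
    """Integer not-equal, returning a boolean in the assumed encoding."""
    return NIR_TRUE if a != b else NIR_FALSE


def ieq(a, b):
    """Integer equal, returning a boolean in the assumed encoding."""
    return NIR_TRUE if a == b else NIR_FALSE


def b2f(a):
    """Boolean to float: 1.0 for true, 0.0 for false."""
    return 1.0 if a != 0 else 0.0


def fneg(x):
    """Float negation; maps 0.0 to -0.0 and back."""
    return -x


def same_float(x, y):
    """Compare value and sign bit, so that 0.0 and -0.0 are distinct."""
    return x == y and math.copysign(1.0, x) == math.copysign(1.0, y)


# (pattern before, replacement after), mirroring the four added rules.
RULES = [
    (lambda a: bcsel(a, 1.0, 0.0), lambda a: b2f(ine(a, 0))),
    (lambda a: bcsel(a, 0.0, 1.0), lambda a: b2f(ieq(a, 0))),
    (lambda a: bcsel(a, -1.0, -0.0), lambda a: fneg(b2f(ine(a, 0)))),
    (lambda a: bcsel(a, -0.0, -1.0), lambda a: fneg(b2f(ieq(a, 0)))),
]


def check_rules():
    """Return True if every rule agrees on both boolean inputs."""
    return all(
        same_float(before(a), after(a))
        for before, after in RULES
        for a in (NIR_FALSE, NIR_TRUE)
    )


if __name__ == "__main__":
    print(check_rules())
```

The sign-aware comparison matters for the last two rules: a plain ==
would treat 0.0 and -0.0 as equal and hide a mismatch in the false case.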