[Mesa-dev] [PATCH 11/22] nir: Recognize some more open-coded fmin / fmax

Jason Ekstrand jason at jlekstrand.net
Mon Feb 26 22:43:38 UTC 2018


On Mon, Feb 26, 2018 at 2:21 PM, Ian Romanick <idr at freedesktop.org> wrote:

> On 02/23/2018 05:14 PM, Jason Ekstrand wrote:
> > On Fri, Feb 23, 2018 at 3:55 PM, Ian Romanick <idr at freedesktop.org> wrote:
> >
> >     From: Ian Romanick <ian.d.romanick at intel.com>
> >
> >     shader-db results:
> >
> >     Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
> >     total instructions in shared programs: 14514817 -> 14514808 (<.01%)
> >     instructions in affected programs: 229 -> 220 (-3.93%)
> >     helped: 3
> >     HURT: 0
> >     helped stats (abs) min: 1 max: 4 x̄: 3.00 x̃: 4
> >     helped stats (rel) min: 2.86% max: 4.12% x̄: 3.70% x̃: 4.12%
> >
> >     total cycles in shared programs: 533145211 -> 533144939 (<.01%)
> >     cycles in affected programs: 37268 -> 36996 (-0.73%)
> >     helped: 8
> >     HURT: 0
> >     helped stats (abs) min: 2 max: 134 x̄: 34.00 x̃: 2
> >     helped stats (rel) min: 0.02% max: 14.22% x̄: 3.53% x̃: 0.05%
> >
> >     Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown)
> >     total cycles in shared programs: 257618409 -> 257618403 (<.01%)
> >     cycles in affected programs: 12582 -> 12576 (-0.05%)
> >     helped: 3
> >     HURT: 0
> >     helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
> >     helped stats (rel) min: 0.05% max: 0.05% x̄: 0.05% x̃: 0.05%
> >
> >     No changes on Iron Lake or GM45.
> >
> >     Signed-off-by: Ian Romanick <ian.d.romanick at intel.com>
> >     ---
> >      src/compiler/nir/nir_opt_algebraic.py | 2 ++
> >      1 file changed, 2 insertions(+)
> >
> >     diff --git a/src/compiler/nir/nir_opt_algebraic.py
> >     b/src/compiler/nir/nir_opt_algebraic.py
> >     index d40d59b..f5f9e94 100644
> >     --- a/src/compiler/nir/nir_opt_algebraic.py
> >     +++ b/src/compiler/nir/nir_opt_algebraic.py
> >     @@ -170,6 +170,8 @@ optimizations = [
> >         (('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
> >         (('bcsel', ('flt', b, a), b, a), ('fmin', a, b)),
> >         (('bcsel', ('flt', a, b), b, a), ('fmax', a, b)),
> >     +   (('bcsel', ('fge', b, a), a, b), ('fmin', a, b)),
> >     +   (('bcsel', ('fge', a, b), a, b), ('fmax', a, b)),
> >
> >
> > Please flag as inexact.  As per the stupid GLSL definition, these are
> > not the same as fmin/fmax when you throw in a NaN.
>
> I'm having some trouble reconciling this with the existing
> transformations and the Intel hardware implementation.
>

Me too. :-)  Really, I think D3D10 spec'd the right thing. With GL, it
looks more like someone wrote the obvious thing down when NaN wasn't a
thing, and it got extended to NaN without much thought.  How would
you feel about just defining the NIR fmin/fmax to have the D3D10 behavior
and telling people that they can make new opcodes if they want something
different?  At the very least, that would make it constant-fold
consistently for us.  It also sounds like it's a correct transformation as
per the spec comment below.

> GLSL spec says min(x, y) "Returns y if y < x; otherwise it returns x."
> From that I infer min(x, NaN) == x, and min(NaN, y) == NaN.  The
> expression ('bcsel', ('flt', b, a), b, a) has the same behavior.
>
> I think if I rewrite the fmin transform as (swapping the argument order)
>
>     (('bcsel', ('fge', a, b), b, a), ('fmin', a, b)),
>
> it should be at least as valid as the existing transforms.  A
> similar modification should work for fmax.
>

Yes, it should.
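
A small Python sketch of the NaN cases discussed above, for anyone who
wants to check them by hand.  It is only a model under stated
assumptions: glsl_min follows the literal spec wording, bcsel is a plain
select, and Python float comparisons stand in for flt/fge (any
comparison involving NaN is false).  The helper names are made up for
illustration and are not NIR APIs.  The check uses ==, so it does not
distinguish +0.0 from -0.0.

```python
import math

NAN = float("nan")

def glsl_min(x, y):
    # Literal GLSL wording: "Returns y if y < x; otherwise it returns x."
    return y if y < x else x

def bcsel(cond, if_true, if_false):
    # Plain select, standing in for NIR's bcsel in this model.
    return if_true if cond else if_false

def existing_pattern(a, b):
    # ('bcsel', ('flt', b, a), b, a)
    return bcsel(b < a, b, a)

def swapped_pattern(a, b):
    # ('bcsel', ('fge', a, b), b, a)
    return bcsel(a >= b, b, a)

def original_patch_pattern(a, b):
    # ('bcsel', ('fge', b, a), a, b)
    return bcsel(b >= a, a, b)

def same(x, y):
    # Treat NaN as equal to NaN; otherwise compare numerically.
    return (math.isnan(x) and math.isnan(y)) or x == y

CASES = [(1.0, 2.0), (2.0, 1.0), (1.0, 1.0),
         (NAN, 1.0), (1.0, NAN), (NAN, NAN)]

def agrees_with_glsl_min(pattern):
    # True if the pattern matches the literal GLSL min on every case.
    return all(same(pattern(a, b), glsl_min(a, b)) for a, b in CASES)

for pattern in (existing_pattern, swapped_pattern, original_patch_pattern):
    print(pattern.__name__, agrees_with_glsl_min(pattern))
```

In this model the existing and swapped patterns agree with the literal
GLSL min on all six cases, while the pattern from the original patch
differs whenever exactly one operand is NaN.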


> The Intel SEL instruction documentation says that with the .L or .GE
> modifier, if one argument is NaN, the other value is always returned.
> This means that min(NaN, y) will be y.
>
> This is valid for min and max because section 4.7.1 (Range and
> Precision) says:
>
>     Operations and built-in functions that operate on a NaN are not
>     required to return a NaN as the result.
>

It's good to know that the DX behavior is considered a valid
transformation.  I didn't know exactly what the rules were there.


> I don't think returning non-NaN for ('bcsel', ('flt', b, NaN), b, NaN)
> is valid, so I think the existing transformations should also be marked
> inexact for platforms that implement the "never NaN" behavior for fmin
> or fmax.
>

I think I agree.  I think you could probably argue the point and I doubt
real apps would care but...  Probably best to flag it as inexact and move
on with life.  If you're using precise and open-coding min/max in some
horrifically stupid way, you're doing it wrong.
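
To make the mismatch concrete, here is a minimal Python sketch.  It
assumes a "never NaN" fmin that returns the other operand whenever one
operand is NaN, as described for SEL above; never_nan_min and
open_coded_min are hypothetical names for illustration, not real NIR or
hardware interfaces.

```python
import math

NAN = float("nan")

def never_nan_min(x, y):
    # Assumed model: if one operand is NaN, return the other one.
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return y if y < x else x

def open_coded_min(a, b):
    # ('bcsel', ('flt', b, a), b, a): comparisons with NaN are false,
    # so a NaN in 'a' falls through to the else branch and is returned.
    return b if b < a else a

# a = NaN: the open-coded select yields NaN, the "never NaN" fmin yields b.
print(open_coded_min(NAN, 1.0), never_nan_min(NAN, 1.0))

# b = NaN: both forms yield a.
print(open_coded_min(1.0, NAN), never_nan_min(1.0, NAN))
```

So replacing the open-coded form with such an fmin changes the result
only when a is NaN, which is the case that marking the transform inexact
is meant to cover.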


> >         (('bcsel', ('inot', a), b, c), ('bcsel', a, c, b)),
> >         (('bcsel', a, ('bcsel', a, b, c), d), ('bcsel', a, b, d)),
> >         (('bcsel', a, True, 'b at bool'), ('ior', a, b)),
> >     --
> >     2.9.5
> >