[Mesa-dev] [PATCH 11/22] nir: Recognize some more open-coded fmin / fmax
Jason Ekstrand
jason at jlekstrand.net
Mon Feb 26 22:43:38 UTC 2018
On Mon, Feb 26, 2018 at 2:21 PM, Ian Romanick <idr at freedesktop.org> wrote:
> On 02/23/2018 05:14 PM, Jason Ekstrand wrote:
> > On Fri, Feb 23, 2018 at 3:55 PM, Ian Romanick <idr at freedesktop.org> wrote:
> >
> > From: Ian Romanick <ian.d.romanick at intel.com>
> >
> > shader-db results:
> >
> > Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
> > total instructions in shared programs: 14514817 -> 14514808 (<.01%)
> > instructions in affected programs: 229 -> 220 (-3.93%)
> > helped: 3
> > HURT: 0
> > helped stats (abs) min: 1 max: 4 x̄: 3.00 x̃: 4
> > helped stats (rel) min: 2.86% max: 4.12% x̄: 3.70% x̃: 4.12%
> >
> > total cycles in shared programs: 533145211 -> 533144939 (<.01%)
> > cycles in affected programs: 37268 -> 36996 (-0.73%)
> > helped: 8
> > HURT: 0
> > helped stats (abs) min: 2 max: 134 x̄: 34.00 x̃: 2
> > helped stats (rel) min: 0.02% max: 14.22% x̄: 3.53% x̃: 0.05%
> >
> > Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown)
> > total cycles in shared programs: 257618409 -> 257618403 (<.01%)
> > cycles in affected programs: 12582 -> 12576 (-0.05%)
> > helped: 3
> > HURT: 0
> > helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
> > helped stats (rel) min: 0.05% max: 0.05% x̄: 0.05% x̃: 0.05%
> >
> > No changes on Iron Lake or GM45.
> >
> > Signed-off-by: Ian Romanick <ian.d.romanick at intel.com>
> > ---
> > src/compiler/nir/nir_opt_algebraic.py | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py
> > index d40d59b..f5f9e94 100644
> > --- a/src/compiler/nir/nir_opt_algebraic.py
> > +++ b/src/compiler/nir/nir_opt_algebraic.py
> > @@ -170,6 +170,8 @@ optimizations = [
> > (('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
> > (('bcsel', ('flt', b, a), b, a), ('fmin', a, b)),
> > (('bcsel', ('flt', a, b), b, a), ('fmax', a, b)),
> > + (('bcsel', ('fge', b, a), a, b), ('fmin', a, b)),
> > + (('bcsel', ('fge', a, b), a, b), ('fmax', a, b)),
> >
> >
> > Please flag as inexact. As per the stupid GLSL definition, these are
> > not the same as fmin/fmax when you throw in a NaN.
>
> I'm having some trouble reconciling this with the existing
> transformations and the Intel hardware implementation.
>
Me too. :-) Really, I think D3D10 spec'd the right thing. With GL, it
looks more like a case of someone wrote the obvious thing down when NaN
wasn't a thing and it got extended to NaN without much thought. How would
you feel about just defining the NIR fmin/fmax to have the D3D10 behavior
and telling people that they can make new opcodes if they want something
different? At the very least, that would make it constant-fold
consistently for us. It also sounds like it's a correct transformation as
per the spec comment below.
> GLSL spec says min(x, y) "Returns y if y < x; otherwise it returns x."
> From that I infer min(x, NaN) == x, and min(NaN, y) == NaN. The
> expression ('bcsel', ('flt', b, a), b, a) has the same behavior.
>
> I think if I rewrite the fmin transform as (swapping the argument order)
>
> (('bcsel', ('fge', a, b), b, a), ('fmin', a, b)),
>
> it should be at least as valid as the existing transforms. A
> similar modification should work for fmax.
>
Yes, it should.
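A quick way to convince yourself is to model the comparisons in plain Python, where < and >= are false whenever an operand is NaN, the same as flt and fge. This is only a sketch: the helper names (flt, fge, bcsel, glsl_min, and so on) are made up for illustration and are not NIR's API, and it only checks the literal GLSL wording quoted above.

```python
import math

NAN = float("nan")

# Hypothetical stand-ins for the NIR opcodes; Python comparisons are
# false when either operand is NaN, which is the property that matters.
def flt(a, b): return a < b
def fge(a, b): return a >= b
def bcsel(cond, x, y): return x if cond else y

def glsl_min(x, y):
    # GLSL: "Returns y if y < x; otherwise it returns x."
    return y if y < x else x

def existing(a, b):   # ('bcsel', ('flt', b, a), b, a)
    return bcsel(flt(b, a), b, a)

def as_posted(a, b):  # ('bcsel', ('fge', b, a), a, b)
    return bcsel(fge(b, a), a, b)

def swapped(a, b):    # ('bcsel', ('fge', a, b), b, a)
    return bcsel(fge(a, b), b, a)

def same(x, y):
    # Treat NaN as equal to NaN for the purpose of this comparison.
    return (math.isnan(x) and math.isnan(y)) or x == y

def agrees_with_glsl_min(f):
    values = [1.0, 2.0, NAN]
    return all(same(f(a, b), glsl_min(a, b)) for a in values for b in values)

if __name__ == "__main__":
    for f in (existing, as_posted, swapped):
        print(f.__name__, agrees_with_glsl_min(f))
```

The existing flt pattern and the swapped fge pattern agree with the literal GLSL wording on every pair, NaN included. The pattern as posted does not: for a = 1.0, b = NaN it selects b, while the GLSL wording gives 1.0.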
> The Intel SEL instruction says that with the .L or .GE modifier,
> if one argument is NaN, the other value is always returned. This means
> that min(NaN, y) will be y.
>
> This is valid for min and max because section 4.7.1 (Range and
> Precision) says:
>
> Operations and built-in functions that operate on a NaN are not
> required to return a NaN as the result.
>
It's good to know that the DX behavior is considered a valid
transformation. I didn't know exactly what the rules were there.
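To make the difference concrete, here is a small Python model of the two behaviors. It assumes the "never NaN" rule exactly as described in the quoted text (if one argument is NaN, the other is returned); never_nan_min and glsl_min are names invented for this sketch, not hardware or NIR functions.

```python
import math

NAN = float("nan")

def glsl_min(x, y):
    # Literal GLSL wording: "Returns y if y < x; otherwise it returns x."
    return y if y < x else x

def never_nan_min(x, y):
    # Assumed model of the behavior described above: if one argument
    # is NaN, the other argument is returned.
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return y if y < x else x

if __name__ == "__main__":
    print(glsl_min(3.0, NAN), never_nan_min(3.0, NAN))
    print(glsl_min(NAN, 3.0), never_nan_min(NAN, 3.0))
```

The two only disagree when the first argument is NaN: the literal GLSL wording passes the NaN through, while the "never NaN" model returns the other operand.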
> I don't think returning non-NaN for ('bcsel', ('flt', b, NaN), b, NaN)
> is valid, so I think the existing transformations should also be marked
> inexact for platforms that implement the "never NaN" behavior for fmin
> or fmax.
>
I think I agree. I think you could probably argue the point and I doubt
real apps would care but... Probably best to flag it as inexact and move
on with life. If you're using precise and open-coding min/max in some
horrifically stupid way, you're doing it wrong.
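Roughly what I have in mind, as a sketch rather than a tested patch. It assumes the '~' prefix on the search opcode is how nir_opt_algebraic.py marks a pattern inexact, and the fmax line with fge is my guess at the "similar modification" mentioned above. The list name is made up so the snippet stands alone.

```python
# nir_opt_algebraic.py uses plain strings for its pattern variables.
a = 'a'
b = 'b'

# Hypothetical list name; in the real file these entries would live in
# the existing "optimizations" list.
inexact_min_max = [
   # Existing patterns, now flagged inexact (assumed '~' convention).
   (('~bcsel', ('flt', b, a), b, a), ('fmin', a, b)),
   (('~bcsel', ('flt', a, b), b, a), ('fmax', a, b)),
   # New patterns with the argument order swapped as discussed.
   (('~bcsel', ('fge', a, b), b, a), ('fmin', a, b)),
   (('~bcsel', ('fge', b, a), b, a), ('fmax', a, b)),
]
```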
> > (('bcsel', ('inot', a), b, c), ('bcsel', a, c, b)),
> > (('bcsel', a, ('bcsel', a, b, c), d), ('bcsel', a, b, d)),
> > (('bcsel', a, True, 'b at bool'), ('ior', a, b)),
> > --
> > 2.9.5
> >
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>