[Mesa-dev] [PATCH 11/22] nir: Recognize some more open-coded fmin / fmax

Mon Feb 26 22:45:59 UTC 2018

On 02/26/2018 02:43 PM, Jason Ekstrand wrote:
> On Mon, Feb 26, 2018 at 2:21 PM, Ian Romanick <idr at freedesktop.org
> <mailto:idr at freedesktop.org>> wrote:
> 
>     On 02/23/2018 05:14 PM, Jason Ekstrand wrote:
>     > On Fri, Feb 23, 2018 at 3:55 PM, Ian Romanick <idr at freedesktop.org <mailto:idr at freedesktop.org>
>     > <mailto:idr at freedesktop.org <mailto:idr at freedesktop.org>>> wrote:
>     >
>     >     From: Ian Romanick <ian.d.romanick at intel.com <mailto:ian.d.romanick at intel.com>
>     >     <mailto:ian.d.romanick at intel.com
>     <mailto:ian.d.romanick at intel.com>>>
>     >
>     >     shader-db results:
>     >
>     >     Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
>     >     total instructions in shared programs: 14514817 -> 14514808 (<.01%)
>     >     instructions in affected programs: 229 -> 220 (-3.93%)
>     >     helped: 3
>     >     HURT: 0
>     >     helped stats (abs) min: 1 max: 4 x̄: 3.00 x̃: 4
>     >     helped stats (rel) min: 2.86% max: 4.12% x̄: 3.70% x̃: 4.12%
>     >
>     >     total cycles in shared programs: 533145211 -> 533144939 (<.01%)
>     >     cycles in affected programs: 37268 -> 36996 (-0.73%)
>     >     helped: 8
>     >     HURT: 0
>     >     helped stats (abs) min: 2 max: 134 x̄: 34.00 x̃: 2
>     >     helped stats (rel) min: 0.02% max: 14.22% x̄: 3.53% x̃: 0.05%
>     >
>     >     Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown)
>     >     total cycles in shared programs: 257618409 -> 257618403 (<.01%)
>     >     cycles in affected programs: 12582 -> 12576 (-0.05%)
>     >     helped: 3
>     >     HURT: 0
>     >     helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
>     >     helped stats (rel) min: 0.05% max: 0.05% x̄: 0.05% x̃: 0.05%
>     >
>     >     No changes on Iron Lake or GM45.
>     >
>     >     Signed-off-by: Ian Romanick <ian.d.romanick at intel.com <mailto:ian.d.romanick at intel.com>
>     >     <mailto:ian.d.romanick at intel.com
>     <mailto:ian.d.romanick at intel.com>>>
>     >     ---
>     >      src/compiler/nir/nir_opt_algebraic.py | 2 ++
>     >      1 file changed, 2 insertions(+)
>     >
>     >     diff --git a/src/compiler/nir/nir_opt_algebraic.py
>     >     b/src/compiler/nir/nir_opt_algebraic.py
>     >     index d40d59b..f5f9e94 100644
>     >     --- a/src/compiler/nir/nir_opt_algebraic.py
>     >     +++ b/src/compiler/nir/nir_opt_algebraic.py
>     >     @@ -170,6 +170,8 @@ optimizations = [
>     >         (('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
>     >         (('bcsel', ('flt', b, a), b, a), ('fmin', a, b)),
>     >         (('bcsel', ('flt', a, b), b, a), ('fmax', a, b)),
>     >     +   (('bcsel', ('fge', b, a), a, b), ('fmin', a, b)),
>     >     +   (('bcsel', ('fge', a, b), a, b), ('fmax', a, b)),
>     >
>     >
>     > Please flag as inexact.  As per the stupid GLSL definition, these are
>     > not the same as fmin/fmax when you throw in a NaN.
> 
>     I'm having some trouble rectifying this with the existing
>     transformations and the Intel hardware implementation.
> 
> 
> Me too. :-)  Really, I think D3D10 spec'd the right thing. With GL, it
> looks more like a case of someone wrote the obvious thing down when NaN
> wasn't a thing and it got extended to NaN without much thought.  How
> would you feel about just defining the NIR fmin/fmax to have the D3D10
> behavior and telling people that they can make new opcodes if they want
> something different?  At the very least, that would make it
> constant-fold consistently for us.  It also sounds like it's a correct
> transformation as per the spec comment below.
> 
>     GLSL spec says min(x, y) "Returns y if y < x; otherwise it returns x."
>     From that I infer min(x, NaN) == x, and min(NaN, y) == NaN.  The
>     expression ('bcsel', ('flt', b, a), b, a) has the same behavior.
> 
>     I think if I rewrite the fmin transform as (swapping the argument order)
> 
>         (('bcsel', ('fge', a, b), b, a), ('fmin', a, b)),
> 
>     it should be at least as valid for as the existing transforms.  A
>     similar modification should work for fmax.
> 
> 
> Yes, it should.
>  
> 
>     The Intel SEL instruction which says that with the .L or .GE modifier,
>     if one argument is NaN, the other value is always returned.  This means
>     that min(NaN, y) will be y.
> 
>     This is valid for min and max because section 4.7.1 (Range and
>     Precision) says:
> 
>         Operations and built-in functions that operate on a NaN are not
>         required to return a NaN as the result.
> 
> 
> It's good to know that the DX behavior is considered a valid
> transformation.  I didn't know exactly what the rules were there.
>  
> 
>     I don't think returning non-NaN for ('bcsel', ('flt', b, NaN), b, NaN)
>     is valid, so I think the existing transformations should also be marked
>     inexact for platforms that implement the "never NaN" behavior for fmin
>     or fmax.
> 
> 
> I think I agree.  I think you could probably argue the point and I doubt
> real apps would care but...  Probably best to flag it as inexact and
> move on with life.  If you're using precise and open-coding min/max in
> some horifically stupid way, you're doing it wrong.

Okay.  I'll update this patch, and I'll add follow-up patch to mark the
others inexact.

>     >         (('bcsel', ('inot', a), b, c), ('bcsel', a, c, b)),
>     >         (('bcsel', a, ('bcsel', a, b, c), d), ('bcsel', a, b, d)),
>     >         (('bcsel', a, True, 'b at bool'), ('ior', a, b)),
>     >     --
>     >     2.9.5
>     >
>     >     _______________________________________________
>     >     mesa-dev mailing list
>     >     mesa-dev at lists.freedesktop.org
>     <mailto:mesa-dev at lists.freedesktop.org>
>     <mailto:mesa-dev at lists.freedesktop.org
>     <mailto:mesa-dev at lists.freedesktop.org>>
>     >     https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>     <https://lists.freedesktop.org/mailman/listinfo/mesa-dev>
>     >     <https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>     <https://lists.freedesktop.org/mailman/listinfo/mesa-dev>>