[Mesa-dev] [PATCH 17/28] intel/nir: call nir_opt_constant_folding before nir_opt_algebraic is executed

Mon Dec 10 13:32:21 UTC 2018

On Mon, Dec 10, 2018 at 2:20 PM Samuel Iglesias Gonsálvez
<siglesias at igalia.com> wrote:
>
> On 05/12/2018 19:22, Connor Abbott wrote:
> > Why is this needed? In general, we shouldn't be relying on
> > optimization ordering for correctness. It probably just means one of
> > the optimizations is wrong, and you're working around that.
>
> There is an issue when the shader tries to generate a NaN using 0/0,
> opt_algebraic is replaces the 0 * NaN part for 0.
>
> This is an example of denorm * NaN code, although it is happening for
> other tests too. The optimization takes this code:
>
>         vec1 32 ssa_0 = load_const (0x00000000 /* 0.000000 */)
>         vec1 32 ssa_2 = load_const (0x00800000 /* 0.000000 */)
>         vec1 32 ssa_3 = load_const (0x008003f0 /* 0.000000 */)
>         vec1 32 ssa_4 = fneg ssa_2
>         vec1 32 ssa_5 = fadd ssa_3, ssa_4 <<<< generate denorm
>         vec1 32 ssa_7 = frcp ssa_0
>         vec1 32 ssa_8 = fmul ssa_0, ssa_7 <<< generate NaN
>         vec1 32 ssa_10 = fmul ssa_5, ssa_8 <<<< denorm * NaN
>
> and converts it to the following:
>
>         vec1 32 ssa_0 = load_const (0x00000000 /* 0.000000 */)
>         vec1 32 ssa_2 = load_const (0x00800000 /* 0.000000 */)
>         vec1 32 ssa_3 = load_const (0x008003f0 /* 0.000000 */)
>         vec1 32 ssa_4 = fneg ssa_2
>         vec1 32 ssa_5 = fadd ssa_3, ssa_4   <<<< build denorm
>         vec1 32 ssa_7 = frcp ssa_0
>         vec1 32 ssa_18 = load_const (0x00000000 /* 0.000000 */)
>         vec1 32 ssa_19 = imov ssa_18       <<<< fmul (a, 0) = 0, so we miss NaN!
>         vec1 32 ssa_10 = fmul ssa_5, ssa_19 <<<< denorm * 0
>
> Is it possible to represent fmul (a, 0) = 0 if 'a' is not NaN? Or
> fmul(a, NaN) = NaN? Do you have any other suggestion?

It sounds like another imprecise optimization that doesn't respect
NaN's in opt_algebraic. Either we have to mark everything as precise
when we must preserve NaN's (or flush denorms, or preserve denorms, or
...), or we have to do an audit of all the imprecise optimizations in
opt_algebraic, figure out which execution modes they don't respect,
and disable them when the offending execution modes are set. For
example, we could disable (fmul a, 0) -> 0 only when NaN's must be
preserved. We could also add weaker optimizations that are precise and
hence always enabled, like limiting that transform to when a is known
not to be NaN, but that's something for later.

>
> Sam
>
> > On Wed, Dec 5, 2018 at 4:56 PM Samuel Iglesias Gonsálvez
> > <siglesias at igalia.com> wrote:
> >>
> >> This would do constant folding and also flush to zero denorms operands before
> >> the nir_opt_algebraic is executed.
> >>
> >> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
> >> ---
> >>  src/intel/compiler/brw_nir.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
> >> index 0a5aa35c700..600f7a97df9 100644
> >> --- a/src/intel/compiler/brw_nir.c
> >> +++ b/src/intel/compiler/brw_nir.c
> >> @@ -570,8 +570,8 @@ brw_nir_optimize(nir_shader *nir, const struct brw_compiler *compiler,
> >>        OPT(nir_opt_cse);
> >>        OPT(nir_opt_peephole_select, 0);
> >>        OPT(nir_opt_intrinsics);
> >> -      OPT(nir_opt_algebraic);
> >>        OPT(nir_opt_constant_folding);
> >> +      OPT(nir_opt_algebraic);
> >>        OPT(nir_opt_dead_cf);
> >>        if (OPT(nir_opt_trivial_continues)) {
> >>           /* If nir_opt_trivial_continues makes progress, then we need to clean
> >> --
> >> 2.19.1
> >>
> >> _______________________________________________
> >> mesa-dev mailing list
> >> mesa-dev at lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
>