[Mesa-dev] [PATCH v3 39/42] intel/compiler: remove MAD/LRP algebraic optimizations from the backend

Fri Jan 18 04:31:33 UTC 2019

On Thu, Jan 17, 2019 at 6:42 PM Matt Turner <mattst88 at gmail.com> wrote:

> On Tue, Jan 15, 2019 at 5:54 AM Iago Toral Quiroga <itoral at igalia.com>
> wrote:
> >
> > NIR already has these so they are redundant. A run of shader-db confirms
> > that the only cases where these backend optimizations are activated
> > are some Tomb Raider shaders where the affected variables are qualified
> > as "precise", which is why NIR won't apply them and why the backend
> > shouldn't either (so it is actually a bug).
>
> Which of the six optimizations that you're removing were responsible
> for the change? I ask because...
>

If it's one of the precise ones, we should port it to NIR...

> >
> > Suggested-by: Jason Ekstrand <jason at jlekstrand.net>
> > ---
> >  src/intel/compiler/brw_fs.cpp | 37 -----------------------------------
> >  1 file changed, 37 deletions(-)
> >
> > diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
> > index 77c955ac435..e7f5a8822a3 100644
> > --- a/src/intel/compiler/brw_fs.cpp
> > +++ b/src/intel/compiler/brw_fs.cpp
> > @@ -2568,16 +2568,6 @@ fs_visitor::opt_algebraic()
> >              break;
> >           }
> >           break;
> > -      case BRW_OPCODE_LRP:
> > -         if (inst->src[1].equals(inst->src[2])) {
> > -            inst->opcode = BRW_OPCODE_MOV;
> > -            inst->src[0] = inst->src[1];
> > -            inst->src[1] = reg_undef;
> > -            inst->src[2] = reg_undef;
> > -            progress = true;
> > -            break;
>
> I'm not sure whether this is imprecise, and...
>

It doesn't work for NaN or for either infinity, at least not unless
inf - inf == 0, which I don't think it is.
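
To make the infinity and NaN cases concrete, here is a minimal scalar sketch. It assumes LRP computes either a weighted sum of its endpoints or one endpoint plus a scaled difference; the function names and both formulas are illustrative assumptions, not taken from the hardware documentation or from brw_fs.cpp. With equal endpoints, the removed rewrite would return the endpoint unchanged (a MOV), which matches the arithmetic only for finite inputs.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical scalar models of LRP (assumed forms, for illustration only).
// Form 1: weighted sum of the two endpoints.
float lrp_weighted(float a, float x, float y)
{
   return a * x + (1.0f - a) * y;
}

// Form 2: one endpoint plus a scaled difference of the endpoints.
float lrp_difference(float a, float x, float y)
{
   return y + a * (x - y);
}

// Returns true if the equal-endpoint cases behave as described in the thread.
bool lrp_equal_sources_demo()
{
   const float inf = std::numeric_limits<float>::infinity();
   const float nan = std::numeric_limits<float>::quiet_NaN();
   bool ok = true;

   // Finite endpoints: both forms give back the endpoint, so a MOV matches.
   ok &= lrp_weighted(0.25f, 2.0f, 2.0f) == 2.0f;
   ok &= lrp_difference(0.25f, 2.0f, 2.0f) == 2.0f;

   // Infinite endpoints: inf - inf is NaN, and so is 0 * inf,
   // whereas a MOV would have produced inf.
   ok &= std::isnan(lrp_difference(0.5f, inf, inf));
   ok &= std::isnan(lrp_weighted(0.0f, inf, inf));

   // NaN interpolation factor: the arithmetic yields NaN, a MOV yields 1.0.
   ok &= std::isnan(lrp_weighted(nan, 1.0f, 1.0f));
   return ok;
}
```

The sketch assumes default IEEE 754 semantics, that is, it is built without fast-math style flags.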

> > -         }
> > -         break;
> >        case BRW_OPCODE_CMP:
> >           if ((inst->conditional_mod == BRW_CONDITIONAL_Z ||
> >                inst->conditional_mod == BRW_CONDITIONAL_NZ) &&
> > @@ -2654,33 +2644,6 @@ fs_visitor::opt_algebraic()
> >              }
> >           }
> >           break;
> > -      case BRW_OPCODE_MAD:
> > -         if (inst->src[1].is_zero() || inst->src[2].is_zero()) {
> > -            inst->opcode = BRW_OPCODE_MOV;
> > -            inst->src[1] = reg_undef;
> > -            inst->src[2] = reg_undef;
> > -            progress = true;
> > -         } else if (inst->src[0].is_zero()) {
> > -            inst->opcode = BRW_OPCODE_MUL;
> > -            inst->src[0] = inst->src[2];
> > -            inst->src[2] = reg_undef;
> > -            progress = true;
> > -         } else if (inst->src[1].is_one()) {
> > -            inst->opcode = BRW_OPCODE_ADD;
> > -            inst->src[1] = inst->src[2];
> > -            inst->src[2] = reg_undef;
> > -            progress = true;
> > -         } else if (inst->src[2].is_one()) {
> > -            inst->opcode = BRW_OPCODE_ADD;
> > -            inst->src[2] = reg_undef;
> > -            progress = true;
> > -         } else if (inst->src[1].file == IMM && inst->src[2].file == IMM) {
> > -            inst->opcode = BRW_OPCODE_ADD;
> > -            inst->src[1].f *= inst->src[2].f;
> > -            inst->src[2] = reg_undef;
> > -            progress = true;
>
> or this one.
>

Yes, it is.  Part of the point of FMA is that it's more precise than
mul+add because the mul is done with extra precision and added to src[0] in
high-precision before the final rounding.  This optimization explicitly
breaks the MAD into mul+add.
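
The difference can be reproduced on the CPU with a small sketch, using std::fma as a stand-in for a fused multiply-add. The helper names are hypothetical, and treating the product as a separately rounded float is an assumption that mirrors the folded `src[1].f *= src[2].f` followed by an ADD. The inputs are chosen so that the exact product carries a low-order bit that a float cannot hold.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// Fused form: the product is kept exact and rounded once, after the add.
float mad_fused(float addend, float a, float b)
{
   return std::fma(a, b, addend);
}

// Split form: the product is rounded to float before the add.
// The volatile store keeps the compiler from contracting this into an FMA.
float mad_split(float addend, float a, float b)
{
   volatile float product = a * b;
   return product + addend;
}

// Returns true if the two forms disagree in the way described in the thread.
bool mad_split_loses_bits()
{
   // a * a = 1 + 2^-22 + 2^-46 exactly; as a float it rounds to 1 + 2^-22.
   const float a = 1.0f + FLT_EPSILON;
   const float addend = -(1.0f + 2.0f * FLT_EPSILON);

   // Fused: the 2^-46 term survives the single final rounding.
   // Split: it is lost when the product is rounded, so the sum is exactly 0.
   return mad_fused(addend, a, a) == std::ldexp(1.0f, -46) &&
          mad_split(addend, a, a) == 0.0f;
}
```

In this sketch the split form returns exactly zero while the fused form returns 2^-46, which is the kind of difference that matters for values marked "precise".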

--Jason