[Mesa-dev] [PATCH v3 39/42] intel/compiler: remove MAD/LRP algebraic optimizations from the backend

Fri Jan 18 07:29:45 UTC 2019

On Thu, 2019-01-17 at 22:31 -0600, Jason Ekstrand wrote:
> On Thu, Jan 17, 2019 at 6:42 PM Matt Turner <mattst88 at gmail.com>
> wrote:
> > On Tue, Jan 15, 2019 at 5:54 AM Iago Toral Quiroga <
> > itoral at igalia.com> wrote:
> > 
> > >
> > 
> > > NIR already has these so they are redundant. A run of shader-db
> > confirms
> > 
> > > that the only cases where these backend optimizations are
> > activated
> > 
> > > are some Tomb Raider shaders where the affected variables are
> > qualified
> > 
> > > as "precise", which is why NIR won't apply them and why the
> > backend
> > 
> > > shouldn't either (so it is actually a bug).
> > 
> > 
> > 
> > Which of the six optimizations that you're removing were
> > responsible
> > 
> > for the change? I ask because...
> 
> If it's one of the precise ones, we should port it to NIR...
> 
> > >
> > 
> > > Suggested-by: Jason Ekstrand <jason at jlekstrand.net>
> > 
> > > ---
> > 
> > >  src/intel/compiler/brw_fs.cpp | 37 ---------------------------
> > --------
> > 
> > >  1 file changed, 37 deletions(-)
> > 
> > >
> > 
> > > diff --git a/src/intel/compiler/brw_fs.cpp
> > b/src/intel/compiler/brw_fs.cpp
> > 
> > > index 77c955ac435..e7f5a8822a3 100644
> > 
> > > --- a/src/intel/compiler/brw_fs.cpp
> > 
> > > +++ b/src/intel/compiler/brw_fs.cpp
> > 
> > > @@ -2568,16 +2568,6 @@ fs_visitor::opt_algebraic()
> > 
> > >              break;
> > 
> > >           }
> > 
> > >           break;
> > 
> > > -      case BRW_OPCODE_LRP:
> > 
> > > -         if (inst->src[1].equals(inst->src[2])) {
> > 
> > > -            inst->opcode = BRW_OPCODE_MOV;
> > 
> > > -            inst->src[0] = inst->src[1];
> > 
> > > -            inst->src[1] = reg_undef;
> > 
> > > -            inst->src[2] = reg_undef;
> > 
> > > -            progress = true;
> > 
> > > -            break;
> > 
> > 
> > 
> > I'm not sure whether this is imprecise, and...
> 
> Doesn't work for NaN or either inf, at least not unles inf - inf == 0
> which I don't think it is.

This exists in NIR algebraic and is marked as inexact there (plus it is
not triggered by anything in shader-db)
> > > -         }
> > 
> > > -         break;
> > 
> > >        case BRW_OPCODE_CMP:
> > 
> > >           if ((inst->conditional_mod == BRW_CONDITIONAL_Z ||
> > 
> > >                inst->conditional_mod == BRW_CONDITIONAL_NZ) &&
> > 
> > > @@ -2654,33 +2644,6 @@ fs_visitor::opt_algebraic()
> > 
> > >              }
> > 
> > >           }
> > 
> > >           break;
> > 
> > > -      case BRW_OPCODE_MAD:
> > 
> > > -         if (inst->src[1].is_zero() || inst->src[2].is_zero()) {
> > 
> > > -            inst->opcode = BRW_OPCODE_MOV;
> > 
> > > -            inst->src[1] = reg_undef;
> > 
> > > -            inst->src[2] = reg_undef;
> > 
> > > -            progress = true;
> > 
> > > -         } else if (inst->src[0].is_zero()) {
> > 
> > > -            inst->opcode = BRW_OPCODE_MUL;
> > 
> > > -            inst->src[0] = inst->src[2];
> > 
> > > -            inst->src[2] = reg_undef;
> > 
> > > -            progress = true;

I believe these were the only cases triggered by the Tomb Raider
shaders. These optimizations exist in NIR and  are marked as imprecise
there too.
> > > -         } else if (inst->src[1].is_one()) {
> > 
> > > -            inst->opcode = BRW_OPCODE_ADD;
> > 
> > > -            inst->src[1] = inst->src[2];
> > 
> > > -            inst->src[2] = reg_undef;
> > 
> > > -            progress = true;
> > 
> > > -         } else if (inst->src[2].is_one()) {
> > 
> > > -            inst->opcode = BRW_OPCODE_ADD;
> > 
> > > -            inst->src[2] = reg_undef;
> > 
> > > -            progress = true;

These also exist in NIR, but I see they are not marked as imprecise
there, not sure why, looks like a bug to me right?
> > > -         } else if (inst->src[1].file == IMM && inst-
> > >src[2].file == IMM) {
> > 
> > > -            inst->opcode = BRW_OPCODE_ADD;
> > 
> > > -            inst->src[1].f *= inst->src[2].f;
> > 
> > > -            inst->src[2] = reg_undef;
> > 
> > > -            progress = true;
> > 
> > 

This one doesn't exist in NIR but it clearly breaks precise
requirements since it its manually breaking  MAD into MUL + ADD, which
has different precision.
> or this one.
> 
> Yes, it is.  Part of the point of FMA is that it's more precise than
> mul+add because the mul is done with extra precision and added to
> src[0] in high-precision before the final rounding.  This
> optimization explicitly breaks the MAD into mul+add.--Jason
> 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20190118/9b00b2b7/attachment.html>