[Mesa-dev] [PATCH v3 39/42] intel/compiler: remove MAD/LRP algebraic optimizations from the backend
Jason Ekstrand
jason at jlekstrand.net
Fri Jan 18 04:31:33 UTC 2019
On Thu, Jan 17, 2019 at 6:42 PM Matt Turner <mattst88 at gmail.com> wrote:
> On Tue, Jan 15, 2019 at 5:54 AM Iago Toral Quiroga <itoral at igalia.com> wrote:
> >
> > NIR already has these so they are redundant. A run of shader-db confirms
> > that the only cases where these backend optimizations are activated
> > are some Tomb Raider shaders where the affected variables are qualified
> > as "precise", which is why NIR won't apply them and why the backend
> > shouldn't either (so it is actually a bug).
>
> Which of the six optimizations that you're removing were responsible
> for the change? I ask because...
>
If it's one of the precise ones, we should port it to NIR...
> >
> > Suggested-by: Jason Ekstrand <jason at jlekstrand.net>
> > ---
> > src/intel/compiler/brw_fs.cpp | 37 -----------------------------------
> > 1 file changed, 37 deletions(-)
> >
> > diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
> > index 77c955ac435..e7f5a8822a3 100644
> > --- a/src/intel/compiler/brw_fs.cpp
> > +++ b/src/intel/compiler/brw_fs.cpp
> > @@ -2568,16 +2568,6 @@ fs_visitor::opt_algebraic()
> > break;
> > }
> > break;
> > - case BRW_OPCODE_LRP:
> > - if (inst->src[1].equals(inst->src[2])) {
> > - inst->opcode = BRW_OPCODE_MOV;
> > - inst->src[0] = inst->src[1];
> > - inst->src[1] = reg_undef;
> > - inst->src[2] = reg_undef;
> > - progress = true;
> > - break;
>
> I'm not sure whether this is imprecise, and...
>
It doesn't work for NaN or for either infinity, at least not unless
inf - inf == 0, which I don't think it is.
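To make the inf/NaN point concrete, here is a minimal standalone sketch. It is not part of the patch. It assumes the interpolation is evaluated as x + a * (y - x); the exact formula the hardware LRP uses may differ, and lrp_reference, lrp_folded_to_mov and lrp_self_check are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical reference for a linear interpolation between x and y.
// Assumption: evaluated as x + a * (y - x); the real LRP may differ.
static float lrp_reference(float x, float y, float a)
{
   return x + a * (y - x);
}

// What the removed backend rule produces when both interpolated
// operands are the same value: a plain copy of that value.
static float lrp_folded_to_mov(float x)
{
   return x;
}

// Cases where the fold changes the result.
static void lrp_self_check()
{
   const float inf = std::numeric_limits<float>::infinity();
   const float nan = std::numeric_limits<float>::quiet_NaN();

   // inf - inf is NaN, so the unfolded form yields NaN, not inf.
   assert(std::isnan(lrp_reference(inf, inf, 0.5f)));
   assert(std::isinf(lrp_folded_to_mov(inf)));

   // A NaN interpolation factor propagates, but the fold drops it.
   assert(std::isnan(lrp_reference(1.0f, 1.0f, nan)));
   assert(lrp_folded_to_mov(1.0f) == 1.0f);
}
```

Calling lrp_self_check() passes on an IEEE 754 implementation built without fast-math style flags, which is exactly the situation a "precise" qualifier is meant to preserve.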
> > - }
> > - break;
> > case BRW_OPCODE_CMP:
> > if ((inst->conditional_mod == BRW_CONDITIONAL_Z ||
> > inst->conditional_mod == BRW_CONDITIONAL_NZ) &&
> > @@ -2654,33 +2644,6 @@ fs_visitor::opt_algebraic()
> > }
> > }
> > break;
> > - case BRW_OPCODE_MAD:
> > - if (inst->src[1].is_zero() || inst->src[2].is_zero()) {
> > - inst->opcode = BRW_OPCODE_MOV;
> > - inst->src[1] = reg_undef;
> > - inst->src[2] = reg_undef;
> > - progress = true;
> > - } else if (inst->src[0].is_zero()) {
> > - inst->opcode = BRW_OPCODE_MUL;
> > - inst->src[0] = inst->src[2];
> > - inst->src[2] = reg_undef;
> > - progress = true;
> > - } else if (inst->src[1].is_one()) {
> > - inst->opcode = BRW_OPCODE_ADD;
> > - inst->src[1] = inst->src[2];
> > - inst->src[2] = reg_undef;
> > - progress = true;
> > - } else if (inst->src[2].is_one()) {
> > - inst->opcode = BRW_OPCODE_ADD;
> > - inst->src[2] = reg_undef;
> > - progress = true;
> > - } else if (inst->src[1].file == IMM && inst->src[2].file == IMM) {
> > - inst->opcode = BRW_OPCODE_ADD;
> > - inst->src[1].f *= inst->src[2].f;
> > - inst->src[2] = reg_undef;
> > - progress = true;
>
> or this one.
>
Yes, it is. Part of the point of FMA is that it's more precise than
mul+add because the mul is done with extra precision and added to src[0] in
high-precision before the final rounding. This optimization explicitly
breaks the MAD into mul+add.
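A small standalone sketch of the difference, not taken from the patch. It uses std::fma as a stand-in for a fused MAD and follows the src[0] + src[1] * src[2] operand order from the code above; mad_fused and mad_split are hypothetical names, and the volatile temporary is only there to keep the compiler from contracting the split version back into an FMA.

```cpp
#include <cmath>

// Fused form: the product src1 * src2 is not rounded before src0 is
// added; only the final result is rounded to float.
static float mad_fused(float src0, float src1, float src2)
{
   return std::fma(src1, src2, src0);
}

// Split form, as produced by folding two immediates: the product is
// rounded to float first, then added to src0.
static float mad_split(float src0, float src1, float src2)
{
   volatile float prod = src1 * src2; // force rounding to float
   return src0 + prod;
}

// Inputs chosen so the rounded product loses exactly the bit that
// the fused form keeps:
//   a     = 1 + 2^-12
//   a * a = 1 + 2^-11 + 2^-24   (needs 25 bits; rounds to 1 + 2^-11)
//   c     = -(1 + 2^-11)
static const float demo_a = 1.0f + std::ldexp(1.0f, -12);
static const float demo_c = -(1.0f + std::ldexp(1.0f, -11));
```

With these inputs, mad_fused(demo_c, demo_a, demo_a) returns 2^-24, while mad_split(demo_c, demo_a, demo_a) returns 0, so the folded ADD loses the entire result.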
--Jason