[Mesa-dev] [PATCH] nv50/ir: remove dnz flag when converting MAD to ADD due to optimizations
Karol Herbst
kherbst at redhat.com
Sun Nov 25 03:08:48 UTC 2018
Yeah, sounds fine. I wasn't 100% sure what the dnz flag does. With the
addition below:
Reviewed-by: Karol Herbst <kherbst at redhat.com>
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 307d8762506..202faf0746a 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -1094,6 +1094,7 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
if (imm0.isNegative())
i->src(t).mod = i->src(t).mod ^ Modifier(NV50_IR_MOD_NEG);
i->op = OP_ADD;
+ i->dnz = 0;
i->setSrc(s, i->getSrc(t));
i->src(s).mod = i->src(t).mod;
} else
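To spell out what the extra line does, here is a toy sketch of the same rewrite. It assumes this hunk sits in the branch that turns a multiply by an immediate of +/-2 into an add of the other operand with itself (the enclosing condition is not visible in the diff context). ToyInstr, Op and foldMulByTwo are made-up names for illustration, not nv50_ir types.

```cpp
enum class Op { MUL, ADD };

// Toy stand-in for an IR instruction (hypothetical, not nv50_ir::Instruction).
struct ToyInstr {
    Op op;
    bool dnz;        // "zero wins" flag, only meaningful for multiplies
    int srcReg[2];   // register numbers of the two sources
    bool srcNeg[2];  // negate modifier on each source
};

// Rewrite "x * (+/-2)" as "(+/-x) + (+/-x)".
// s is the index of the immediate operand, imm is its value.
bool foldMulByTwo(ToyInstr &i, int s, float imm)
{
    if (i.op != Op::MUL || (imm != 2.0f && imm != -2.0f))
        return false;

    const int t = s ? 0 : 1;          // index of the non-immediate operand
    if (imm < 0.0f)
        i.srcNeg[t] = !i.srcNeg[t];   // fold the sign into the modifier

    i.op = Op::ADD;
    i.dnz = false;                    // the result is no longer a multiply
    i.srcReg[s] = i.srcReg[t];        // both sources are now x
    i.srcNeg[s] = i.srcNeg[t];
    return true;
}
```

Without the dnz reset, the instruction would leave this function as an ADD that still carries a multiply-only flag, which is the state the emitter trips over.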
shader:
FRAG
PROPERTY FS_COORD_ORIGIN UPPER_LEFT
PROPERTY MUL_ZERO_WINS 1
DCL IN[0], COLOR, COLOR
DCL IN[1], TEXCOORD[0], PERSPECTIVE
DCL OUT[0], COLOR
DCL OUT[1], COLOR[1]
DCL CONST[0][0..129]
DCL TEMP[0..2]
IMM[0] FLT32 { -0.0000, -1.0000, 2.0000, -0.5000}
0: ADD TEMP[0].x, -CONST[0][112].yyyy, IN[1].wwww
1: CMP TEMP[0], TEMP[0].xxxx, IMM[0].yyyy, IMM[0].xxxx
2: KILL_IF TEMP[0]
3: MUL TEMP[0].xyz, CONST[0][0], IN[0]
4: MOV TEMP[0].w, IN[0].wwww
5: MUL TEMP[1].xyz, TEMP[0], IMM[0].zzzz
6: MUL OUT[0].w, TEMP[0].wwww, CONST[0][0].wwww
7: MAD_SAT TEMP[0].w, IN[1].xxxx, CONST[0][128].xxxx, CONST[0][128].yyyy
8: MUL TEMP[0].w, TEMP[0].wwww, CONST[0][129].wwww
9: MOV TEMP[2].z, IMM[0].zzzz
10: MAD TEMP[0].xyz, TEMP[2].zzzz, -TEMP[0], CONST[0][129]
11: MAD OUT[0].xyz, TEMP[0].wwww, TEMP[0], TEMP[1]
12: MOV OUT[1], -IMM[0].wwwy
13: END
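For anyone else unsure what dnz means, here is a minimal sketch. It assumes, per the patch description quoted below, that the flag makes a multiply return 0 whenever either operand is 0, and that the flag is what MUL_ZERO_WINS in the shader above ends up requesting (that link is my assumption). mul_dnz, mul_ieee and add_ieee are hypothetical helpers, not Mesa code, and the sign of the zero result is not modeled.

```cpp
#include <cmath>
#include <limits>

// Plain IEEE-754 multiply: 0 * Infinity is NaN.
float mul_ieee(float a, float b)
{
    return a * b;
}

// Multiply with "zero wins" behaviour (assumed meaning of dnz):
// if either operand is zero the result is zero, even for 0 * Infinity.
float mul_dnz(float a, float b)
{
    if (a == 0.0f || b == 0.0f)
        return 0.0f;
    return a * b;
}

// An add has no zero-times-anything case, so there is nothing
// for a dnz-style flag to change here.
float add_ieee(float a, float b)
{
    return a + b;
}
```

With inf = std::numeric_limits<float>::infinity(), mul_ieee(0.0f, inf) is NaN while mul_dnz(0.0f, inf) is 0. add_ieee has only one behaviour, which is why leaving the flag set on an ADD makes no sense.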
On Sun, Nov 25, 2018 at 3:58 AM Ilia Mirkin <imirkin at alum.mit.edu> wrote:
>
> The dnz flag only applies to multiplications (e.g. it makes 0 * Infinity
> become 0 instead of NaN). Once we optimize a MAD into an ADD, the dnz
> flag no longer makes sense, and upsets the GM107 emitter (since it looks
> at the ftz and dnz flags together).
>
> Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
> ---
> src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index 04d26dcbf53..307d8762506 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -740,6 +740,7 @@ ConstantFolding::expr(Instruction *i,
> // restrictions, so move it into a separate LValue.
> bld.setPosition(i, false);
> i->op = OP_ADD;
> + i->dnz = 0;
> i->setSrc(1, bld.mkMov(bld.getSSA(type), i->getSrc(0), type)->getDef(0));
> i->setSrc(0, i->getSrc(2));
> i->src(0).mod = i->src(2).mod;
> @@ -1131,6 +1132,7 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
> i->setSrc(1, i->getSrc(2));
> i->src(1).mod = i->src(2).mod;
> i->setSrc(2, NULL);
> + i->dnz = 0;
> i->op = OP_ADD;
> } else
> if (!isFloatType(i->dType) && !i->subOp && !i->src(t).mod && !i->src(2).mod) {
> --
> 2.18.1