Mesa (nv-xmad-v5-rebased): nv50/ir: optimize multiplication by 16-bit immediates into two xmads
GitLab Mirror
gitlab-mirror at kemper.freedesktop.org
Mon Aug 27 12:58:28 UTC 2018
Module: Mesa
Branch: nv-xmad-v5-rebased
Commit: d27c7918916cdc8092959124955f887592e37d72
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=d27c7918916cdc8092959124955f887592e37d72
Author: Rhys Perry <pendingchaos02 at gmail.com>
Date: Sat Aug 18 15:06:50 2018 +0100
nv50/ir: optimize multiplication by 16-bit immediates into two xmads
Rather than the usual three that would be created.
total instructions in shared programs : 5796385 -> 5786560 (-0.17%)
total gprs used in shared programs    : 670103 -> 669968 (-0.02%)
total shared used in shared programs  : 548832 -> 548832 (0.00%)
total local used in shared programs   : 21164 -> 21068 (-0.45%)

                local     shared        gpr       inst      bytes
    helped           1          0         64       1040       1040
      hurt           0          0         27          0          0
Signed-off-by: Rhys Perry <pendingchaos02 at gmail.com>
Reviewed-by: Karol Herbst <kherbst at redhat.com>
---
src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 1ab743705a..ecb4bae2a8 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -997,6 +997,16 @@ ConstantFolding::createMul(DataType ty, Value *def, Value *a, int64_t b, Value *
       return true;
    }
+   if (typeSizeof(ty) == 4 && b >= 0 && b <= 0xffff &&
+       target->isOpSupported(OP_XMAD, TYPE_U32)) {
+      Value *tmp = bld.mkOp3v(OP_XMAD, TYPE_U32, bld.getSSA(),
+                              a, bld.mkImm((uint32_t)b), c ? c : bld.mkImm(0));
+      bld.mkOp3(OP_XMAD, TYPE_U32, def, a, bld.mkImm((uint32_t)b), tmp)->subOp =
+         NV50_IR_SUBOP_XMAD_PSL | NV50_IR_SUBOP_XMAD_H1(0);
+
+      return true;
+   }
+
    return false;
 }
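
For readers unfamiliar with XMAD, the arithmetic behind the two-instruction sequence can be sketched in plain C++. This is not Mesa code: xmad_model, mul_imm16 and check_mul_imm16 are hypothetical names, and the model assumes that XMAD multiplies 16-bit halves of its first two sources and adds a 32-bit third source, that H1(0) selects the high half of the first source, and that PSL shifts the product left by 16 before the add. Under those assumptions, with a = a_hi * 2^16 + a_lo and 0 <= b <= 0xffff, a * b + c = a_lo * b + c + ((a_hi * b) << 16) modulo 2^32.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical software model of a 16x16+32 multiply-add (an assumption
// about XMAD's behaviour, not taken from the patch).
//   hi_a: use the upper 16 bits of a (modelling the H1(0) sub-op)
//   psl:  shift the product left by 16 before the add (modelling PSL)
static uint32_t xmad_model(uint32_t a, uint32_t b, uint32_t c,
                           bool hi_a = false, bool psl = false)
{
   uint32_t a16 = hi_a ? (a >> 16) : (a & 0xffff);
   uint32_t b16 = b & 0xffff;
   uint32_t prod = a16 * b16;   // at most 0xfffe0001, fits in 32 bits
   if (psl)
      prod <<= 16;              // unsigned shift, wraps modulo 2^32
   return prod + c;             // unsigned add, wraps modulo 2^32
}

// Mirrors the shape of the two instructions the patch emits, for an
// immediate b in the range 0..0xffff and an optional addend c.
static uint32_t mul_imm16(uint32_t a, uint32_t b, uint32_t c = 0)
{
   uint32_t tmp = xmad_model(a, b, c);         // a_lo * b + c
   return xmad_model(a, b, tmp, true, true);   // ((a_hi * b) << 16) + tmp
}

// Compares the model against an ordinary 32-bit multiply-add.
static void check_mul_imm16(uint32_t a, uint32_t b, uint32_t c)
{
   assert(b <= 0xffff);
   assert(mul_imm16(a, b, c) == a * b + c);
}
```

Calling check_mul_imm16(0x12345678u, 0xffffu, 0u) exercises both halves of a. The a_hi * b_hi term that a general 32x32 multiply would need is always zero here, since the high half of b is zero, which is consistent with the commit message's count of two XMADs instead of three.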