[Mesa-dev] [PATCH 7/8] nv50/ir: optimize ADD3(d, a, b, c) to ADD(d, c, a + b)

Samuel Pitoiset samuel.pitoiset at gmail.com
Thu Jun 30 22:26:57 UTC 2016


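When the first two sources of an integer ADD3 are immediates, fold them
into a single immediate so that the instruction becomes a two-source ADD.
This is similar to what we already do for MAD/FMA.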

Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
---
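For illustration only, not part of the patch: a minimal sketch of the
arithmetic identity the fold relies on, assuming 32-bit integers with
modular wraparound. The names add3 and foldedAdd are hypothetical and are
not nv50 IR functions.

```cpp
#include <cstdint>

// Hypothetical stand-in for a three-source integer add (not the nv50 IR API).
static uint32_t add3(uint32_t a, uint32_t b, uint32_t c)
{
   return a + b + c;
}

// Folded form: a and b are treated as compile-time immediates, so their sum
// is computed once and only a two-source add with c remains.
static uint32_t foldedAdd(uint32_t a, uint32_t b, uint32_t c)
{
   const uint32_t imm = a + b; // unsigned arithmetic wraps modulo 2^32
   return c + imm;
}
```

Because unsigned addition is associative modulo 2^32, foldedAdd(a, b, c)
matches add3(a, b, c) for every input, and the same bit pattern results
for signed operands under two's complement. This is presumably why the
patch handles TYPE_S32 and TYPE_U32 with one u32 computation.
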
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 517f779..552672f 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -567,6 +567,14 @@ ConstantFolding::expr(Instruction *i,
          return;
       }
       break;
+   case OP_ADD3:
+      switch (i->dType) {
+      case TYPE_S32:
+      case TYPE_U32: res.data.u32 = a->data.u32 + b->data.u32; break;
+      default:
+         return;
+      }
+      break;
    case OP_POW:
       switch (i->dType) {
       case TYPE_F32: res.data.f32 = pow(a->data.f32, b->data.f32); break;
@@ -683,7 +691,8 @@ ConstantFolding::expr(Instruction *i,
 
    switch (i->op) {
    case OP_MAD:
-   case OP_FMA: {
+   case OP_FMA:
+   case OP_ADD3: {
       ImmediateValue src0, src1 = *i->getSrc(0)->asImm();
 
       // Move the immediate into position 1, where we know it might be
-- 
2.8.3


