[Mesa-dev] [PATCH] nv50/ir: swap the least-ref'd source into src1 when both const/imm

Ilia Mirkin imirkin at alum.mit.edu
Mon Jan 18 01:14:34 PST 2016


The whole point of inlining sources is to reduce loads. We can end up in
a situation where one value is used a lot of times, and one value is
used only once per instruction. The once-per-instruction one is the one
that should get inlined, but with the previous algorithm, it was given
no preference.

This flips things around to preferring putting less-referenced values
into src1 which increases the likelihood of them being inlined.

While we're at it, adjust the heuristic to not treat 0 as an immediate,
as well as (effectively) check for situations where LIMMs can't be
loaded. All this yields improvements on nvc0:

total instructions in shared programs : 6261157 -> 6255985 (-0.08%)
total gprs used in shared programs    : 945082 -> 943417 (-0.18%)
total local used in shared programs   : 30372 -> 30288 (-0.28%)
total bytes used in shared programs   : 50089256 -> 50047880 (-0.08%)

                local        gpr       inst      bytes
    helped          21         822        3332        3332
      hurt           0         278         565         565

And more importantly avoids generating really bad code with SSBOs, where
we end up checking a lot of different values (usually immediates) against
the length.

On nv50 the effect is less pronounced, although it does seem to, on
average, pack the instructions better (instructions went up but bytes
went down):

total instructions in shared programs : 6346564 -> 6347676 (0.02%)
total gprs used in shared programs    : 728719 -> 725883 (-0.39%)
total local used in shared programs   : 3552 -> 3552 (0.00%)
total bytes used in shared programs   : 43995688 -> 43977840 (-0.04%)

                local        gpr       inst      bytes
    helped           0        1326        3111        3677
      hurt           0         868        3572        3108

Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 26 +++++++++++++---------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index f5c590e..4d82938 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -171,7 +171,10 @@ LoadPropagation::isImmdLoad(Instruction *ld)
    if (!ld || (ld->op != OP_MOV) ||
        ((typeSizeof(ld->dType) != 4) && (typeSizeof(ld->dType) != 8)))
       return false;
-   return ld->src(0).getFile() == FILE_IMMEDIATE;
+
+   // A 0 can be replaced with a register, so it doesn't count as an immediate.
+   ImmediateValue val;
+   return ld->src(0).getImmediate(val) && !val.isInteger(0);
 }
 
 bool
@@ -187,7 +190,8 @@ LoadPropagation::isAttribOrSharedLoad(Instruction *ld)
 void
 LoadPropagation::checkSwapSrc01(Instruction *insn)
 {
-   if (!prog->getTarget()->getOpInfo(insn).commutative)
+   const Target *targ = prog->getTarget();
+   if (!targ->getOpInfo(insn).commutative)
       if (insn->op != OP_SET && insn->op != OP_SLCT)
          return;
    if (insn->src(1).getFile() != FILE_GPR)
@@ -196,14 +200,16 @@ LoadPropagation::checkSwapSrc01(Instruction *insn)
    Instruction *i0 = insn->getSrc(0)->getInsn();
    Instruction *i1 = insn->getSrc(1)->getInsn();
 
-   if (isCSpaceLoad(i0)) {
-      if (!isCSpaceLoad(i1))
-         insn->swapSources(0, 1);
-      else
-         return;
-   } else
-   if (isImmdLoad(i0)) {
-      if (!isCSpaceLoad(i1) && !isImmdLoad(i1))
+   // Swap sources to inline the less frequently used source. That way,
+   // optimistically, it will eventually be able to remove the instruction.
+   int i0refs = insn->getSrc(0)->refCount();
+   int i1refs = insn->getSrc(1)->refCount();
+
+   if (isCSpaceLoad(i0) || isImmdLoad(i0)) {
+      if (targ->insnCanLoad(insn, 1, i0) && (
+                (!isImmdLoad(i1) && !isCSpaceLoad(i1)) ||
+                !targ->insnCanLoad(insn, 1, i1) ||
+                i0refs < i1refs))
          insn->swapSources(0, 1);
       else
          return;
-- 
2.4.10



More information about the mesa-dev mailing list