[Mesa-dev] [PATCH 4/5] i965/fs: Relax type matching rules in cmod propagation from MOV instructions

Ian Romanick idr at freedesktop.org
Wed Aug 29 18:40:29 UTC 2018


From: Ian Romanick <ian.d.romanick at intel.com>

Allow cmod propagation from a MOV in a sequence like:

    and(16)         g31<1>UD       g20<8,8,1>UD   g22<8,8,1>UD
    mov.nz.f0(16)   null<1>F       g31<8,8,1>D

A similar change to the vec4 backend had no effect.

Somewhere between c1ec5820593 and 40fc4b5acd6 (1,094 commits) the
effectiveness of this patch diminished.  Applying this on c1ec5820593
used to help 20 shaders on Gen7+ platforms.  I did not investigate
this further.

The SIMD8 and SIMD16 shaders in two UE4 demos are helped.

Skylake, Ivy Bridge, and Sandy Bridge had similar results. (Skylake shown)
total instructions in shared programs: 14304235 -> 14304227 (<.01%)
instructions in affected programs: 1956 -> 1948 (-0.41%)
helped: 4
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.41% max: 0.41% x̄: 0.41% x̃: 0.41%
95% mean confidence interval for instructions value: -2.00 -2.00
95% mean confidence interval for instructions %-change: -0.41% -0.41%
Instructions are helped.

total cycles in shared programs: 527531092 -> 527530920 (<.01%)
cycles in affected programs: 92474 -> 92302 (-0.19%)
helped: 4
HURT: 0
helped stats (abs) min: 32 max: 54 x̄: 43.00 x̃: 43
helped stats (rel) min: 0.15% max: 0.21% x̄: 0.18% x̃: 0.18%
95% mean confidence interval for cycles value: -63.21 -22.79
95% mean confidence interval for cycles %-change: -0.24% -0.13%
Cycles are helped.

Haswell and Broadwell had similar results. (Broadwell shown)
total instructions in shared programs: 14615704 -> 14615700 (<.01%)
instructions in affected programs: 990 -> 986 (-0.40%)
helped: 2
HURT: 0

total cycles in shared programs: 554530624 -> 554530532 (<.01%)
cycles in affected programs: 42044 -> 41952 (-0.22%)
helped: 2
HURT: 0

No changes on Iron Lake or GM45.

Signed-off-by: Ian Romanick <ian.d.romanick at intel.com>
---
 src/intel/compiler/brw_fs_cmod_propagation.cpp | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)
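
For readers following the type check outside the Mesa tree, the relaxed
rule can be sketched as a standalone predicate.  This is only an
illustration and not the Mesa code: the RegType and Opcode enums and the
names is_d_or_ud() and types_allow_cmod_propagation() are hypothetical
stand-ins for brw_reg_type, the opcode enum, and the inline check in the
diff below.  It models the type test alone; the other conditions in
opt_cmod_propagation_local still apply.

```cpp
#include <cassert>

/* Hypothetical, reduced stand-ins for the real Mesa enums. */
enum class RegType { D, UD, F };
enum class Opcode { MOV, CMP, AND };

static bool
is_d_or_ud(RegType t)
{
   return t == RegType::D || t == RegType::UD;
}

/* Returns true when the type rules do not block propagation from
 * "inst" (the instruction carrying the conditional modifier) to
 * "scan_inst" (the earlier instruction that wrote inst's source).
 */
static bool
types_allow_cmod_propagation(Opcode inst_opcode,
                             RegType inst_dst,
                             RegType inst_src0,
                             RegType scan_dst)
{
   /* Matching destination types were always acceptable. */
   if (scan_dst == inst_dst)
      return true;

   /* New rule: a MOV may differ in destination type as long as it
    * reads D or UD and the earlier instruction wrote D or UD.
    */
   if (inst_opcode == Opcode::MOV)
      return is_d_or_ud(inst_src0) && is_d_or_ud(scan_dst);

   /* Old rule for everything else: mismatched types are fine only
    * if neither side is float.
    */
   return scan_dst != RegType::F && inst_dst != RegType::F;
}
```

With these stand-ins, the and/mov pair from the commit message maps to
types_allow_cmod_propagation(Opcode::MOV, RegType::F, RegType::D,
RegType::UD), which returns true.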

diff --git a/src/intel/compiler/brw_fs_cmod_propagation.cpp b/src/intel/compiler/brw_fs_cmod_propagation.cpp
index 5b74f267359..17abcf05d8a 100644
--- a/src/intel/compiler/brw_fs_cmod_propagation.cpp
+++ b/src/intel/compiler/brw_fs_cmod_propagation.cpp
@@ -248,10 +248,25 @@ opt_cmod_propagation_local(const gen_device_info *devinfo, bblock_t *block)
                break;
 
             /* Comparisons operate differently for ints and floats */
-            if (scan_inst->dst.type != inst->dst.type &&
-                (scan_inst->dst.type == BRW_REGISTER_TYPE_F ||
-                 inst->dst.type == BRW_REGISTER_TYPE_F))
-               break;
+            if (scan_inst->dst.type != inst->dst.type) {
+               /* We should propagate from a MOV to another instruction in a
+                * sequence like:
+                *
+                *    and(16)         g31<1>UD       g20<8,8,1>UD   g22<8,8,1>UD
+                *    mov.nz.f0(16)   null<1>F       g31<8,8,1>D
+                */
+               if (inst->opcode == BRW_OPCODE_MOV) {
+                  if ((inst->src[0].type != BRW_REGISTER_TYPE_D &&
+                       inst->src[0].type != BRW_REGISTER_TYPE_UD) ||
+                      (scan_inst->dst.type != BRW_REGISTER_TYPE_D &&
+                       scan_inst->dst.type != BRW_REGISTER_TYPE_UD)) {
+                     break;
+                  }
+               } else if (scan_inst->dst.type == BRW_REGISTER_TYPE_F ||
+                          inst->dst.type == BRW_REGISTER_TYPE_F) {
+                  break;
+               }
+            }
 
             /* If the instruction generating inst's source also wrote the
              * flag, and inst is doing a simple .nz comparison, then inst
-- 
2.14.4


