[Mesa-dev] [PATCH 3/3] i965/fs: Recognize and emit ld_lz, sample_lz, sample_c_lz.

Matt Turner mattst88 at gmail.com
Wed May 4 22:54:14 UTC 2016


Ken suggested that, instead of a big and complicated optimization pass,
we just recognize the operations here. It's certainly less code and a lot
prettier, but it seems to actually perform worse than the optimization
pass, for currently unknown reasons.

total instructions in shared programs: 8514403 -> 8495373 (-0.22%)
instructions in affected programs: 809512 -> 790482 (-2.35%)
helped: 3316
HURT: 10

total cycles in shared programs: 64259830 -> 63979404 (-0.44%)
cycles in affected programs: 10511460 -> 10231034 (-2.67%)
helped: 2339
HURT: 771

total spills in shared programs: 1707 -> 1695 (-0.70%)
spills in affected programs: 90 -> 78 (-13.33%)
helped: 4

total fills in shared programs: 2647 -> 2620 (-1.02%)
fills in affected programs: 174 -> 147 (-15.52%)
helped: 4

LOST:   12
GAINED: 36
---
Here's a word diff of the shader-db results from the previous series (before)
compared with this series (after):

diff --git a/before b/after
index e856a62..e2de982 100644
--- a/before
+++ b/after
@@ -1,27 +1,27 @@
total instructions in shared programs: [-8522163-]{+8514403+} -> [-8502879 (-0.23%)-]{+8495373 (-0.22%)+}
instructions in affected programs: [-801772-]{+809512+} -> [-782488 (-2.41%)-]{+790482 (-2.35%)+}
helped: 3316
HURT: [-0-]{+10+}

total cycles in shared programs: [-64374090-]{+64259830+} -> [-64070042 (-0.47%)-]{+63979404 (-0.44%)+}
cycles in affected programs: [-10100336-]{+10511460+} -> [-9796288 (-3.01%)-]{+10231034 (-2.67%)+}
helped: [-2351-]{+2339+}
HURT: [-735-]{+771+}

total loops in shared programs: [-2209-]{+2199+} -> [-2209-]{+2199+} (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total spills in shared programs: 1707 -> [-1683 (-1.41%)-]{+1695 (-0.70%)+}
spills in affected programs: 90 -> [-66 (-26.67%)-]{+78 (-13.33%)+}
helped: 4
HURT: 0

total fills in shared programs: 2647 -> [-2606 (-1.55%)-]{+2620 (-1.02%)+}
fills in affected programs: 174 -> [-133 (-23.56%)-]{+147 (-15.52%)+}
helped: 4
HURT: 0

LOST:   [-2-]{+12+}
GAINED: 36
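
For anyone who wants the selection rule without reading the lowering code,
here is a minimal standalone sketch of what the patch recognizes. It is not
the driver code: the Opcode enum, the Lowered struct, and lower_lod() are
hypothetical stand-ins, and the "LOD is a known zero" test is reduced to a
plain flag where the real code calls lod.is_zero(). Only the rule itself
(gen >= 9, TXL or TXF, zero LOD: switch to the _LZ opcode and leave the LOD
out of the payload) is taken from the diff below.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the driver's opcode enum; only the names
// mirror the opcodes touched by the patch.
enum class Opcode { TXB, TXL, TXL_LZ, TXF, TXF_LZ };

// Hypothetical result type: the chosen opcode plus the LOD/bias values
// that would be MOVed into the message payload.
struct Lowered {
   Opcode op;
   std::vector<float> payload;
};

// Choose the *_LZ variant and drop the LOD from the payload when the LOD
// is a known constant zero on gen >= 9. TXB is never converted, since its
// operand is a bias rather than an explicit LOD.
Lowered lower_lod(int gen, Opcode op, float lod, bool lod_is_constant)
{
   Lowered out{op, {}};
   const bool lod_is_zero = lod_is_constant && lod == 0.0f;

   if (gen >= 9 && lod_is_zero) {
      if (op == Opcode::TXL) {
         out.op = Opcode::TXL_LZ;
         return out;               // no LOD in the payload
      }
      if (op == Opcode::TXF) {
         out.op = Opcode::TXF_LZ;
         return out;               // no LOD in the payload
      }
   }

   // Everything else keeps its opcode and still carries the LOD/bias.
   out.payload.push_back(lod);
   return out;
}
```

With this sketch, lower_lod(9, Opcode::TXL, 0.0f, true) yields TXL_LZ with
an empty payload, while the same call with gen 8 keeps TXL and one payload
entry.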


 src/mesa/drivers/dri/i965/brw_fs.cpp | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index d233265..202e0f7 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -4096,6 +4096,10 @@ lower_sampler_logical_send_gen7(const fs_builder &bld, fs_inst *inst, opcode op,
    switch (op) {
    case FS_OPCODE_TXB:
    case SHADER_OPCODE_TXL:
+      if (devinfo->gen >= 9 && op == SHADER_OPCODE_TXL && lod.is_zero()) {
+         op = SHADER_OPCODE_TXL_LZ;
+         break;
+      }
       bld.MOV(sources[length], lod);
       length++;
       break;
@@ -4147,8 +4151,12 @@ lower_sampler_logical_send_gen7(const fs_builder &bld, fs_inst *inst, opcode op,
          length++;
       }
 
-      bld.MOV(retype(sources[length], BRW_REGISTER_TYPE_D), lod);
-      length++;
+      if (devinfo->gen >= 9 && lod.is_zero()) {
+         op = SHADER_OPCODE_TXF_LZ;
+      } else {
+         bld.MOV(retype(sources[length], BRW_REGISTER_TYPE_D), lod);
+         length++;
+      }
 
       for (unsigned i = devinfo->gen >= 9 ? 2 : 1; i < coord_components; i++) {
          bld.MOV(retype(sources[length], BRW_REGISTER_TYPE_D), coordinate);
-- 
2.7.3


