Mesa (main): nir/range_analysis: Teach range analysis about fdot opcodes

GitLab Mirror gitlab-mirror at kemper.freedesktop.org
Thu Jun 23 20:08:54 UTC 2022


Module: Mesa
Branch: main
Commit: 6689fa2ab4eae15fbd73bba250f42b3fe3b50a3f
URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=6689fa2ab4eae15fbd73bba250f42b3fe3b50a3f

Author: Ian Romanick <ian.d.romanick at intel.com>
Date:   Tue Jun 21 16:47:31 2022 -0700

nir/range_analysis: Teach range analysis about fdot opcodes

This really, really helps on platforms where fabs() isn't free.  A great
many shaders use a * frsq(fabs(fdot(a, a))) to normalize a vector.
Since the result of the fdot must be non-negative, the fabs can be
eliminated by an existing algebraic rule.

shader-db results:

r300 (run on R420 - X800XL)
total instructions in shared programs: 1369807 -> 1368550 (-0.09%)
instructions in affected programs: 59986 -> 58729 (-2.10%)
helped: 609
HURT: 0

total vinst in shared programs: 512899 -> 512861 (<.01%)
vinst in affected programs: 1522 -> 1484 (-2.50%)
helped: 36
HURT: 0

total sinst in shared programs: 260690 -> 260570 (-0.05%)
sinst in affected programs: 1419 -> 1299 (-8.46%)
helped: 120
HURT: 0

total consts in shared programs: 957295 -> 957230 (<.01%)
consts in affected programs: 849 -> 784 (-7.66%)
helped: 65
HURT: 0

LOST:   0
GAINED: 3

The 3 gained shaders are all vertex shaders from XCom: Enemy Unknown.
I'm guessing that game is never going to run on my X800XL. :)

i915
total instructions in shared programs: 791121 -> 780843 (-1.30%)
instructions in affected programs: 220170 -> 209892 (-4.67%)
helped: 2085
HURT: 0

total temps in shared programs: 47765 -> 47766 (<.01%)
temps in affected programs: 9 -> 10 (11.11%)
helped: 0
HURT: 1

total const in shared programs: 93048 -> 92983 (-0.07%)
const in affected programs: 784 -> 719 (-8.29%)
helped: 65
HURT: 0

LOST:   0
GAINED: 36

Haswell, Ivy Bridge, and Sandy Bridge had similar results. (Haswell shown)
total instructions in shared programs: 16702250 -> 16697908 (-0.03%)
instructions in affected programs: 119277 -> 114935 (-3.64%)
helped: 1065
HURT: 0
helped stats (abs) min: 1 max: 20 x̄: 4.08 x̃: 4
helped stats (rel) min: 0.48% max: 10.17% x̄: 3.66% x̃: 3.94%
95% mean confidence interval for instructions value: -4.26 -3.89
95% mean confidence interval for instructions %-change: -3.76% -3.56%
Instructions are helped.

total cycles in shared programs: 880772068 -> 880734134 (<.01%)
cycles in affected programs: 2134456 -> 2096522 (-1.78%)
helped: 941
HURT: 324
helped stats (abs) min: 2 max: 2180 x̄: 123.06 x̃: 44
helped stats (rel) min: 0.04% max: 49.96% x̄: 7.08% x̃: 3.81%
HURT stats (abs)   min: 2 max: 2098 x̄: 240.33 x̃: 35
HURT stats (rel)   min: 0.04% max: 77.07% x̄: 12.34% x̃: 3.00%
95% mean confidence interval for cycles value: -47.93 -12.04
95% mean confidence interval for cycles %-change: -2.87% -1.34%
Cycles are helped.

No shader-db changes on any other Intel platform.

Reviewed-by: Jason Ekstrand <jason.ekstrand at collabora.com>
Reviewed-by: Emma Anholt <emma at anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17181>

---

 src/compiler/nir/nir_range_analysis.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/src/compiler/nir/nir_range_analysis.c b/src/compiler/nir/nir_range_analysis.c
index 46a7dc8b469..700159be2d4 100644
--- a/src/compiler/nir/nir_range_analysis.c
+++ b/src/compiler/nir/nir_range_analysis.c
@@ -1046,6 +1046,37 @@ analyze_expression(const nir_alu_instr *instr, unsigned src,
       r = (struct ssa_result_range){le_zero, false, true, false};
       break;
 
+   case nir_op_fdot2:
+   case nir_op_fdot3:
+   case nir_op_fdot4:
+   case nir_op_fdot8:
+   case nir_op_fdot16:
+   case nir_op_fdot2_replicated:
+   case nir_op_fdot3_replicated:
+   case nir_op_fdot4_replicated:
+   case nir_op_fdot8_replicated:
+   case nir_op_fdot16_replicated: {
+      const struct ssa_result_range left =
+         analyze_expression(alu, 0, ht, nir_alu_src_type(alu, 0));
+
+      /* If the two sources are the same SSA value, then the result is either
+       * NaN or some number >= 0.  If one source is the negation of the other,
+       * the result is either NaN or some number <= 0.
+       *
+       * In either of these two cases, if one source is a number, then the
+       * other must also be a number.  Since it should not be possible to get
+       * Inf-Inf in the dot-product, the result must also be a number.
+       */
+      if (nir_alu_srcs_equal(alu, alu, 0, 1)) {
+         r = (struct ssa_result_range){ge_zero, false, left.is_a_number, false };
+      } else if (nir_alu_srcs_negative_equal(alu, alu, 0, 1)) {
+         r = (struct ssa_result_range){le_zero, false, left.is_a_number, false };
+      } else {
+         r = (struct ssa_result_range){unknown, false, false, false};
+      }
+      break;
+   }
+
    case nir_op_fpow: {
       /* Due to flush-to-zero semanatics of floating-point numbers with very
        * small mangnitudes, we can never really be sure a result will be



More information about the mesa-commit mailing list