Mesa (main): intel/fs: Remove condition-based restriction for cmod propagation to saturated operations

Mon Aug 30 21:11:44 UTC 2021

Module: Mesa
Branch: main
Commit: e6373923a764e11b6462de6abea985061ed394ba
URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=e6373923a764e11b6462de6abea985061ed394ba

Author: Ian Romanick <ian.d.romanick at intel.com>
Date:   Fri Jul 16 18:15:50 2021 -0700

intel/fs: Remove condition-based restriction for cmod propagation to saturated operations

I don't know why the float_saturate_l_mov test was #if'ed out, but it
passes... so this commit enables it.

No shader-db or fossil-db changes.

In a previous iteration of this MR, this commit helped ~200 shaders in
shader-db.  Now all of those same shaders are helped by "intel/fs: cmod
propagate from MOV with any condition".  All of these shaders come from
Mad Max.  After initial translation from NIR to assembly, these shader
contain patterns like:

    mul(8)            g90<1>F       g88<8,8,1>F     0x40400000F  /* 3F */
    ...
    mov.sat(8)        g90<1>F       g90<8,8,1>F
    ...
    cmp.nz.f0(8)      null<1>F      g90<8,8,1>F     0 /* 0F */

An initial pass of cmod propagation converts this to

    mul(8)            g90<1>F       g88<8,8,1>F     0x40400000F  /* 3F */
    ...
    mov.sat.XX.f0(8)  g90<1>F       g90<8,8,1>F

Without this commit, XX is G.  With this commit, XX is NZ.  Saturate
propagation moves the saturate:

    mul.sat(8)        g90<1>F       g88<8,8,1>F     0x40400000F  /* 3F */
    ...
    mov.XX.f0(8)      g90<1>F       g90<8,8,1>F

Without this commit (but with "intel/fs: cmod propagate from MOV with
any condition"), the G gets propagated:

    mul.sat.g.f0(8)   g90<1>F       g88<8,8,1>F     0x40400000F  /* 3F */

With this commit (with or without "intel/fs: cmod propagate from MOV
with any condition"), the NZ gets propagated:

    mul.sat.nz.f0(8)  g90<1>F       g88<8,8,1>F     0x40400000F  /* 3F */

Reviewed-by: Marcin Ślusarz <marcin.slusarz at intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>

---

 src/intel/compiler/brw_fs_cmod_propagation.cpp  | 33 ++++------
 src/intel/compiler/test_fs_cmod_propagation.cpp | 82 ++++++++++++-------------
 2 files changed, 52 insertions(+), 63 deletions(-)

diff --git a/src/intel/compiler/brw_fs_cmod_propagation.cpp b/src/intel/compiler/brw_fs_cmod_propagation.cpp
index 3303bc8659c..4bc76caf1ed 100644
--- a/src/intel/compiler/brw_fs_cmod_propagation.cpp
+++ b/src/intel/compiler/brw_fs_cmod_propagation.cpp
@@ -499,32 +499,28 @@ opt_cmod_propagation_local(const intel_device_info *devinfo, bblock_t *block)
                inst->src[0].negate ? brw_swap_cmod(inst->conditional_mod)
                                    : inst->conditional_mod;
 
-            /* From the Sky Lake PRM Vol. 7 "Assigning Conditional Mods":
+            /* From the Kaby Lake PRM Vol. 7 "Assigning Conditional Flags":
              *
              *    * Note that the [post condition signal] bits generated at
              *      the output of a compute are before the .sat.
              *
-             * This limits the cases where we can propagate the conditional
-             * modifier.  If scan_inst has a saturate modifier, then we can
-             * only propagate from inst if inst is 'scan_inst <= 0',
-             * 'scan_inst == 0', 'scan_inst != 0', or 'scan_inst > 0'.  If
-             * inst is 'scan_inst == 0', the conditional modifier must be
-             * replace with LE.  Likewise, if inst is 'scan_inst != 0', the
-             * conditional modifier must be replace with G.
+             * Paragraph about post_zero does not mention saturation, but
+             * testing it on actual GPUs shows that conditional modifiers are
+             * applied after saturation.
              *
-             * The only other cases are 'scan_inst < 0' (which is a
-             * contradiction) and 'scan_inst >= 0' (which is a tautology).
+             *    * post_zero bit: This bit reflects whether the final
+             *      result is zero after all the clamping, normalizing,
+             *      or format conversion logic.
+             *
+             * Conditional modifiers can be applied to floating point
+             * calculations without restriction.  Some cases (e.g., .L) are
+             * nonsensical, but those should have been eliminated by the NIR
+             * optimizer.
              */
             if (scan_inst->saturate) {
                if (scan_inst->dst.type != BRW_REGISTER_TYPE_F)
                   break;
 
-               if (cond != BRW_CONDITIONAL_Z &&
-                   cond != BRW_CONDITIONAL_NZ &&
-                   cond != BRW_CONDITIONAL_LE &&
-                   cond != BRW_CONDITIONAL_G)
-                  break;
-
                assert(inst->opcode == BRW_OPCODE_MOV ||
                       inst->opcode == BRW_OPCODE_CMP);
 
@@ -532,11 +528,6 @@ opt_cmod_propagation_local(const intel_device_info *devinfo, bblock_t *block)
                 * against possible future changes in this code.
                 */
                assert(inst->opcode != BRW_OPCODE_CMP || inst->src[1].is_zero());
-
-               if (cond == BRW_CONDITIONAL_Z)
-                  cond = BRW_CONDITIONAL_LE;
-               else if (cond == BRW_CONDITIONAL_NZ)
-                  cond = BRW_CONDITIONAL_G;
             }
 
             /* Otherwise, try propagating the conditional. */
diff --git a/src/intel/compiler/test_fs_cmod_propagation.cpp b/src/intel/compiler/test_fs_cmod_propagation.cpp
index 3bff88ec973..c7b71131940 100644
--- a/src/intel/compiler/test_fs_cmod_propagation.cpp
+++ b/src/intel/compiler/test_fs_cmod_propagation.cpp
@@ -2187,8 +2187,8 @@ cmod_propagation_test::test_saturate_prop(enum brw_conditional_mod before,
 
 TEST_F(cmod_propagation_test, float_saturate_nz_cmp)
 {
-   /* With the saturate modifier, the comparison happens before clamping to
-    * [0, 1].  (sat(x) != 0) == (x > 0).
+   /* With the saturate modifier, the comparison happens after clamping to
+    * [0, 1].
     *
     * = Before =
     *
@@ -2196,9 +2196,9 @@ TEST_F(cmod_propagation_test, float_saturate_nz_cmp)
     * 1: cmp.nz.f0(8)  null  dest  0.0f
     *
     * = After =
-    * 0: add.sat.g.f0(8)  dest  src0  src1
+    * 0: add.sat.nz.f0(8)  dest  src0  src1
     */
-   test_saturate_prop(BRW_CONDITIONAL_NZ, BRW_CONDITIONAL_G,
+   test_saturate_prop(BRW_CONDITIONAL_NZ, BRW_CONDITIONAL_NZ,
                       BRW_OPCODE_CMP,
                       BRW_REGISTER_TYPE_F, BRW_REGISTER_TYPE_F,
                       true);
@@ -2206,8 +2206,8 @@ TEST_F(cmod_propagation_test, float_saturate_nz_cmp)
 
 TEST_F(cmod_propagation_test, float_saturate_nz_mov)
 {
-   /* With the saturate modifier, the comparison happens before clamping to
-    * [0, 1].  (sat(x) != 0) == (x > 0).
+   /* With the saturate modifier, the comparison happens after clamping to
+    * [0, 1].
     *
     * = Before =
     *
@@ -2215,9 +2215,9 @@ TEST_F(cmod_propagation_test, float_saturate_nz_mov)
     * 1: mov.nz.f0(8)  null  dest
     *
     * = After =
-    * 0: add.sat.g.f0(8)  dest  src0  src1
+    * 0: add.sat.nz.f0(8)  dest  src0  src1
     */
-   test_saturate_prop(BRW_CONDITIONAL_NZ, BRW_CONDITIONAL_G,
+   test_saturate_prop(BRW_CONDITIONAL_NZ, BRW_CONDITIONAL_NZ,
                       BRW_OPCODE_MOV,
                       BRW_REGISTER_TYPE_F, BRW_REGISTER_TYPE_F,
                       true);
@@ -2225,8 +2225,8 @@ TEST_F(cmod_propagation_test, float_saturate_nz_mov)
 
 TEST_F(cmod_propagation_test, float_saturate_z_cmp)
 {
-   /* With the saturate modifier, the comparison happens before clamping to
-    * [0, 1].  (sat(x) == 0) == (x <= 0).
+   /* With the saturate modifier, the comparison happens after clamping to
+    * [0, 1].
     *
     * = Before =
     *
@@ -2234,9 +2234,9 @@ TEST_F(cmod_propagation_test, float_saturate_z_cmp)
     * 1: cmp.z.f0(8)   null  dest  0.0f
     *
     * = After =
-    * 0: add.sat.le.f0(8)  dest  src0  src1
+    * 0: add.sat.z.f0(8)  dest  src0  src1
     */
-   test_saturate_prop(BRW_CONDITIONAL_Z, BRW_CONDITIONAL_LE,
+   test_saturate_prop(BRW_CONDITIONAL_Z, BRW_CONDITIONAL_Z,
                       BRW_OPCODE_CMP,
                       BRW_REGISTER_TYPE_F, BRW_REGISTER_TYPE_F,
                       true);
@@ -2244,8 +2244,8 @@ TEST_F(cmod_propagation_test, float_saturate_z_cmp)
 
 TEST_F(cmod_propagation_test, float_saturate_z_mov)
 {
-   /* With the saturate modifier, the comparison happens before clamping to
-    * [0, 1].  (sat(x) == 0) == (x <= 0).
+   /* With the saturate modifier, the comparison happens after clamping to
+    * [0, 1].
     *
     * = Before =
     *
@@ -2253,9 +2253,9 @@ TEST_F(cmod_propagation_test, float_saturate_z_mov)
     * 1: mov.z.f0(8)   null  dest
     *
     * = After =
-    * 0: add.sat.le.f0(8)  dest  src0  src1
+    * 0: add.sat.z.f0(8) dest  src0  src1
     */
-   test_saturate_prop(BRW_CONDITIONAL_Z, BRW_CONDITIONAL_LE,
+   test_saturate_prop(BRW_CONDITIONAL_Z, BRW_CONDITIONAL_Z,
                       BRW_OPCODE_MOV,
                       BRW_REGISTER_TYPE_F, BRW_REGISTER_TYPE_F,
                       true);
@@ -2263,8 +2263,8 @@ TEST_F(cmod_propagation_test, float_saturate_z_mov)
 
 TEST_F(cmod_propagation_test, float_saturate_g_cmp)
 {
-   /* With the saturate modifier, the comparison happens before clamping to
-    * [0, 1].  (sat(x) > 0) == (x > 0).
+   /* With the saturate modifier, the comparison happens after clamping to
+    * [0, 1].
     *
     * = Before =
     *
@@ -2282,8 +2282,8 @@ TEST_F(cmod_propagation_test, float_saturate_g_cmp)
 
 TEST_F(cmod_propagation_test, float_saturate_g_mov)
 {
-   /* With the saturate modifier, the comparison happens before clamping to
-    * [0, 1].  (sat(x) > 0) == (x > 0).
+   /* With the saturate modifier, the comparison happens after clamping to
+    * [0, 1].
     *
     * = Before =
     *
@@ -2301,8 +2301,8 @@ TEST_F(cmod_propagation_test, float_saturate_g_mov)
 
 TEST_F(cmod_propagation_test, float_saturate_le_cmp)
 {
-   /* With the saturate modifier, the comparison happens before clamping to
-    * [0, 1].  (sat(x) <= 0) == (x <= 0).
+   /* With the saturate modifier, the comparison happens after clamping to
+    * [0, 1].
     *
     * = Before =
     *
@@ -2320,7 +2320,7 @@ TEST_F(cmod_propagation_test, float_saturate_le_cmp)
 
 TEST_F(cmod_propagation_test, float_saturate_le_mov)
 {
-   /* With the saturate modifier, the comparison happens before clamping to
+   /* With the saturate modifier, the comparison happens after clamping to
     * [0, 1].  (sat(x) <= 0) == (x <= 0).
     *
     * = Before =
@@ -2339,8 +2339,8 @@ TEST_F(cmod_propagation_test, float_saturate_le_mov)
 
 TEST_F(cmod_propagation_test, float_saturate_l_cmp)
 {
-   /* With the saturate modifier, the comparison happens before clamping to
-    * [0, 1].  There is no before / after equivalence for (sat(x) < 0).
+   /* With the saturate modifier, the comparison happens after clamping to
+    * [0, 1].
     *
     * = Before =
     *
@@ -2348,39 +2348,37 @@ TEST_F(cmod_propagation_test, float_saturate_l_cmp)
     * 1: cmp.l.f0(8)  null  dest  0.0f
     *
     * = After =
-    * No change
+    * 0: add.sat.l.f0(8)  dest  src0  src1
     */
    test_saturate_prop(BRW_CONDITIONAL_L, BRW_CONDITIONAL_L,
                       BRW_OPCODE_CMP,
                       BRW_REGISTER_TYPE_F, BRW_REGISTER_TYPE_F,
-                      false);
+                      true);
 }
 
-#if 0
 TEST_F(cmod_propagation_test, float_saturate_l_mov)
 {
-   /* With the saturate modifier, the comparison happens before clamping to
-    * [0, 1].  There is no before / after equivalence for (sat(x) < 0).
+   /* With the saturate modifier, the comparison happens after clamping to
+    * [0, 1].
     *
     * = Before =
     *
     * 0: add.sat(8)    dest  src0  src1
-    * 1: mov.l.f0(8)  null  dest  0.0f
+    * 1: mov.l.f0(8)   null  dest
     *
     * = After =
-    * No change
+    * 0: add.sat.l.f0(8)    dest  src0  src1
     */
    test_saturate_prop(BRW_CONDITIONAL_L, BRW_CONDITIONAL_L,
                       BRW_OPCODE_MOV,
                       BRW_REGISTER_TYPE_F, BRW_REGISTER_TYPE_F,
-                      false);
+                      true);
 }
-#endif
 
 TEST_F(cmod_propagation_test, float_saturate_ge_cmp)
 {
-   /* With the saturate modifier, the comparison happens before clamping to
-    * [0, 1].  There is no before / after equivalence for (sat(x) >= 0).
+   /* With the saturate modifier, the comparison happens after clamping to
+    * [0, 1].
     *
     * = Before =
     *
@@ -2388,31 +2386,31 @@ TEST_F(cmod_propagation_test, float_saturate_ge_cmp)
     * 1: cmp.ge.f0(8)  null  dest  0.0f
     *
     * = After =
-    * No change
+    * 0: add.sat.ge.f0(8)  dest  src0  src1
     */
    test_saturate_prop(BRW_CONDITIONAL_GE, BRW_CONDITIONAL_GE,
                       BRW_OPCODE_CMP,
                       BRW_REGISTER_TYPE_F, BRW_REGISTER_TYPE_F,
-                      false);
+                      true);
 }
 
 TEST_F(cmod_propagation_test, float_saturate_ge_mov)
 {
    /* With the saturate modifier, the comparison happens before clamping to
-    * [0, 1].  There is no before / after equivalence for (sat(x) >= 0).
+    * [0, 1].
     *
     * = Before =
     *
     * 0: add.sat(8)    dest  src0  src1
-    * 1: mov.ge.f0(8)  null  dest  0.0f
+    * 1: mov.ge.f0(8)  null  dest
     *
     * = After =
-    * No change
+    * 0: add.sat.ge.f0(8)    dest  src0  src1
     */
    test_saturate_prop(BRW_CONDITIONAL_GE, BRW_CONDITIONAL_GE,
                       BRW_OPCODE_MOV,
                       BRW_REGISTER_TYPE_F, BRW_REGISTER_TYPE_F,
-                      false);
+                      true);
 }
 
 TEST_F(cmod_propagation_test, int_saturate_nz_cmp)