Mesa (staging/21.3): pan/bi: Lower swizzles on CSEL.i32/MUX.i32

Sun Feb 20 17:40:41 UTC 2022

Module: Mesa
Branch: staging/21.3
Commit: 1a1a4b5667f71db7cccd757940d3c2fbae4b6a56
URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=1a1a4b5667f71db7cccd757940d3c2fbae4b6a56

Author: Alyssa Rosenzweig <alyssa at collabora.com>
Date:   Fri Jul 23 16:49:02 2021 -0400

pan/bi: Lower swizzles on CSEL.i32/MUX.i32

This is counter-intuitive, but required for correct operation when
CSEL.i32 takes a 1-bit (stored 16-bit) boolean argument. The impedance
mismatch ultimately is between CSEL.b32 (nir's bcsel, nonexistent in the
hardware) and the lowering CSEL.i32. However, a similar problem exists
even with MUX.i32, which lacks a good way of zero/sign-extending
booleans.
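
To illustrate the failure mode, here is a minimal sketch in plain C. It is
not Mesa code: the helper names (reg_half, csel_i32_model, replicate_half)
are hypothetical, and the select is assumed to test its full 32-bit
condition against zero. It models a 16-bit boolean living in one half of a
32-bit register whose other half holds unrelated data.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: view a 32-bit register as two 16-bit halves. */
static uint16_t reg_half(uint32_t reg, unsigned half)
{
    assert(half < 2);
    return (uint16_t)(reg >> (16 * half));
}

/* Assumed model of a 32-bit select: the whole 32-bit condition is
 * compared against zero, so both halves take part in the test. */
static uint32_t csel_i32_model(uint32_t cond, uint32_t if_true,
                               uint32_t if_false)
{
    return (cond != 0) ? if_true : if_false;
}

/* Model of the lowered swizzle: copy the selected 16-bit half into both
 * halves, so a 32-bit test sees only the boolean. */
static uint32_t replicate_half(uint32_t reg, unsigned half)
{
    uint32_t h = reg_half(reg, half);
    return h | (h << 16);
}
```

With a false boolean (0x0000) in the upper half and stale data in the lower
half, csel_i32_model(0x0000ABCD, a, b) picks a, whereas
csel_i32_model(replicate_half(0x0000ABCD, 1), a, b) picks b as intended.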

Cherry-picked from my Valhall branch, though the issue also affects
Bifrost. Fixes piglit shaders@glsl-vs-if-bool on Bifrost.

Unfortunately, shader-db is quite unhappy :-(

The proper fix is to use lower_bool_to_bitsize, but that can't be
backported to mesa-stable.

total instructions in shared programs: 157539 -> 158953 (0.90%)
instructions in affected programs: 55621 -> 57035 (2.54%)
helped: 2
HURT: 259
helped stats (abs) min: 2.0 max: 2.0 x̄: 2.00 x̃: 2
helped stats (rel) min: 2.11% max: 2.67% x̄: 2.39% x̃: 2.39%
HURT stats (abs)   min: 1.0 max: 40.0 x̄: 5.47 x̃: 2
HURT stats (rel)   min: 0.36% max: 16.13% x̄: 2.55% x̃: 1.59%
95% mean confidence interval for instructions value: 4.44 6.40
95% mean confidence interval for instructions %-change: 2.21% 2.82%
Instructions are HURT.

total tuples in shared programs: 132322 -> 132907 (0.44%)
tuples in affected programs: 31806 -> 32391 (1.84%)
helped: 5
HURT: 152
helped stats (abs) min: 1.0 max: 2.0 x̄: 1.40 x̃: 1
helped stats (rel) min: 0.39% max: 3.03% x̄: 1.70% x̃: 1.61%
HURT stats (abs)   min: 1.0 max: 42.0 x̄: 3.89 x̃: 2
HURT stats (rel)   min: 0.29% max: 18.18% x̄: 2.50% x̃: 1.79%
95% mean confidence interval for tuples value: 2.88 4.58
95% mean confidence interval for tuples %-change: 1.87% 2.85%
Tuples are HURT.

total clauses in shared programs: 28672 -> 28698 (0.09%)
clauses in affected programs: 869 -> 895 (2.99%)
helped: 1
HURT: 24
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 5.88% max: 5.88% x̄: 5.88% x̃: 5.88%
HURT stats (abs)   min: 1.0 max: 2.0 x̄: 1.12 x̃: 1
HURT stats (rel)   min: 0.49% max: 33.33% x̄: 8.46% x̃: 3.59%
95% mean confidence interval for clauses value: 0.82 1.26
95% mean confidence interval for clauses %-change: 3.84% 11.93%
Clauses are HURT.

total cycles in shared programs: 15119.04 -> 15137.88 (0.12%)
cycles in affected programs: 922.87 -> 941.71 (2.04%)
helped: 4
HURT: 79
helped stats (abs) min: 0.0416669999999999 max: 0.0833330000000001 x̄: 0.05 x̃: 0
helped stats (rel) min: 0.40% max: 3.17% x̄: 1.57% x̃: 1.35%
HURT stats (abs)   min: 0.041665999999999315 max: 1.75 x̄: 0.24 x̃: 0
HURT stats (rel)   min: 0.30% max: 20.00% x̄: 2.83% x̃: 2.12%
95% mean confidence interval for cycles value: 0.17 0.29
95% mean confidence interval for cycles %-change: 1.86% 3.37%
Cycles are HURT.

total arith in shared programs: 4922.71 -> 4947.71 (0.51%)
arith in affected programs: 1423.79 -> 1448.79 (1.76%)
helped: 5
HURT: 177
helped stats (abs) min: 0.0416669999999999 max: 0.0833330000000001 x̄: 0.06 x̃: 0
helped stats (rel) min: 0.40% max: 3.17% x̄: 1.82% x̃: 1.67%
HURT stats (abs)   min: 0.041665999999999315 max: 1.75 x̄: 0.14 x̃: 0
HURT stats (rel)   min: 0.30% max: 22.22% x̄: 2.50% x̃: 1.52%
95% mean confidence interval for arith value: 0.11 0.17
95% mean confidence interval for arith %-change: 1.86% 2.90%
Arith are HURT.

total quadwords in shared programs: 120605 -> 120956 (0.29%)
quadwords in affected programs: 26535 -> 26886 (1.32%)
helped: 6
HURT: 143
helped stats (abs) min: 1.0 max: 7.0 x̄: 2.83 x̃: 1
helped stats (rel) min: 0.93% max: 6.33% x̄: 2.29% x̃: 1.71%
HURT stats (abs)   min: 1.0 max: 21.0 x̄: 2.57 x̃: 2
HURT stats (rel)   min: 0.34% max: 13.79% x̄: 2.02% x̃: 1.22%
95% mean confidence interval for quadwords value: 1.86 2.86
95% mean confidence interval for quadwords %-change: 1.45% 2.24%
Quadwords are HURT.

total threads in shared programs: 4670 -> 4669 (-0.02%)
threads in affected programs: 2 -> 1 (-50.00%)
helped: 0
HURT: 1

Signed-off-by: Alyssa Rosenzweig <alyssa at collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576>
(cherry picked from commit 8bd4976d981a9a98ce7e419b25c05d38ccac027b)

---

 .pick_status.json                       |  2 +-
 src/panfrost/bifrost/bi_lower_swizzle.c | 13 +++++++++++++
 src/panfrost/ci/panfrost-g52-fails.txt  |  7 -------
 3 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/.pick_status.json b/.pick_status.json
index a66aed88978..8de2f09c4af 100644
--- a/.pick_status.json
+++ b/.pick_status.json
@@ -112,7 +112,7 @@
         "description": "pan/bi: Lower swizzles on CSEL.i32/MUX.i32",
         "nominated": true,
         "nomination_type": 0,
-        "resolution": 0,
+        "resolution": 1,
         "main_sha": null,
         "because_sha": null
     },
diff --git a/src/panfrost/bifrost/bi_lower_swizzle.c b/src/panfrost/bifrost/bi_lower_swizzle.c
index a0d5917b218..b7549b0f385 100644
--- a/src/panfrost/bifrost/bi_lower_swizzle.c
+++ b/src/panfrost/bifrost/bi_lower_swizzle.c
@@ -50,6 +50,19 @@ bi_lower_swizzle_16(bi_context *ctx, bi_instr *ins, unsigned src)
          * derivatives, which might require swizzle lowering */
         case BI_OPCODE_CLPER_I32:
         case BI_OPCODE_CLPER_V6_I32:
+
+        /* Similarly, CSEL.i32 consumes a boolean as a 32-bit argument. If the
+         * boolean is implemented as a 16-bit integer, the swizzle is needed
+         * for correct operation if the instruction producing the 16-bit
+         * boolean does not replicate to both halves of the containing 32-bit
+         * register. As such, we may need to lower a swizzle.
+         *
+         * This is a silly hack. Ideally, code gen would be smart enough to
+         * avoid this case (by replicating). In practice, silly hardware design
+         * decisions force our hand here.
+         */
+        case BI_OPCODE_MUX_I32:
+        case BI_OPCODE_CSEL_I32:
             break;
 
         case BI_OPCODE_IADD_V2S16:
diff --git a/src/panfrost/ci/panfrost-g52-fails.txt b/src/panfrost/ci/panfrost-g52-fails.txt
index eb2bd1906ac..7518f80fede 100644
--- a/src/panfrost/ci/panfrost-g52-fails.txt
+++ b/src/panfrost/ci/panfrost-g52-fails.txt
@@ -21,7 +21,6 @@ glx@glx-visuals-stencil -pixmap,Crash
 shaders@glsl-bug-110796,Fail
 shaders@glsl-uniform-interstage-limits@subdivide 5,Crash
 shaders@glsl-uniform-interstage-limits@subdivide 5- statechanges,Crash
-shaders@glsl-vs-if-bool,Fail
 shaders@point-vertex-id divisor,Fail
 shaders@point-vertex-id gl_instanceid divisor,Fail
 shaders@point-vertex-id gl_instanceid,Fail
@@ -112,12 +111,6 @@ spec@arb_pixel_buffer_object@texsubimage cube_map_array pbo,Fail
 spec@arb_point_sprite@arb_point_sprite-checkerboard,Fail
 spec@arb_point_sprite@arb_point_sprite-mipmap,Fail
 spec@arb_provoking_vertex@arb-provoking-vertex-render,Fail
-spec@arb_sample_shading@builtin-gl-sample-id 0,Fail
-spec@arb_sample_shading@builtin-gl-sample-id 2,Fail
-spec@arb_sample_shading@builtin-gl-sample-id 4,Fail
-spec@arb_sample_shading@builtin-gl-sample-mask 0,Fail
-spec@arb_sample_shading@builtin-gl-sample-mask 2,Fail
-spec@arb_sample_shading@builtin-gl-sample-mask 4,Fail
 spec@arb_sample_shading@samplemask 2@0.250000 mask_in_one,Fail
 spec@arb_sample_shading@samplemask 2@0.500000 mask_in_one,Fail
 spec@arb_sample_shading@samplemask 2@1.000000 mask_in_one,Fail