[Mesa-dev] [PATCH v2 29/29] nir/algebraic: Add some optimizations for D3D-style Booleans

Tue Dec 18 06:18:03 UTC 2018

On 7/12/18 6:45 am, Jason Ekstrand wrote:
> D3D Booleans use a 32-bit 0/-1 representation.  Because this previously
> matched NIR exactly, we didn't have to really optimize for it.  Now that
> we have 1-bit Booleans, we need some specific optimizations to chew
> through the D3D12-style Booleans.
> 
> Shader-db results on Kaby Lake:
> 
>      total instructions in shared programs: 15136811 -> 14967944 (-1.12%)
>      instructions in affected programs: 2457021 -> 2288154 (-6.87%)
>      helped: 8318
>      HURT: 10
> 
>      total cycles in shared programs: 373544524 -> 359701825 (-3.71%)
>      cycles in affected programs: 151029683 -> 137186984 (-9.17%)
>      helped: 7749
>      HURT: 682
> 
>      total loops in shared programs: 4431 -> 4399 (-0.72%)
>      loops in affected programs: 32 -> 0
>      helped: 21
>      HURT: 0
> 
>      total spills in shared programs: 10290 -> 10051 (-2.32%)
>      spills in affected programs: 2532 -> 2293 (-9.44%)
>      helped: 18
>      HURT: 18
> 
>      total fills in shared programs: 22203 -> 21732 (-2.12%)
>      fills in affected programs: 3319 -> 2848 (-14.19%)
>      helped: 18
>      HURT: 18
> 
> Note that a large chunk of the improvement is fixing regressions caused by
> switching to 1-bit Booleans.  Previously, our ability to optimize D3D
> booleans was improved by using the D3D representation directly in NIR.
> Now that NIR does 1-bit bools, we need a few more optimizations.
> 
> Reviewed-by: Bas Nieuwenhuizen <bas at basnieuwenhuizen.nl>
> ---
>   src/compiler/nir/nir_opt_algebraic.py | 13 +++++++++++++
>   1 file changed, 13 insertions(+)
> 
> diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py
> index 3c8af4692b5..506d45e55b5 100644
> --- a/src/compiler/nir/nir_opt_algebraic.py
> +++ b/src/compiler/nir/nir_opt_algebraic.py
> @@ -534,6 +534,19 @@ optimizations = [
>      (('bcsel', a, b, b), b),
>      (('fcsel', a, b, b), b),
>   
> +   # D3D Boolean emulation
> +   (('bcsel', a, -1, 0), ('ineg', ('b2i', 'a@1'))),
> +   (('bcsel', a, 0, -1), ('ineg', ('b2i', ('inot', a)))),
> +   (('iand', ('ineg', ('b2i', 'a@1')), ('ineg', ('b2i', 'b@1'))),
> +    ('ineg', ('b2i', ('iand', a, b)))),
> +   (('ior', ('ineg', ('b2i','a@1')), ('ineg', ('b2i', 'b@1'))),
> +    ('ineg', ('b2i', ('ior', a, b)))),
> +   (('ieq', ('ineg', ('b2i', 'a@1')), 0), ('inot', a)),
> +   (('ieq', ('ineg', ('b2i', 'a@1')), -1), a),
> +   (('ine', ('ineg', ('b2i', 'a@1')), 0), a),
> +   (('ine', ('ineg', ('b2i', 'a@1')), -1), ('inot', a)),
> +   (('iand', ('ineg', ('b2i', a)), '1.0@32'), ('b2f', a)),

Hi Jason,

It seems the '1.0@32' matching has been broken somewhere along the line
in the recent rewrites. See my comments in the RADV bug for more info.

https://bugs.freedesktop.org/show_bug.cgi?id=109075
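
For anyone following along, here is a minimal sketch in plain Python (not Mesa code) of what the quoted rules are meant to preserve. It models the D3D Boolean as a 32-bit 0/-1 value and checks that the last rule, an AND with the bit pattern of the 32-bit float 1.0, yields the same result as b2f. The helper names (d3d_bool, float_bits, and so on) are hypothetical and only mirror the NIR opcode names; the sketch assumes two's-complement 32-bit integers and IEEE 754 single precision.

```python
import struct

MASK32 = 0xFFFFFFFF  # model NIR 32-bit integers as unsigned Python ints


def b2i(a: bool) -> int:
    """1-bit Boolean to integer: False -> 0, True -> 1."""
    return 1 if a else 0


def ineg(x: int) -> int:
    """32-bit two's-complement negation."""
    return (-x) & MASK32


def d3d_bool(a: bool) -> int:
    """D3D-style Boolean: 0 for false, all ones (-1) for true."""
    return ineg(b2i(a))


def bcsel(cond: bool, x: int, y: int) -> int:
    """Select x if cond else y, truncated to 32 bits."""
    return (x if cond else y) & MASK32


def float_bits(f: float) -> int:
    """Bit pattern of f as an IEEE 754 single-precision float."""
    return struct.unpack("<I", struct.pack("<f", f))[0]


def bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit pattern as a single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


ONE_F32 = float_bits(1.0)  # the constant the last rule matches as 32-bit


def iand_with_one(a: bool) -> float:
    """Left-hand side of the last rule: (-b2i(a)) & bits(1.0f)."""
    return bits_to_float(d3d_bool(a) & ONE_F32)


for a in (False, True):
    # bcsel rules
    assert bcsel(a, -1, 0) == d3d_bool(a)
    assert bcsel(a, 0, -1) == d3d_bool(not a)
    # comparison rules
    assert (d3d_bool(a) == 0) == (not a)
    assert (d3d_bool(a) != MASK32) == (not a)
    # the rule under discussion: result should equal b2f(a)
    assert iand_with_one(a) == (1.0 if a else 0.0)
    for b in (False, True):
        # iand / ior rules
        assert d3d_bool(a) & d3d_bool(b) == d3d_bool(a and b)
        assert d3d_bool(a) | d3d_bool(b) == d3d_bool(a or b)
```

The last rule only holds if the constant really is the 32-bit float 1.0, which is why the bit-size suffix on that constant matters for the match.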

> +
>      # Conversions
>      (('i2b32', ('b2i', 'a@32')), a),
>      (('f2i', ('ftrunc', a)), ('f2i', a)),
>