[Mesa-dev] [PATCH] glsl: Optimize AND/OR reductions used in vec4 comparisons.

Fri May 10 20:44:02 PDT 2013

On Fri, May 10, 2013 at 2:49 PM, Ian Romanick <idr at freedesktop.org> wrote:
> On 05/08/2013 07:21 PM, Matt Turner wrote:
>>
>> For vec4 equality comparisons we emit
>>
>>     (and (and (and a b) c) d)
>
>
> Refresh my memory... what generates this?  Is this from the compiler itself,
> or are people writing this code by hand?  On old, vec4-centric
> architectures, this sort of thing was usually done with a dot-product. I'm
> assuming this showed up in some shader on a scalar architecture...

Right, for

    vec4 a, b;
    if (a == b) ...

our fs backend generates four CMPs followed by three ANDs (plus a final
AND with 0x1):

    cmp.e.f0(8)     g7<1>D          g2.7<0,1,0>F    g2.3<0,1,0>F
    cmp.e.f0(8)     g8<1>D          g2.6<0,1,0>F    g2.2<0,1,0>F
    cmp.e.f0(8)     g9<1>D          g2.5<0,1,0>F    g2.1<0,1,0>F
    cmp.e.f0(8)     g10<1>D         g2.4<0,1,0>F    g2<0,1,0>F
    and(8)          g11<1>D         g9<8,8,1>D      g10<8,8,1>D
    and(8)          g12<1>D         g8<8,8,1>D      g11<8,8,1>D
    and(8)          g13<1>D         g7<8,8,1>D      g12<8,8,1>D
    and.ne.f0(8)    null            g13<8,8,1>D     1D
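
For anyone following along, here is a rough single-channel C model of that
sequence. It is only a sketch: the function names are made up for
illustration, the locals are named after the registers in the listing, and I
am assuming that a true CMP result has its low bit set (which is what the
final AND with 1 tests).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one cmp.e: assumes "true" sets the low bit. */
static int32_t cmp_e(float x, float y)
{
    return x == y ? 1 : 0;
}

/* vec4 equality as emitted above: 4 compares, 3 ANDs, one AND with 1. */
static int vec4_equal_and_chain(const float a[4], const float b[4])
{
    assert(a && b);

    int32_t g7  = cmp_e(a[3], b[3]);
    int32_t g8  = cmp_e(a[2], b[2]);
    int32_t g9  = cmp_e(a[1], b[1]);
    int32_t g10 = cmp_e(a[0], b[0]);

    int32_t g11 = g9 & g10;   /* and(8) g11 = g9, g10 */
    int32_t g12 = g8 & g11;   /* and(8) g12 = g8, g11 */
    int32_t g13 = g7 & g12;   /* and(8) g13 = g7, g12 */

    /* and.ne.f0: the flag is set when (g13 & 1) is non-zero. */
    return (g13 & 1) != 0;
}
```

The real instructions do this for all 8 channels at once; the model follows
just one of them.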

The more I've thought about it, the more I'm not in love with this
patch. Since each CMP already stalls on f0, we should just predicate
every CMP but the first on (+f0) and skip the ANDs entirely.
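
To make the idea concrete, here is a single-channel C sketch of the predicated
version. The function name is hypothetical, and the sketch assumes that a
predicated CMP leaves f0 untouched (still false) on any channel where the
predicate is already false. This is not what the backend emits today.

```c
#include <assert.h>
#include <stdbool.h>

/* vec4 equality with the flag threaded through predicated compares. */
static bool vec4_equal_predicated(const float a[4], const float b[4])
{
    assert(a && b);

    /* The first CMP is unpredicated and initializes f0. */
    bool f0 = (a[0] == b[0]);

    for (int i = 1; i < 4; i++) {
        /* (+f0) cmp.e.f0: runs only while f0 is still set, so once a
         * component differs the flag stays false. */
        if (f0)
            f0 = (a[i] == b[i]);
    }

    /* f0 now holds the AND of all four comparisons, with no AND
     * instructions and no temporaries. */
    return f0;
}
```

In this model the result is the same as the AND chain for any inputs, since
the flag can only go from true to false.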