[Mesa-dev] GLSL swizzle lowering

Marek Olšák maraeo at gmail.com
Tue Aug 31 10:45:03 PDT 2010


On Tue, Aug 31, 2010 at 7:20 PM, Ian Romanick <idr at freedesktop.org> wrote:

> Marek Olšák wrote:
> > On Tue, Aug 31, 2010 at 2:14 AM, Ian Romanick <idr at freedesktop.org
> > <mailto:idr at freedesktop.org>> wrote:
> >
> >     While I was trying to get one of the Humus demos working today, it
> >     occurred to me that we can possibly do better than
> >     ir_vec_index_to_cond_assign to lower variable indexing of vectors.  In
> >     addition to using conditional assignment, we can also use a
> >     dot-product to pick a single element out of a vector.  The variable
> >     index operation becomes:
> >
> >     const vec4 gl_vec_selector[4] =
> >        vec4[4](vec4(1.0, 0.0, 0.0, 0.0),
> >                vec4(0.0, 1.0, 0.0, 0.0),
> >                vec4(0.0, 0.0, 1.0, 0.0),
> >                vec4(0.0, 0.0, 0.0, 1.0));
> >
> >     ...
> >
> >     float f = dot(v, gl_vec_selector[i]);
>
> [snip]
>
> > Neither r300 nor r500 supports the ARL opcode in fragment shaders (it's
>
> Meh.  I always forget about that asymmetry.
>
> > a D3D10 feature), which kind of makes this optimization a no-go. I
> > suggest using SEQ instead:
> >
> > bvec4 selector = equal(vec4(i), vec4(0,1,2,3));
> > float f = dot(v, vec4(selector));
> >
> > which should end up being just SEQ followed by DP4.
>
> SEQ isn't part of the ARB_fragment_program / ARB_vertex_program
> instruction set either.  Does R300 support that?  I won't be surprised
> if i915 doesn't.  Of course, it doesn't support the ARL-based
> optimization either.
>

R500 vertex shaders support SEQ natively.

For R300 vertex shaders, SEQ is lowered to the opcode sequence SGE, SGE,
MUL (because CMP is unsupported in hardware).

For R300-R500 fragment shaders, SEQ is lowered to the opcode sequence ADD,
CMP (because SGE is unsupported in hardware).
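
To make the two lowerings concrete, here is a minimal scalar sketch in C. It is only an emulation under stated assumptions, not the driver's code: the helper names (op_sge, op_cmp, seq_via_sge, seq_via_cmp) are hypothetical, SGE is modelled as "a >= b ? 1 : 0", CMP as "a < 0 ? b : c" (the ARB_fragment_program definition), and the ADD, CMP form assumes the CMP source can carry a negate-of-absolute-value modifier.

```c
/* Scalar models of the opcodes named above (hypothetical helper names). */
float op_sge(float a, float b) { return a >= b ? 1.0f : 0.0f; }
float op_cmp(float a, float b, float c) { return a < 0.0f ? b : c; }

/* SEQ as SGE, SGE, MUL: a == b exactly when a >= b and b >= a. */
float seq_via_sge(float a, float b)
{
    float t0 = op_sge(a, b);
    float t1 = op_sge(b, a);
    return t0 * t1;                       /* MUL acts as a logical AND */
}

/* SEQ as ADD, CMP: d = a - b, then pick 0 if -|d| < 0, otherwise 1. */
float seq_via_cmp(float a, float b)
{
    float d = a + (-b);                   /* ADD with a negated source */
    float abs_d = d < 0.0f ? -d : d;      /* assumed source modifier: |d| */
    return op_cmp(-abs_d, 0.0f, 1.0f);    /* -|d| < 0 means a != b */
}
```

Both functions return 1.0 for equal inputs and 0.0 otherwise, so either form can feed the DP4 that follows.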

SEQ is probably the best "high-level" instruction here, but I am OK with
anything other than ARL.

Marek


> SEQ is itself lowered to a sequence of instructions:
>
>        SGE     t0, i.xxxx, { 0,  1,  2,  3};
>        SGE     t1, -i.xxxx, {-0, -1, -2, -3};
>        MUL     selector, t0, t1;
>        DP4     result, v, selector;
>
> So, that probably still would be better than the mess that gets
> generated today.
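
The whole lowering discussed in this thread (build a one-hot selector from the index, then take a DP4) can be sketched as a small C emulation. This is an illustration under assumptions rather than Mesa code: build_selector, dp4 and index_vec4 are hypothetical names, and the second comparison is taken on negated operands so that it tests i <= k.

```c
/* selector[c] = 1.0 if i == c, else 0.0, built as SGE, SGE, MUL. */
void build_selector(float i, float selector[4])
{
    static const float k[4] = { 0.0f, 1.0f, 2.0f, 3.0f };
    for (int c = 0; c < 4; c++) {
        float t0 = (i >= k[c]) ? 1.0f : 0.0f;    /* i >= k */
        float t1 = (-i >= -k[c]) ? 1.0f : 0.0f;  /* -i >= -k, i.e. i <= k */
        selector[c] = t0 * t1;                   /* MUL */
    }
}

/* Four-component dot product, standing in for DP4. */
float dp4(const float a[4], const float b[4])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

/* Emulates "float f = v[i];" for a variable index i in 0..3. */
float index_vec4(const float v[4], int i)
{
    float selector[4];
    build_selector((float)i, selector);
    return dp4(v, selector);
}
```

One caveat of any dot-product selection: if an unselected component of v is infinite or NaN, its product with 0.0 is NaN under IEEE arithmetic, so the result can differ from a true indexed read.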