[Mesa-dev] [PATCH V2] mesa: add SSE optimisation for glDrawElements

Matt Turner mattst88 at gmail.com
Fri Oct 24 14:19:32 PDT 2014


On Fri, Oct 24, 2014 at 2:06 PM, Ian Romanick <idr at freedesktop.org> wrote:
> On 10/24/2014 05:47 AM, Timothy Arceri wrote:
>> +      vec_count = count & ~0x3;
>> +      ui_indices_ptr = (__m128i*)ui_indices;
>> +      for (i = 0; i < vec_count / 4; i++) {
>> +         ui_indices4 = _mm_loadu_si128(&ui_indices_ptr[i]);
>
> How does this fare with unaligned data?  My recollection is that
> _mm_loadu_si128 could be quite a bit slower than _mm_load_si128.  It
> might be worth handling the first few values without SSE until the
> pointer is aligned.
>
> Or my memory might be wrong.

Nope, your memory is right, and that's a good suggestion. pixman does this a lot:
http://cgit.freedesktop.org/pixman/tree/pixman/pixman-sse2.c#n582
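
For illustration, the "scalar head until aligned, then aligned loads, then
scalar tail" pattern could look roughly like the sketch below. This is not
the code from the patch or from pixman. The function name max_index_peeled
is made up, and the max reduction is an assumption, since the quoted hunk
does not show the loop body. The unsigned 32-bit max is emulated with SSE2
only, so it builds without extra compiler flags, and there is a plain
scalar fallback when SSE2 is unavailable.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper (not Mesa's code): largest value in indices[0..count). */
static unsigned
max_index_peeled(const unsigned *indices, size_t count)
{
   unsigned max_ui = 0;
   size_t i = 0;

#if defined(__SSE2__)
   /* Scalar head: consume elements until &indices[i] is 16-byte aligned.
    * If the pointer is not even 4-byte aligned this never becomes true,
    * and the whole array is simply handled here. */
   while (i < count && ((uintptr_t)&indices[i] & 15) != 0) {
      if (indices[i] > max_ui)
         max_ui = indices[i];
      i++;
   }

   /* SSE2 has no unsigned 32-bit max, so flip the sign bit and compare
    * as signed. 'bias' is also the biased form of unsigned 0. */
   const __m128i bias = _mm_set1_epi32(INT32_MIN);
   __m128i max4 = bias;

   /* Aligned body: four indices per iteration using _mm_load_si128. */
   for (; i + 4 <= count; i += 4) {
      __m128i v = _mm_load_si128((const __m128i *)&indices[i]);
      v = _mm_xor_si128(v, bias);
      __m128i gt = _mm_cmpgt_epi32(v, max4);
      /* Per lane: take v where v > max4, otherwise keep max4. */
      max4 = _mm_or_si128(_mm_and_si128(gt, v),
                          _mm_andnot_si128(gt, max4));
   }

   /* Undo the bias and fold the four lanes into the scalar maximum. */
   unsigned lanes[4];
   _mm_storeu_si128((__m128i *)lanes, _mm_xor_si128(max4, bias));
   for (int k = 0; k < 4; k++) {
      if (lanes[k] > max_ui)
         max_ui = lanes[k];
   }
#endif

   /* Scalar tail (or the whole array when SSE2 is not available). */
   for (; i < count; i++) {
      if (indices[i] > max_ui)
         max_ui = indices[i];
   }

   return max_ui;
}
```

Whether the extra head loop pays off would need measuring on the hardware
of interest; the sketch only shows the shape of the approach.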

