[Mesa-dev] [PATCH V2] mesa: add SSE optimisation for glDrawElements
Matt Turner
mattst88 at gmail.com
Fri Oct 24 14:19:32 PDT 2014
On Fri, Oct 24, 2014 at 2:06 PM, Ian Romanick <idr at freedesktop.org> wrote:
> On 10/24/2014 05:47 AM, Timothy Arceri wrote:
>> + vec_count = count & ~0x3;
>> + ui_indices_ptr = (__m128i*)ui_indices;
>> + for (i = 0; i < vec_count / 4; i++) {
>> + ui_indices4 = _mm_loadu_si128(&ui_indices_ptr[i]);
>
> How does this fare with unaligned data? My recollection is that
> _mm_loadu_si128 could be quite a bit slower than _mm_load_si128. It
> might be worth handling the first few values without SSE until the
> pointer is aligned.
>
> Or my memory might be wrong.
Nope, that's a good suggestion. pixman does this a lot:
http://cgit.freedesktop.org/pixman/tree/pixman/pixman-sse2.c#n582
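
For illustration, here is a minimal sketch of the scalar-prologue idea, not the code from the patch and not pixman's. It assumes the loop is computing the maximum index, and it sticks to SSE2 only, emulating the unsigned 32-bit max with a sign-flip compare. The name max_index_aligned is hypothetical.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the largest value in indices[0..count),
 * or 0 when count is 0. */
static uint32_t
max_index_aligned(const uint32_t *indices, size_t count)
{
   uint32_t max_ui = 0;
   size_t i = 0;

   assert(indices != NULL || count == 0);

   /* Scalar prologue: consume elements until &indices[i] is 16-byte
    * aligned. If the pointer is not even 4-byte aligned this never
    * becomes true and the whole array is handled here. */
   while (i < count && ((uintptr_t)&indices[i] & 15) != 0) {
      if (indices[i] > max_ui)
         max_ui = indices[i];
      i++;
   }

   /* Vector body: aligned loads, four indices per iteration. */
   if (count - i >= 4) {
      const __m128i sign = _mm_set1_epi32(INT32_MIN);
      __m128i max4 = _mm_setzero_si128();
      uint32_t lanes[4];
      int k;

      for (; i + 4 <= count; i += 4) {
         __m128i v = _mm_load_si128((const __m128i *)&indices[i]);
         /* Unsigned compare: flip the sign bit, then compare signed. */
         __m128i gt = _mm_cmpgt_epi32(_mm_xor_si128(v, sign),
                                      _mm_xor_si128(max4, sign));
         /* Select v where v > max4, otherwise keep max4. */
         max4 = _mm_or_si128(_mm_and_si128(gt, v),
                             _mm_andnot_si128(gt, max4));
      }

      /* Fold the four lanes back into the scalar maximum. */
      _mm_storeu_si128((__m128i *)lanes, max4);
      for (k = 0; k < 4; k++) {
         if (lanes[k] > max_ui)
            max_ui = lanes[k];
      }
   }

   /* Scalar epilogue: the remaining 0-3 elements. */
   for (; i < count; i++) {
      if (indices[i] > max_ui)
         max_ui = indices[i];
   }

   return max_ui;
}
```

The prologue runs at most three iterations for a 4-byte-aligned pointer, so the extra cost is small. Whether it beats plain _mm_loadu_si128 on current hardware would need measuring.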