[Mesa-dev] [PATCH RFC] mesa: add SSE optimisation for glDrawElements

Matt Turner mattst88 at gmail.com
Wed Oct 22 22:49:04 PDT 2014

On Wed, Oct 22, 2014 at 10:30 PM, Matt Turner <mattst88 at gmail.com> wrote:
> On Wed, Oct 22, 2014 at 9:02 PM, Timothy Arceri <t_arceri at yahoo.com.au> wrote:
>> I almost wasn't going to bother sending this out since it uses SSE4.1
>> and it's recommended to use glDrawRangeElements anyway. But since these games
>> are still often used for benchmarking I thought I'd see if anyone is
>> interested in this. I only optimised GL_UNSIGNED_INT as that was the
>> only place these games were hitting but I guess it wouldn't hurt
>> to optimise the other cases too.
> I think it's kind of neat!
> It might also be fun to try to do this with OpenMP. OpenMP 3.1
> (supported since gcc-4.7) supports min/max reduction operators.

I think all you'd need to do for that is to add this pragma
immediately before the for loop in vbo_exec_array.c:

#if _OPENMP > ... (have to figure out the date for OMP 3.1)
#pragma omp simd reduction(max:max_ui) reduction(min:min_ui)
#endif

and then change the inner loop to use ternary for min/max:

max_ui = ui_indices[i] > max_ui ? ui_indices[i] : max_ui;
min_ui = ui_indices[i] < min_ui ? ui_indices[i] : min_ui;

I tested it with a little function and confirmed that it generates
SSE4.1/AVX2 instructions (and even a bunch of SSE2 instructions when
4.1 isn't available!) depending on the -march= value I pass.
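
For reference, here is a rough standalone sketch of the kind of little test function described above. It is not the actual code in vbo_exec_array.c: the name index_bounds_uint and its signature are made up for illustration, and the guard value 201307 is an assumption on my part. That is the _OPENMP date for OpenMP 4.0, which is where the simd construct itself first appears (min/max reductions alone date from 3.1), so the right guard for Mesa may turn out to be different.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper, not the Mesa function: scans an array of
 * GL_UNSIGNED_INT-style indices and reports the smallest and largest. */
static void
index_bounds_uint(const unsigned *ui_indices, size_t count,
                  unsigned *min_index, unsigned *max_index)
{
   unsigned max_ui = 0;
   unsigned min_ui = UINT_MAX;
   size_t i;

   assert(count > 0);

   /* Assumed guard: 201307 is the _OPENMP value for OpenMP 4.0, which
    * introduced "omp simd". Without OpenMP the loop stays plain scalar C. */
#if defined(_OPENMP) && _OPENMP >= 201307
#pragma omp simd reduction(max:max_ui) reduction(min:min_ui)
#endif
   for (i = 0; i < count; i++) {
      /* Ternary form of min/max, as suggested above. */
      max_ui = ui_indices[i] > max_ui ? ui_indices[i] : max_ui;
      min_ui = ui_indices[i] < min_ui ? ui_indices[i] : min_ui;
   }

   *min_index = min_ui;
   *max_index = max_ui;
}
```

Built with -fopenmp and different -march= values, the generated assembly for the loop can be compared directly. Without OpenMP the guard drops the pragma and the function still gives the same result.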
