[Mesa-dev] [PATCH RFC] mesa: add SSE optimisation for glDrawElements

Timothy Arceri t_arceri at yahoo.com.au
Thu Oct 23 02:13:37 PDT 2014

On Wed, 2014-10-22 at 22:49 -0700, Matt Turner wrote:
> On Wed, Oct 22, 2014 at 10:30 PM, Matt Turner <mattst88 at gmail.com> wrote:
> > On Wed, Oct 22, 2014 at 9:02 PM, Timothy Arceri <t_arceri at yahoo.com.au> wrote:
> >> I almost wasn't going to bother sending this out since it uses SSE4.1
> >> and it's recommended to use glDrawRangeElements anyway. But since these games
> >> are still often used for benchmarking I thought I'd see if anyone is
> >> interested in this. I only optimised GL_UNSIGNED_INT as that was the
> >> only place these games were hitting but I guess it wouldn't hurt
> >> to optimise the other cases too.
> >
> > I think it's kind of neat!
> >
> > It might also be fun to try to do this with OpenMP. OpenMP 3.1
> > (supported since gcc-4.7) supports min/max reduction operators.

I've never really looked into OpenMP before, but very cool :)

It seems simd support wasn't added until OpenMP 4.0 (gcc-4.9), so using
3.1 would require threading. Probably best just to go with 4.0.

> I think all you'd need to do for that is to add this pragma
> immediately before the for loop in vbo_exec_array.c:
> #if _OPENMP > ... (have to figure out the date for OMP 3.1)
> #pragma omp simd reduction(max:max_ui) reduction(min:min_ui)
> #endif
> and then change the inner loop to use ternary for min/max:
> max_ui = ui_indices[i] > max_ui ? ui_indices[i] : max_ui;
> min_ui = ui_indices[i] < min_ui ? ui_indices[i] : min_ui;
> I tested it with a little function and confirmed that it generates
> SSE4.1/AVX2 instructions (and even a bunch of SSE2 instructions when
> 4.1 isn't available!) depending on the -march= value I pass.
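
For reference, the suggestion above could be sketched as a standalone
function roughly like this. This is only a sketch, not the actual code in
vbo_exec_array.c: the function name index_bounds_ui and its signature are
made up for illustration, and the guard assumes the OpenMP 4.0 version
macro value (201307) since that is where the simd construct comes from.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper (not Mesa's real function): find the smallest and
 * largest value in an array of GL_UNSIGNED_INT indices. */
static void
index_bounds_ui(const unsigned *ui_indices, size_t count,
                unsigned *min_out, unsigned *max_out)
{
   unsigned max_ui = 0;
   unsigned min_ui = UINT_MAX;
   size_t i;

   /* Assumption: _OPENMP >= 201307 means OpenMP 4.0, which has "omp simd".
    * Without -fopenmp the pragma is skipped and this is a plain loop. */
#if defined(_OPENMP) && _OPENMP >= 201307
#pragma omp simd reduction(max:max_ui) reduction(min:min_ui)
#endif
   for (i = 0; i < count; i++) {
      /* Ternaries, as suggested, so the loop maps onto min/max operations. */
      max_ui = ui_indices[i] > max_ui ? ui_indices[i] : max_ui;
      min_ui = ui_indices[i] < min_ui ? ui_indices[i] : min_ui;
   }

   *min_out = min_ui;
   *max_out = max_ui;
}
```

Which instructions the compiler picks for that loop still depends on the
-march= value used at build time.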

I assume this means there isn't a way to tell OpenMP to build multiple
versions and select the best one at runtime, so distros would always
just ship SSE2? Anyway, I'm going to give the SSE2 code a run on my (6
year old) desktop and see how it performs. I will also compare it to my
SSE4.1 code on my laptop; maybe it won't be too big of a difference.
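
One way runtime selection could be done by hand, outside of OpenMP, is
sketched below. It is not the code from the patch. It assumes a reasonably
recent GCC (or clang) on x86 and relies on the compiler's target attribute
and __builtin_cpu_supports(). All function names here are made up for
illustration.

```c
#include <limits.h>
#include <stddef.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <smmintrin.h>
#define HAVE_X86_DISPATCH 1
#endif

/* Portable fallback (hypothetical name). */
static void
bounds_scalar(const unsigned *idx, size_t count,
              unsigned *min_out, unsigned *max_out)
{
   unsigned min_ui = UINT_MAX, max_ui = 0;
   size_t i;

   for (i = 0; i < count; i++) {
      max_ui = idx[i] > max_ui ? idx[i] : max_ui;
      min_ui = idx[i] < min_ui ? idx[i] : min_ui;
   }
   *min_out = min_ui;
   *max_out = max_ui;
}

#ifdef HAVE_X86_DISPATCH
/* Built with SSE4.1 enabled for this one function only, so the rest of
 * the file can still be compiled for a baseline CPU. */
static void __attribute__((target("sse4.1")))
bounds_sse41(const unsigned *idx, size_t count,
             unsigned *min_out, unsigned *max_out)
{
   __m128i vmin = _mm_set1_epi32(-1);     /* all bits set, i.e. UINT_MAX */
   __m128i vmax = _mm_setzero_si128();
   unsigned lo[4], hi[4];
   unsigned min_ui = UINT_MAX, max_ui = 0;
   size_t i = 0;
   int k;

   /* Four indices per iteration using unsigned 32-bit min/max. */
   for (; i + 4 <= count; i += 4) {
      __m128i v = _mm_loadu_si128((const __m128i *)(idx + i));
      vmin = _mm_min_epu32(vmin, v);
      vmax = _mm_max_epu32(vmax, v);
   }

   /* Reduce the four lanes to single values. */
   _mm_storeu_si128((__m128i *)lo, vmin);
   _mm_storeu_si128((__m128i *)hi, vmax);
   for (k = 0; k < 4; k++) {
      min_ui = lo[k] < min_ui ? lo[k] : min_ui;
      max_ui = hi[k] > max_ui ? hi[k] : max_ui;
   }

   /* Leftover indices when count is not a multiple of four. */
   for (; i < count; i++) {
      max_ui = idx[i] > max_ui ? idx[i] : max_ui;
      min_ui = idx[i] < min_ui ? idx[i] : min_ui;
   }
   *min_out = min_ui;
   *max_out = max_ui;
}
#endif

/* Pick an implementation based on what the running CPU reports. */
static void
index_bounds(const unsigned *idx, size_t count,
             unsigned *min_out, unsigned *max_out)
{
#ifdef HAVE_X86_DISPATCH
   __builtin_cpu_init();
   if (__builtin_cpu_supports("sse4.1")) {
      bounds_sse41(idx, count, min_out, max_out);
      return;
   }
#endif
   bounds_scalar(idx, count, min_out, max_out);
}
```

A real version would probably do the CPU check once and cache a function
pointer rather than testing on every call.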
