[Mesa-dev] [PATCH V2] mesa: add SSE optimisation for glDrawElements
Timothy Arceri
t_arceri at yahoo.com.au
Sat Oct 25 16:17:58 PDT 2014
On Fri, 2014-10-24 at 09:11 -0700, Matt Turner wrote:
> On Fri, Oct 24, 2014 at 5:47 AM, Timothy Arceri <t_arceri at yahoo.com.au> wrote:
> > Makes use of SSE to speed up computation of the min and max elements
> >
> > Callgrind cpu usage results from pts benchmarks:
> >
> > Openarena 0.8.8: 3.67% -> 1.03%
> > UrbanTerror: 2.36% -> 0.81%
> >
> > Signed-off-by: Timothy Arceri <t_arceri at yahoo.com.au>
> > ---
> > src/mesa/Android.libmesa_dricore.mk | 3 +-
> > src/mesa/Makefile.am | 3 +-
> > src/mesa/Makefile.sources | 1 +
> > src/mesa/main/sse_minmax.c | 81 +++++++++++++++++++++++++++++++++++++
> > src/mesa/main/sse_minmax.h | 30 ++++++++++++++
> > src/mesa/vbo/vbo_exec_array.c | 13 ++++--
> > 6 files changed, 126 insertions(+), 5 deletions(-)
> > create mode 100644 src/mesa/main/sse_minmax.c
> > create mode 100644 src/mesa/main/sse_minmax.h
> >
> > This version includes all the suggestions from Brian and Matt, thanks for
> > the review guys.
> >
> > I haven't been able to do Matt's suggestion and compare this to what OpenMP
> > would generate as I only have one machine that supports SSE4.1 with Fedora 20 and
> > I don't want to have to upgrade to Fedora 21 alpha (gcc 4.9) just to test this
> > (although I did consider it). If people are happy with this code I will revisit
> > OpenMP for Mesa 10.5 and will look at using OpenMP for the short and byte support too.
> >
> > diff --git a/src/mesa/Android.libmesa_dricore.mk b/src/mesa/Android.libmesa_dricore.mk
> > index 1e6d948..52d626f 100644
> > --- a/src/mesa/Android.libmesa_dricore.mk
> > +++ b/src/mesa/Android.libmesa_dricore.mk
> > @@ -51,7 +51,8 @@ endif # MESA_ENABLE_ASM
> >
> > ifeq ($(ARCH_X86_HAVE_SSE4_1),true)
> > LOCAL_SRC_FILES += \
> > - $(SRCDIR)main/streaming-load-memcpy.c
> > + $(SRCDIR)main/streaming-load-memcpy.c \
> > + $(SRCDIR)main/sse_minmax.c
> > LOCAL_CFLAGS := -msse4.1
> > endif
> >
> > diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
> > index e71bccb..932db4f 100644
> > --- a/src/mesa/Makefile.am
> > +++ b/src/mesa/Makefile.am
> > @@ -151,7 +151,8 @@ libmesagallium_la_LIBADD = \
> > $(ARCH_LIBS)
> >
> > libmesa_sse41_la_SOURCES = \
> > - main/streaming-load-memcpy.c
> > + main/streaming-load-memcpy.c \
> > + main/sse_minmax.c
> > libmesa_sse41_la_CFLAGS = $(AM_CFLAGS) -msse4.1
> >
> > pkgconfigdir = $(libdir)/pkgconfig
> > diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
> > index 4755018..dd10574 100644
> > --- a/src/mesa/Makefile.sources
> > +++ b/src/mesa/Makefile.sources
> > @@ -93,6 +93,7 @@ MAIN_FILES = \
> > $(SRCDIR)main/shaderobj.c \
> > $(SRCDIR)main/shader_query.cpp \
> > $(SRCDIR)main/shared.c \
> > + $(SRCDIR)main/sse_minmax.c \
>
> We can't add this here. You've already added it to libmesa_sse41.la above.
I added this without thinking about it too much after Brian said it was
probably needed for SCons. Obviously we can't have both, so I'll remove it
from here. I don't know enough about SCons to know what it will require or
how to fix it.
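
For anyone following the thread without the full patch in front of them,
the min/max index scan being discussed can be sketched roughly as below.
This is only an illustration, not the contents of sse_minmax.c: the
function name minmax_uint_sketch is hypothetical, and guarding on the
compiler's __SSE4_1__ macro (defined when building with -msse4.1) is an
assumption made so that the same file still builds, with a scalar loop
only, when SSE4.1 is not enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE4_1__)
#include <smmintrin.h>   /* SSE4.1: _mm_min_epu32 / _mm_max_epu32 */
#endif

/* Hypothetical helper for illustration; not the function from the patch. */
void
minmax_uint_sketch(const uint32_t *indices, size_t count,
                   uint32_t *min_out, uint32_t *max_out)
{
   uint32_t min_v = UINT32_MAX;
   uint32_t max_v = 0;
   size_t i = 0;

   assert(count > 0);

#if defined(__SSE4_1__)
   {
      /* Four running minima/maxima, one per 32-bit lane. */
      __m128i vmin = _mm_set1_epi32(-1);   /* all bits set == UINT32_MAX */
      __m128i vmax = _mm_setzero_si128();
      uint32_t lanes_min[4], lanes_max[4];
      int k;

      /* Unaligned loads, so the index buffer needs no special alignment. */
      for (; i + 4 <= count; i += 4) {
         __m128i v = _mm_loadu_si128((const __m128i *)(indices + i));
         vmin = _mm_min_epu32(vmin, v);
         vmax = _mm_max_epu32(vmax, v);
      }

      /* Reduce the four lanes to a single min and max. */
      _mm_storeu_si128((__m128i *)lanes_min, vmin);
      _mm_storeu_si128((__m128i *)lanes_max, vmax);
      for (k = 0; k < 4; k++) {
         if (lanes_min[k] < min_v) min_v = lanes_min[k];
         if (lanes_max[k] > max_v) max_v = lanes_max[k];
      }
   }
#endif

   /* Scalar loop: handles the tail, or everything without SSE4.1. */
   for (; i < count; i++) {
      if (indices[i] < min_v) min_v = indices[i];
      if (indices[i] > max_v) max_v = indices[i];
   }

   *min_out = min_v;
   *max_out = max_v;
}
```

Because the SSE4.1 intrinsics are only usable in code built with
-msse4.1, a file like this has to be compiled as part of the SSE4.1
target and nowhere else, which is why listing it in both places is a
problem.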