[Mesa-dev] [PATCH v2] replace file specific compiler optimization withinline attibute

Marc Dietrich marvin24 at gmx.de
Wed Sep 24 06:25:56 PDT 2014


Hi Matt,

Am Montag, 22. September 2014, 11:48:29 schrieb Matt Turner:
> On Fri, Sep 12, 2014 at 4:56 AM, Marc Dietrich <marvin24 at gmx.de> wrote:
> > File specific optimization as used for src/mesa/main/streaming-load-memcpy.c
> > currently will cause problems with LTO in the future
> > (see: https://bugs.freedesktop.org/show_bug.cgi?id=83669). Replace it with
> > in-file target specification.
> >
> > This only works for gcc for now. The intel compiler has
> > __attribute__((cpu_specific(cpuid))) with cpuid core_2_duo_sse4_1, but I'm
> > not sure if mesa compiles with it and how it behaves on different platforms.
> >
> > V2: limit support for this optimization to gcc >= 4.4 only
> >
> > Signed-off-by: Marc Dietrich <marvin24 at gmx.de>
> > ---
> > This is the simplest solution I see for now. Drawback is that gcc < 4.4 cannot
> > make use of this single improvement anymore. Otherwise we would have to maintain
> > a nightmare of version checks for different compilers and different pragmas.
> >
> >  configure.ac                          | 10 ++++++----
> >  src/mesa/Makefile.am                  |  1 -
> >  src/mesa/main/streaming-load-memcpy.c |  4 ++--
> >  3 files changed, 8 insertions(+), 7 deletions(-)
> >
> > diff --git a/configure.ac b/configure.ac
> > index 29cf32e..52ff00a 100644
> > --- a/configure.ac
> > +++ b/configure.ac
> > @@ -240,11 +240,13 @@ AC_SUBST([VISIBILITY_CXXFLAGS])
> >  dnl
> >  dnl Optional flags, check for compiler support
> >  dnl
> > -AX_CHECK_COMPILE_FLAG([-msse4.1], [SSE41_SUPPORTED=1], [SSE41_SUPPORTED=0])
> 
> This is the only use of AX_CHECK_COMPILE_FLAG, so we can remove
> m4/ax_check_compile_flag.m4.

well, I guess this test could be used more often in the future when more special
acceleration paths are added. On the other hand, the test for the gcc version is
enough in this case.

> > -if test "x$SSE41_SUPPORTED" = x1; then
> > -    DEFINES="$DEFINES -DUSE_SSE41"
> > +if test 0$GCC_VERSION_MAJOR$GCC_VERSION_MINOR -ge 44; then
> > +    AX_CHECK_COMPILE_FLAG([-msse4.1], [SSE41_SUPPORTED=1], [SSE41_SUPPORTED=0])
> > +    if test "x$SSE41_SUPPORTED" = x1; then
> > +        DEFINES="$DEFINES -DUSE_SSE41"
> > +    fi
> > +    AM_CONDITIONAL([SSE41_SUPPORTED], [test x$SSE41_SUPPORTED = x1])
> >  fi
> > -AM_CONDITIONAL([SSE41_SUPPORTED], [test x$SSE41_SUPPORTED = x1])
> >
> >  dnl Can't have static and shared libraries, default to static if user
> >  dnl explicitly requested. If both disabled, set to static since shared
> > diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
> > index e71bccb..bbdb36f 100644
> > --- a/src/mesa/Makefile.am
> > +++ b/src/mesa/Makefile.am
> > @@ -152,7 +152,6 @@ libmesagallium_la_LIBADD = \
> >
> >  libmesa_sse41_la_SOURCES = \
> >         main/streaming-load-memcpy.c
> > -libmesa_sse41_la_CFLAGS = $(AM_CFLAGS) -msse4.1
> 
> We don't need libmesa_sse41.la anymore, so just remove this by adding
> main/streaming-load-memcpy.c to libmesa.la's sources (pending figuring
> out how to handle clang support, see below).

ok.

> >
> >  pkgconfigdir = $(libdir)/pkgconfig
> >  pkgconfig_DATA = gl.pc
> > diff --git a/src/mesa/main/streaming-load-memcpy.c b/src/mesa/main/streaming-load-memcpy.c
> > index 8427149..94b0e0a 100644
> > --- a/src/mesa/main/streaming-load-memcpy.c
> > +++ b/src/mesa/main/streaming-load-memcpy.c
> > @@ -26,7 +26,7 @@
> >   *
> >   */
> >
> > -#ifdef __SSE4_1__
> > +#ifdef USE_SSE41
> >  #include "main/macros.h"
> >  #include "main/streaming-load-memcpy.h"
> >  #include <smmintrin.h>
> 
> This header can only be included when __SSE4_1__ is defined (which is
> enabled today when gcc is using an appropriate -march= setting or
> -msse4.1. How do you propose we work around that? I suppose #defining
> __SSE4_1__ before the inclusion is probably okay?

looking at the header file, I think it can be included, but I'm not sure
what the outcome will be. An alternative is to use the gcc pragma as shown
in the header:

#ifdef USE_SSE41
#pragma GCC push_options
#pragma GCC target("sse4.1")
#include <stuff>
void my_nice_see_func()
#pragma GCC pop_options
 
> > @@ -34,7 +34,7 @@
> >  /* Copies memory from src to dst, using SSE 4.1's MOVNTDQA to get streaming
> >   * read performance from uncached memory.
> >   */
> > -void
> > +void __attribute__ ((target("sse4.1")))
> 
> We need a configure check for support for __attribute__((target)). I'm
> going to send a series that adds support for this (and does the check
> for existing attribute uses, so once that goes in you can rebase this
> patch on that).

nice, but won't work with the workaround above. Pragma and attribute does the same
so, we could check for the attribute and use the pragma instead.

> 
> It doesn't look like clang supports this though, which would be nice
> to support... don't know what to do. clang does support the existing
> -msse4.1 flag, so this would be a regression there. Any ideas for
> handling that?

clang reports gcc 4.2 compatibilty, so USE_SSE4.1 would not be set (tests for
gcc >= 4.4).

Marc
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 490 bytes
Desc: This is a digitally signed message part.
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20140924/13bc556e/attachment.sig>


More information about the mesa-dev mailing list