[Mesa-dev] [PATCH 1/4] meson: Enable SSE4.1 optimizations

Matt Turner mattst88 at gmail.com
Mon Nov 20 18:02:49 UTC 2017


On Mon, Nov 20, 2017 at 3:47 AM, Emil Velikov <emil.l.velikov at gmail.com> wrote:
> On 17 November 2017 at 20:46, Matt Turner <mattst88 at gmail.com> wrote:
>> On Fri, Nov 17, 2017 at 12:34 PM, Dylan Baker <dylan at pnwbakers.com> wrote:
>>> Quoting Emil Velikov (2017-11-17 03:11:50)
>>>> On 16 November 2017 at 22:21, Dylan Baker <dylan at pnwbakers.com> wrote:
>>>> > Quoting Emil Velikov (2017-11-16 03:35:17)
>>>> >> Hi Dylan,
>>>> >>
>>>> >> On 16 November 2017 at 01:10, Dylan Baker <dylan at pnwbakers.com> wrote:
>>>> >> > This patch checks for and then enables sse4.1 optimizations if the
>>>> >> > host machine will be x86/x86_64.
>>>> >> >
>>>> >> Hell yeah, SSE is coming to town :-)
>>>> >>
>>>> >> Will this work if the user disables SSE4.1, say via CFLAGS=-mno-sse4.1
>>>> >> meson ...?
>>>> >> My meson is still a bit rough, so I could not quite grok ^^ by reading
>>>> >> through the patch.
>>>> >>
>>>> >> Thanks
>>>> >> Emil
>>>> >
>>>> > It'll explode horribly. I didn't see any special handling of that in autotools
>>>> > build though either, did I miss something?
>>>> >
>>>> In autotools it's handled before the normal ld invocation.
>>>>
>>>> Namely: configure.ac does:
>>>>  - construct a program using sse4.1 intrinsics
>>>> Note: return _mm_...() is required otherwise the whole program will be
>>>> optimised away
>>>>  - the -msse is passed first and then the user flags (-mno-sse and/or
>>>> anything else)
>>>>  - the user -mno-sse takes precedence, hence the test program fails to build
>>>>  - set sse_supported=false and don't build the SSE optimised static library
>>>>
>>>> HTH
>>>> Emil
>>>
>>> That's an interesting question. So arguments passed via CFLAGS and friends will
>>> be passed to tests, but the arguments passed explicitly to those tests are
>>> appended, so -msse4.1 will take precedence. I'm also pretty sure there isn't a
>>> way to check the arguments passed via -Dc_args or CFLAGS (they're treated as
>>> default arguments, like the c_std in the project() argument). I asked on
>>> #mesonbuild, but I haven't gotten an answer yet (Fridays are pretty slow
>>> everywhere).
>>>
>>> I think currently the only way to control this would be to have a meson option
>>> to turn off optimizations, and I really don't like that.
>>
>> The original bug report of "Mesa doesn't build with -mno-sse4.1" was
>> in Gentoo's bugzilla; https://bugs.gentoo.org/503828
>>
>> There's no compelling reason to support that configuration because
>> the -msseX flags are off by default... in order for the
>> -mno-sseX flag to do anything the user must be enabling them somehow
>> (likely via -march=...). Using -march=... only to disable particular
>> instruction sets seems pretty idiotic.
>>
> I'm confused - it isn't the user but Mesa's build system which enables
> -msseX, right?
> Using -msseX is a good thing, but if the binary produced causes bugs
> the builder/user has no way to disable it.

There are two cases: code enabled at compile-time and code enabled at runtime.

At compile-time we choose either the SSE2 or SSSE3 paths in
src/mesa/drivers/dri/i965/intel_tiled_memcpy.c, based only on whether
the compiler is enabled to use those instruction sets. If -mssse3 or
some -march=... value is specified that enables SSSE3, then the SSSE3
code will be enabled. Otherwise the SSE2 code is enabled (even on
32-bit systems since we always add -msse2 to CFLAGS in i965's
Makefile.am).
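
As a rough illustration of the compile-time case (a sketch only, not the
actual intel_tiled_memcpy.c code; copy_path_name is a made-up helper):
the compiler predefines __SSSE3__ or __SSE2__ when the corresponding
instruction set is enabled, and the preprocessor picks the path from
that.

```c
/* Hypothetical helper: reports which code path the preprocessor selected.
 * __SSSE3__ and __SSE2__ are predefined by GCC and Clang when -mssse3 or
 * -msse2 (or a -march= value implying them) is in effect. */
static const char *copy_path_name(void)
{
#if defined(__SSSE3__)
    return "ssse3";    /* compiler was allowed to emit SSSE3 */
#elif defined(__SSE2__)
    return "sse2";     /* fallback path on x86 */
#else
    return "generic";  /* neither enabled, e.g. a non-x86 target */
#endif
}
```

Nothing is decided at runtime here; the answer is baked into the object
file by whatever flags that file was compiled with.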

At runtime, we choose whether to execute
src/mesa/main/streaming-load-memcpy.c based on whether the CPU
supports the SSE 4.1 instruction set. It's safe (and good!) to always
build the code. It won't be executed unless supported.

The problem is that in order to build it, we have to tell the compiler
it's okay to use SSE 4.1 instructions. Think of Debian for instance.
They want to ship builds that will run on the oldest AMD64 system
(without SSE 4.1) but would also like the code to exist and be
executed where possible. As a result we need to build just that source
file with -msse4.1. Again, this is safe because we have code that
checks the CPU's capabilities at runtime and decides whether it's safe
to execute it.
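
A minimal sketch of that kind of runtime gate, assuming GCC or Clang on
x86 (this is not Mesa's actual detection code; copy_streaming,
copy_generic, cpu_has_sse41 and select_copy are names invented for the
example, and both copy routines are plain memcpy stand-ins):

```c
#include <stddef.h>
#include <string.h>

typedef void *(*copy_fn)(void *dst, const void *src, size_t len);

/* Stand-in for the SSE 4.1 routine. In a real build this would live in
 * its own source file, the only one compiled with -msse4.1. */
static void *copy_streaming(void *dst, const void *src, size_t len)
{
    return memcpy(dst, src, len);
}

/* Baseline routine, built with the default flags. */
static void *copy_generic(void *dst, const void *src, size_t len)
{
    return memcpy(dst, src, len);
}

/* Ask the CPU, not the compiler flags, whether SSE 4.1 is available. */
static int cpu_has_sse41(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse4.1") ? 1 : 0;
#else
    return 0;  /* unknown compiler or non-x86: never take the SSE path */
#endif
}

/* The SSE 4.1 routine is only ever called when the CPU reports support. */
static copy_fn select_copy(void)
{
    return cpu_has_sse41() ? copy_streaming : copy_generic;
}
```

The point is that the caller never touches copy_streaming directly, so a
binary containing it still runs on a CPU without SSE 4.1.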

Now, a clever user specifies -mno-sse4.1. This throws a wrench into
the system. Depending on the order the CFLAGS appear, the check of
"does my compiler support emitting SSE 4.1 instructions" will succeed
or fail.
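
The check in question is roughly a tiny program like the one below. This
is a sketch modelled on Emil's description of configure.ac earlier in
the thread, not a copy of it; the name sse41_probe and the #ifdef guard
are my additions so that it compiles either way. Without the guard, the
unguarded version simply fails to build when SSE 4.1 ends up disabled,
and with GCC it is the last of -msse4.1 and -mno-sse4.1 on the command
line that decides.

```c
#ifdef __SSE4_1__
#include <smmintrin.h>  /* SSE 4.1 intrinsics */
#endif

/* Returns the low lane of max(param, param + 1) when SSE 4.1 is enabled
 * for this translation unit, or -1 when it is not. The result is
 * returned so the compiler cannot optimise the intrinsic away. */
static int sse41_probe(int param)
{
#ifdef __SSE4_1__
    __m128i a = _mm_set1_epi32(param);
    __m128i b = _mm_set1_epi32(param + 1);
    return _mm_cvtsi128_si32(_mm_max_epu32(a, b));
#else
    (void)param;
    return -1;  /* the final flag ordering left SSE 4.1 disabled */
#endif
}
```

Compiling the same file once with "-msse4.1 -mno-sse4.1" and once with
the flags swapped shows the two outcomes.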

The reasons users in the bug [1] seem to be using -mno-sse4.1 are (1)
no reason, using *all* the CFLAGS; (2) building packages on a system
with SSE 4.1 for a system without (i.e., using the wrong -march= value
and trying to compensate); (3) using -march=native with distcc, which
cannot work. All of these are misuses.

To your question of "what if users need a way to disable the code
because of bugs?": Sorry, that's absurd. CFLAGS aren't a solution to
bugs. Also, the code just had its fourth birthday and the only bugs
have been build system madness (like [1]).

[1] https://bugs.gentoo.org/503828

