[Mesa-dev] [PATCH 1/4] meson: Enable SSE4.1 optimizations
Emil Velikov
emil.l.velikov at gmail.com
Mon Nov 20 19:09:16 UTC 2017
On 20 November 2017 at 18:02, Matt Turner <mattst88 at gmail.com> wrote:
> On Mon, Nov 20, 2017 at 3:47 AM, Emil Velikov <emil.l.velikov at gmail.com> wrote:
>> On 17 November 2017 at 20:46, Matt Turner <mattst88 at gmail.com> wrote:
>>> On Fri, Nov 17, 2017 at 12:34 PM, Dylan Baker <dylan at pnwbakers.com> wrote:
>>>> Quoting Emil Velikov (2017-11-17 03:11:50)
>>>>> On 16 November 2017 at 22:21, Dylan Baker <dylan at pnwbakers.com> wrote:
>>>>> > Quoting Emil Velikov (2017-11-16 03:35:17)
>>>>> >> Hi Dylan,
>>>>> >>
>>>>> >> On 16 November 2017 at 01:10, Dylan Baker <dylan at pnwbakers.com> wrote:
>>>>> >> > This patch checks for and then enables sse4.1 optimizations if the
>>>>> >> > host machine will be x86/x86_64.
>>>>> >> >
>>>>> >> Hell yeah, SSE is coming to town :-)
>>>>> >>
>>>>> >> Will this work if the user disables SSE4.1, say via CFLAGS=-mno-sse4.1
>>>>> >> meson ...?
>>>>> >> My meson is still a bit rough, so I could not quite grok ^^ by reading
>>>>> >> through the patch.
>>>>> >>
>>>>> >> Thanks
>>>>> >> Emil
>>>>> >
>>>>> > It'll explode horribly. I didn't see any special handling of that in the
>>>>> > autotools build either, though. Did I miss something?
>>>>> >
>>>>> In autotools it's handled before the normal ld invocation.
>>>>>
>>>>> Namely: configure.ac does:
>>>>> - construct a program using sse4.1 intrinsics
>>>>> Note: return _mm_...() is required otherwise the whole program will be
>>>>> optimised away
>>>>> - the -msse is passed first and then the user flags (-mno-sse and/or
>>>>> anything else)
>>>>> - the user -mno-sse takes precedence, hence the test program fails to build
>>>>> - set sse_supported=false and don't build the SSE optimised static library
>>>>>
>>>>> HTH
>>>>> Emil
>>>>
>>>> That's an interesting question. So arguments passed via CFLAGS and friends will
>>>> be passed to tests, but the arguments passed explicitly to those tests are
>>>> appended, so -msse4.1 will take precedence. I'm also pretty sure there isn't a
>>>> way to check the arguments passed via -Dc_args or CFLAGS (they're treated as
>>>> default arguments, like the c_std in the project() argument). I asked on
>>>> #mesonbuild, but I haven't gotten an answer yet (Fridays are pretty slow
>>>> everywhere).
>>>>
>>>> I think currently the only way to control this would be to have a meson option
>>>> to turn off optimizations, and I really don't like that.
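The two orderings described in this thread (autotools putting its own flag before the user's flags, meson appending its explicit test arguments after them) can be modelled with a small sketch. It assumes the compiler resolves conflicting -m<feature>/-mno-<feature> flags in favour of the last one on the command line, as GCC and Clang generally do, and it ignores -march=... entirely. The function names are hypothetical and are not part of either build system.

```python
def feature_enabled(flags, feature="sse4.1"):
    """Return True if the last -m<feature>/-mno-<feature> flag enables it.

    Assumption: the last matching flag wins, and the feature is off
    when no flag mentions it (-march=... is not modelled).
    """
    enabled = False
    for flag in flags:
        if flag == f"-m{feature}":
            enabled = True
        elif flag == f"-mno-{feature}":
            enabled = False
    return enabled


def autotools_style(user_cflags):
    # Build-system flag first, user flags afterwards: the user can override.
    return ["-msse4.1"] + list(user_cflags)


def meson_style(user_cflags):
    # User flags first, explicit test arguments appended: the test wins.
    return list(user_cflags) + ["-msse4.1"]


user_cflags = ["-O2", "-mno-sse4.1"]
print(feature_enabled(autotools_style(user_cflags)))  # False
print(feature_enabled(meson_style(user_cflags)))      # True
```

Under these assumptions the autotools-style check fails for a user passing -mno-sse4.1 (so the SSE library is skipped), while the meson-style check still succeeds.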
>>>
>>> The original bug report of "Mesa doesn't build with -mno-sse4.1" was
>>> in Gentoo's bugzilla; https://bugs.gentoo.org/503828
>>>
>>> There's no compelling reason to support that configuration: since the
>>> -msseX flags are off by default, the user must be enabling them somehow
>>> (likely via -march=...) for the -mno-sseX flag to do anything. Using
>>> -march=... only to disable particular instruction sets seems pretty
>>> idiotic.
>>>
>> I'm confused - it isn't the user but Mesa's build system which enables
>> -msseX, right?
>> Using -msseX is a good thing, but if the binary produced causes bugs
>> the builder/user has no way to disable it.
>
> There are two cases: code enabled at compile-time and code enabled at runtime.
>
> At compile-time we choose either the SSE2 or SSSE3 paths in
> src/mesa/drivers/dri/i965/intel_tiled_memcpy.c, based only on whether
> the compiler is enabled to use those instruction sets. If -mssse3 or
> some -march=... value is specified that enables SSSE3, then the SSSE3
> code will be enabled. Otherwise the SSE2 code is enabled (even on
> 32-bit systems since we always add -msse2 to CFLAGS in i965's
> Makefile.am).
>
> At runtime, we choose whether to execute
> src/mesa/main/streaming-load-memcpy.c based on whether the CPU
> supports the SSE 4.1 instruction set. It's safe (and good!) to always
> build the code. It won't be executed unless supported.
>
> The problem is that in order to build it, we have to tell the compiler
> it's okay to use SSE 4.1 instructions. Think of Debian for instance.
> They want to ship builds that will run on the oldest AMD64 system
> (without SSE 4.1) but would also like the code to exist and be
> executed where possible. As a result we need to build just that source
> file with -msse4.1. Again, this is safe because we have code that
> checks the CPU's capabilities at runtime and decides whether it's safe
> to execute it.
>
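The runtime case described above (always build the SSE 4.1 code, only execute it when the CPU supports it) follows a simple dispatch pattern. The sketch below is a model in Python, not Mesa's actual code: the function names are hypothetical, both copy routines are stand-ins that just copy bytes, and the feature name "sse4_1" is assumed to follow the spelling Linux uses in /proc/cpuinfo.

```python
def copy_generic(src):
    """Portable fallback path; always safe to run."""
    return bytes(src)


def copy_streaming_sse41(src):
    """Stand-in for the SSE 4.1 path (here it simply copies the bytes)."""
    return bytes(src)


def select_copy(cpu_features):
    """Pick an implementation from the set of features the CPU reports.

    Both functions always exist in the build; only the selection
    depends on the machine the code ends up running on.
    """
    if "sse4_1" in cpu_features:
        return copy_streaming_sse41
    return copy_generic


old_cpu = {"sse", "sse2"}
new_cpu = {"sse", "sse2", "ssse3", "sse4_1"}
print(select_copy(old_cpu).__name__)  # copy_generic
print(select_copy(new_cpu).__name__)  # copy_streaming_sse41
```

The point of the pattern is that a distribution can ship one binary: the optimised path is present everywhere but is only ever selected on hardware that reports the feature.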
In case I wasn't clear earlier - I fully support adding optimised hot paths.

> Now, a clever user specifies -mno-sse4.1. This throws a wrench into
> the system. Depending on the order the CFLAGS appear, the check of
> "does my compiler support emitting SSE 4.1 instructions" will succeed
> or fail.
>
> The reasons users in the bug [1] seem to be using -mno-sse4.1 are (1)
> no reason, using *all* the CFLAGS; (2) building packages on a system
> with SSE 4.1 for a system without (i.e., using the wrong -march= value
> and trying to compensate); (3) using -march=native with distcc, which
> cannot work. All of these are misuses.
>
> To your question of "what if users need a way to disable the code
> because of bugs?": Sorry, that's absurd. CFLAGS aren't a solution to
> bugs. Also, the code just had its fourth birthday and the only bugs
> have been build system madness (like [1]).
>
I was under the impression that using -mno-sseX or the like is a common
way to work around issues.
Seems like that's not the case - in which case forget I said anything.
Thanks for the help Matt.
Emil