[Mesa-dev] [PATCH][RFC] mesa/main: Clamp rgba with streamed sse
Jose Fonseca
jfonseca at vmware.com
Fri Oct 31 10:24:20 PDT 2014
On 31/10/14 17:01, Matt Turner wrote:
> On Fri, Oct 31, 2014 at 4:12 AM, Jose Fonseca <jfonseca at vmware.com> wrote:
>> On 31/10/14 10:13, Juha-Pekka Heikkila wrote:
>>>
>>> defined(__SSE2__) && defined(__GNUC__)
>>
>>
>> Instead of duplicating this expression everywhere, let's create a
>> "HAVE_SSE2_INTRIN" define. Not only is this expression complex, it will
>> become even more so when we update it for MSVC.
>
> Isn't testing __SSE2__ sufficient? Does MSVC not do this?
>
> clang/icc/gcc all implement this and all of the _mm_* intrinsics.
>
No, __SSE2__ is a GCC-specific macro. MSVC neither defines nor needs
it, and I strongly suspect the Intel compiler only defines it for GCC
compatibility.
This is because GCC is quite lame IMO: it can't distinguish between
"enabling SSE intrinsics" (i.e., allowing you to include emmintrin.h and
use the Intel _mm_* intrinsics) and emitting SSE2 opcodes of its own accord.
That is, when you pass -msse2 to GCC, you're also giving GCC carte blanche
to emit SSE2 opcodes for any C code! That makes it _very_ hard
to have special code paths for SSE1/2/3/4/etc. and no SSE, since you
basically need to compile each path in a different C module, passing
different -msse* flags to each.
Whereas on MSVC, you can #include emmintrin.h any time, anywhere, and
only the code that uses the intrinsics will generate those opcodes. So
you can have an awesomeFunctionC(), awesomeFunctionSSE2(), and
awesomeFunctionAVX() all next to each other, and a switch table to jump
into them.
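As a rough sketch of that side-by-side layout (the function names, the
EXAMPLE_SSE2_PATH guard, and the table are made up for illustration, and
the CPU detection that would pick the table index is left out):

```c
#include <stddef.h>

/* Hypothetical guard: MSVC on x86, or a GCC-style compiler with SSE2 enabled. */
#if defined(_M_IX86) || defined(_M_X64) || defined(__SSE2__)
#include <emmintrin.h>
#define EXAMPLE_SSE2_PATH 1
#else
#define EXAMPLE_SSE2_PATH 0
#endif

typedef void (*clamp_fn)(short *dst, const short *src, size_t n);

/* Plain C path: clamp each 16-bit value to [0, 255]. */
static void clamp255_c(short *dst, const short *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        short v = src[i];
        dst[i] = (short)(v < 0 ? 0 : (v > 255 ? 255 : v));
    }
}

#if EXAMPLE_SSE2_PATH
/* SSE2 path: eight 16-bit lanes per iteration, scalar loop for the tail. */
static void clamp255_sse2(short *dst, const short *src, size_t n)
{
    const __m128i lo = _mm_setzero_si128();
    const __m128i hi = _mm_set1_epi16(255);
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m128i v = _mm_loadu_si128((const __m128i *)(src + i));
        v = _mm_min_epi16(_mm_max_epi16(v, lo), hi);
        _mm_storeu_si128((__m128i *)(dst + i), v);
    }
    for (; i < n; i++) {
        short v = src[i];
        dst[i] = (short)(v < 0 ? 0 : (v > 255 ? 255 : v));
    }
}
#endif

/* The "switch table": index chosen once at startup by CPU detection. */
enum { PATH_C = 0, PATH_SSE2 = 1, PATH_COUNT };

static const clamp_fn clamp_paths[PATH_COUNT] = {
    clamp255_c,
#if EXAMPLE_SSE2_PATH
    clamp255_sse2
#else
    clamp255_c   /* no SSE2 path compiled in: fall back to plain C */
#endif
};
```

A caller would then do something like clamp_paths[level](dst, src, n),
with level computed once from whatever CPU detection the project uses.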
In other words, on MSVC, instead of
#if defined(__SSE2__) && defined(__GNUC__)
all you need is
#if 1
or
#if defined(_M_IX86) || defined(_M_X64)
if you want the code not to cause problems when targeting non-x86
architectures.
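Putting the two conditions together, the HAVE_SSE2_INTRIN define
suggested earlier might look roughly like the sketch below. The exact
conditions and the 0/1 convention are my assumption here, not something
that exists in the tree, and add4_i32() is only a hypothetical user of
the macro:

```c
/* Hypothetical central definition, e.g. in one shared header. */
#if defined(__GNUC__) && defined(__SSE2__)
/* GCC-style compilers: intrinsics are only usable when -msse2 is in effect. */
#  define HAVE_SSE2_INTRIN 1
#elif defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
/* MSVC: intrinsics are always available when targeting x86. */
#  define HAVE_SSE2_INTRIN 1
#else
#  define HAVE_SSE2_INTRIN 0
#endif

#if HAVE_SSE2_INTRIN
#include <emmintrin.h>
#endif

/* Adds four 32-bit integers element-wise: dst[i] = a[i] + b[i]. */
static void add4_i32(int *dst, const int *a, const int *b)
{
#if HAVE_SSE2_INTRIN
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)dst, _mm_add_epi32(va, vb));
#else
    for (int i = 0; i < 4; i++)
        dst[i] = a[i] + b[i];
#endif
}
```

Call sites then only test HAVE_SSE2_INTRIN, and the per-compiler logic
lives in one place.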
Of course there's some merit in GCC emitting SSE instructions for plain C
code, but let's face it: virtually all the code that can benefit from
SIMD is too complex to be auto-vectorized by compilers, and needs humans
writing code with SSE intrinsics. So GCC is effectively tailored to make
the rare thing easy, at the expense of making the common thing hard...
I believe recent GCC versions have better support for having specialized
SSE code side-by-side. But from what I remember of it, it is all pretty
non-standard and GCC-specific, so still pretty useless for portable code.
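For reference, the GCC mechanism I have in mind is probably the
per-function target attribute combined with __builtin_cpu_supports().
A minimal sketch, assuming GCC 4.9 or later on x86 (where emmintrin.h
can be included without -msse2); the function names and the
EXAMPLE_HAVE_TARGET_ATTR guard are made up for illustration, and every
other compiler simply takes the plain C path:

```c
#include <stddef.h>

/* Hypothetical guard: only enable the attribute path on GCC >= 4.9, x86. */
#if defined(__GNUC__) && !defined(__clang__) && \
    (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 9)) && \
    (defined(__i386__) || defined(__x86_64__))
#include <emmintrin.h>
#define EXAMPLE_HAVE_TARGET_ATTR 1
#else
#define EXAMPLE_HAVE_TARGET_ATTR 0
#endif

/* Plain C path. */
static int sum_i32_c(const int *v, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

#if EXAMPLE_HAVE_TARGET_ATTR
/* SSE2 is enabled for this one function only, not for the whole file. */
__attribute__((target("sse2")))
static int sum_i32_sse2(const int *v, size_t n)
{
    __m128i acc = _mm_setzero_si128();
    int lanes[4];
    int total;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        acc = _mm_add_epi32(acc, _mm_loadu_si128((const __m128i *)(v + i)));
    _mm_storeu_si128((__m128i *)lanes, acc);
    total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
    for (; i < n; i++)   /* leftover elements */
        total += v[i];
    return total;
}
#endif

/* Picks a path at run time; falls back to plain C everywhere else. */
static int sum_i32(const int *v, size_t n)
{
#if EXAMPLE_HAVE_TARGET_ATTR
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse2"))
        return sum_i32_sse2(v, n);
#endif
    return sum_i32_c(v, n);
}
```

None of this compiles on MSVC without the guard, which is the
portability problem in a nutshell.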
Jose