[Mesa-dev] [PATCH][RFC] mesa/main: Clamp rgba with streamed sse
Timothy Arceri
t_arceri at yahoo.com.au
Mon Nov 3 01:39:56 PST 2014
On Fri, 2014-10-31 at 17:24 +0000, Jose Fonseca wrote:
> On 31/10/14 17:01, Matt Turner wrote:
> > On Fri, Oct 31, 2014 at 4:12 AM, Jose Fonseca <jfonseca at vmware.com> wrote:
> >> On 31/10/14 10:13, Juha-Pekka Heikkila wrote:
> >>>
> >>> defined(__SSE2__) && defined(__GNUC__)
> >>
> >>
> >> Instead of duplicating this expression everywhere, let's create a
> >> "HAVE_SSE2_INTRIN" define. Not only is this expression complex, it will
> >> become even more so when we update it for MSVC.
> >
> > Isn't testing __SSE2__ sufficient? Does MSVC not do this?
> >
> > clang/icc/gcc all implement this and all of the _mm_* intrinsics.
> >
>
> No, __SSE2__ is a GCC-only macro. It's not defined or needed by MSVC
> compilers. And I strongly suspect that Intel compiler probably only
> defines it for GCC compatibility.
>
>
> This is because GCC is quite lame IMO: it can't distinguish between
> "enabling SSE intrinsics" (ie, allow including emmintrin.h and use the
> Intel _mm_* intrinsics) and emitting SSE2 opcodes of its own accord.
> That is, when you pass -msse2 to GCC, you're also giving carte blanche
> for GCC to emit SSE2 opcodes for any C code! Which makes it _very_ hard
> to have special code paths for SSE1/2/3/4/etc and no SSE, since you
> basically need to compile each path in a different C module, passing
> different -msse* flags to each.
So does anyone have a suggestion how this can be better organised? As in
should there be an SSE folder somewhere?
Currently streaming-load-memcpy.c is in mesa/main even though it's only
used by the intel driver. My patch adds another file there, and I've
also noticed this [1], which should be made to use a runtime switch too.
Dumping everything in mesa/main would obviously get messy fast.
[1]
http://cgit.freedesktop.org/mesa/mesa/tree/src/mesa/drivers/dri/i965/intel_tex_subimage.c#n199
>
> Whereas on MSVC, you can #include emmintrin.h any time, anywhere, and
> only the code that uses the intrinsics will generate those opcodes. So
> you can have an awesomeFunctionC(), awesomeFunctionSSE2(),
> awesomeFunctionAVX() all next to each other, and a switch table to jump
> into them.
>
>
> In other words, on MSVC, instead of
>
> #if defined(__SSE2__) && defined(__GNUC__)
>
> all you need is
>
> #if 1
>
> or
>
> #if defined(_M_IX86) || defined(_M_X64)
>
> if you want the code not to cause problems when targeting non-x86
> architectures.
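
For illustration only, here is a minimal sketch of what a single
HAVE_SSE2_INTRIN define along those lines might look like. The exact
condition and the clamp01() helper are assumptions made up for this
example, not code from the patch or from Mesa. The scalar loop handles
the tail, and is the only path on builds where the define is 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature macro, named after the suggestion in this thread.
 * GCC-compatible compilers advertise SSE2 through __SSE2__; MSVC needs no
 * such flag, so only the target architecture is checked there. */
#if (defined(__SSE2__) && defined(__GNUC__)) || \
    (defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64)))
#define HAVE_SSE2_INTRIN 1
#include <emmintrin.h>
#else
#define HAVE_SSE2_INTRIN 0
#endif

/* Clamp n floats from src into [0, 1] and store them in dst.
 * Hypothetical helper, used only to show the macro in action. */
static void clamp01(float *dst, const float *src, size_t n)
{
    size_t i = 0;

    assert(dst != NULL && src != NULL);

#if HAVE_SSE2_INTRIN
    const __m128 zero = _mm_setzero_ps();
    const __m128 one = _mm_set1_ps(1.0f);

    /* Four floats per iteration; unaligned loads and stores keep the
     * sketch free of alignment requirements. */
    for (; i + 4 <= n; i += 4) {
        __m128 v = _mm_loadu_ps(src + i);
        v = _mm_min_ps(_mm_max_ps(v, zero), one);
        _mm_storeu_ps(dst + i, v);
    }
#endif

    /* Scalar path: remaining elements, or everything without SSE2. */
    for (; i < n; i++) {
        float v = src[i];
        dst[i] = v > 0.0f ? (v < 1.0f ? v : 1.0f) : 0.0f;
    }
}
```

Call sites would then test "#if HAVE_SSE2_INTRIN" instead of repeating
the compiler-specific expression.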
>
>
>
> Of course there's some merit in GCC emitting SSE instructions for plain C
> code, but let's face it: virtually all the code that can benefit from
> SIMD is too complex to be auto-vectorized by compilers, and needs humans
> writing code with SSE intrinsics. So GCC is effectively tailored to make
> the rare thing easy, at the expense of making the common thing hard...
>
>
> I believe recent GCC versions have better support for having specialized
> SSE code side-by-side. But from what I remember, it is all pretty
> non-standard and GCC-specific, so still pretty useless for portable code.
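
As a rough sketch of that GCC-specific route, combined with a runtime
switch: the target function attribute and __builtin_cpu_supports() are
GCC extensions (clang accepts them too), and using intrinsics inside a
target-attributed function without passing -msse2 needs GCC 4.9 or
later. The names clamp_rgba_c, clamp_rgba_sse2, select_clamp and
USE_SSE2_PATH are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Only take the GCC-specific route on x86 with a GCC-compatible compiler. */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define USE_SSE2_PATH 1
#include <emmintrin.h>
#else
#define USE_SSE2_PATH 0
#endif

typedef void (*clamp_fn)(float *dst, const float *src, size_t n);

/* Plain C fallback, usable on any architecture. */
static void clamp_rgba_c(float *dst, const float *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++) {
        float v = src[i];
        dst[i] = v > 0.0f ? (v < 1.0f ? v : 1.0f) : 0.0f;
    }
}

#if USE_SSE2_PATH
/* The attribute enables SSE2 for this one function only, so the rest of
 * the file is still compiled for the baseline target. */
__attribute__((target("sse2")))
static void clamp_rgba_sse2(float *dst, const float *src, size_t n)
{
    const __m128 zero = _mm_setzero_ps();
    const __m128 one = _mm_set1_ps(1.0f);
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        __m128 v = _mm_loadu_ps(src + i);
        v = _mm_min_ps(_mm_max_ps(v, zero), one);
        _mm_storeu_ps(dst + i, v);
    }
    /* Scalar tail for the last n % 4 elements. */
    for (; i < n; i++) {
        float v = src[i];
        dst[i] = v > 0.0f ? (v < 1.0f ? v : 1.0f) : 0.0f;
    }
}
#endif

/* Runtime switch: pick an implementation once, based on the CPU. */
static clamp_fn select_clamp(void)
{
#if USE_SSE2_PATH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse2"))
        return clamp_rgba_sse2;
#endif
    return clamp_rgba_c;
}
```

A caller would fetch the pointer once at startup and call through it
afterwards. This keeps both paths in one file, at the cost of being
useless for MSVC, which is the portability concern raised above.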
>
>
> Jose