[Mesa-dev] [PATCH V5] mesa: add SSE optimisation for glDrawElements

Siavash Eliasi siavashserver at gmail.com
Fri Nov 7 06:09:09 PST 2014


On 11/07/2014 03:14 PM, Steven Newbury wrote:
> On Thu, 2014-11-06 at 21:00 -0800, Matt Turner wrote:
>> On Thu, Nov 6, 2014 at 8:56 PM, Siavash Eliasi <
>> siavashserver at gmail.com> wrote:
>>> Then I do recommend removing the "if (cpu_has_sse4_1)" from this
>>> patch and similar places, because there is no runtime CPU
>>> dispatching happening for SSE optimized code paths in action and
>>> just adds extra overhead (unnecessary branches) to the generated
>>> code.
>> No. Sorry, I realize I misread your previous question:
>>
>>>> I guess checking for "cpu_has_sse4_1" is unnecessary if it isn't
>>>> controllable by user at runtime; because "USE_SSE41" is a
>>>> compile time check and requires the target machine to be SSE 4.1
>>>> capable already.
>> USE_SSE41 is set if the *compiler* supports SSE 4.1. This allows you
>> to build the code and then use it only on systems that actually
>> support it.
>>
>> All of this could have been pretty easily answered by a few greps
>> though...
> I wonder what difference it would make to have an option to compile
> out the run-time check code to avoid the additional overhead in cases
> where the builder *knows* at compile time what the run-time system is?
> (ie Gentoo)
I think that's possible. Since "cpu_has_sse4_1" and friends are simply
macros, they can be defined as "1" at compile time when the build
targets an SSE 4.1 capable machine. The condition is then a constant,
so the compiler can remove the unnecessary runtime check.

I guess "common_x86_features.h" should be modified to something like this:

#ifdef __SSE4_1__
#define cpu_has_sse4_1 1
#else
#define cpu_has_sse4_1        (_mesa_x86_cpu_features & X86_FEATURE_SSE4_1)
#endif
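
To make the effect concrete, here is a minimal standalone sketch of the same pattern. It is not Mesa code: the names "demo_cpu_features", "DEMO_FEATURE_SSE4_1", "demo_cpu_has_sse4_1" and "demo_use_sse41_path" are hypothetical stand-ins, and the bit value is made up for illustration.

```c
/* Hypothetical stand-ins for the Mesa definitions (assumptions, not the
 * real names or values). */
#define DEMO_FEATURE_SSE4_1 (1u << 0)

/* In a real build this word would be filled in by CPU detection at startup. */
unsigned demo_cpu_features = 0;

#ifdef __SSE4_1__
/* The compiler was told the target has SSE 4.1 (e.g. -msse4.1 or a
 * matching -march=), so the check is a compile-time constant. */
#define DEMO_BUILT_FOR_SSE4_1 1
#define demo_cpu_has_sse4_1 1
#else
/* Generic build: consult the feature word at run time. */
#define DEMO_BUILT_FOR_SSE4_1 0
#define demo_cpu_has_sse4_1 (demo_cpu_features & DEMO_FEATURE_SSE4_1)
#endif

/* Returns 1 if the SSE 4.1 code path would be taken, 0 otherwise. */
int demo_use_sse41_path(void)
{
    if (demo_cpu_has_sse4_1)
        return 1;   /* the SSE 4.1 routine would be called here */
    return 0;       /* generic fallback */
}
```

GCC and Clang define "__SSE4_1__" when SSE 4.1 is enabled for the target, so in that case the condition is the literal 1 and the optimiser is free to drop the branch; otherwise the feature word is still tested at run time as before.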

