[Mesa-dev] [PATCH] gallium/util: Fix detection of AVX cpu caps
Andre Heider
a.heider at gmail.com
Tue Jul 23 14:44:45 PDT 2013
On Tue, Jul 23, 2013 at 8:57 PM, Roland Scheidegger <sroland at vmware.com> wrote:
> On 23.07.2013 19:08, Andre Heider wrote:
>> For AVX it's not sufficient to only rely on the cpuid flags. If the CPU
>> supports these extensions, but the OS doesn't, issuing these insns will
>> trigger an undefined opcode exception.
>>
>> In addition to the AVX cpuid bit we also need to:
>> * test cpuid for OSXSAVE support
>> * XGETBV to check if the OS saves/restores AVX regs on context switches
>>
>> See "Detecting Availability and Support" at
>> http://software.intel.com/en-us/articles/introduction-to-intel-advanced-vector-extensions
>>
>> Signed-off-by: Andre Heider <a.heider at gmail.com>
>> ---
>> src/gallium/auxiliary/util/u_cpu_detect.c | 27 +++++++++++++++++++++++++--
>> 1 file changed, 25 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/util/u_cpu_detect.c b/src/gallium/auxiliary/util/u_cpu_detect.c
>> index b118fc8..588fc7c 100644
>> --- a/src/gallium/auxiliary/util/u_cpu_detect.c
>> +++ b/src/gallium/auxiliary/util/u_cpu_detect.c
>> @@ -67,7 +67,7 @@
>>
>> #if defined(PIPE_OS_WINDOWS)
>> #include <windows.h>
>> -#if defined(MSVC)
>> +#if defined(PIPE_CC_MSVC)
>> #include <intrin.h>
>> #endif
>> #endif
>> @@ -211,6 +211,27 @@ cpuid(uint32_t ax, uint32_t *p)
>> p[3] = 0;
>> #endif
>> }
>> +
>> +static INLINE uint64_t xgetbv(void)
>> +{
>> +#if defined(PIPE_CC_GCC)
>> + uint32_t eax, edx;
>> +
>> + __asm __volatile (
>> + ".byte 0x0f, 0x01, 0xd0" // xgetbv isn't supported on gcc < 4.4
>> + : "=a"(eax),
>> + "=d"(edx)
>> + : "c"(0)
>> + );
>> +
>> + return ((uint64_t)edx << 32) | eax;
>> +#elif defined(PIPE_CC_MSVC) && defined(_MSC_FULL_VER) && defined(_XCR_XFEATURE_ENABLED_MASK)
>> + return _xgetbv(_XCR_XFEATURE_ENABLED_MASK);
>> +#else
>> + return 0;
>> +#endif
>> +
>> +}
>> #endif /* X86 or X86_64 */
>>
>> void
>> @@ -284,7 +305,9 @@ util_cpu_detect(void)
>> util_cpu_caps.has_sse4_1 = (regs2[2] >> 19) & 1;
>> util_cpu_caps.has_sse4_2 = (regs2[2] >> 20) & 1;
>> util_cpu_caps.has_popcnt = (regs2[2] >> 23) & 1;
>> - util_cpu_caps.has_avx = (regs2[2] >> 28) & 1;
>> + util_cpu_caps.has_avx = ((regs2[2] >> 28) & 1) && // AVX
>> + ((regs2[2] >> 27) & 1) && // OSXSAVE
>> + ((xgetbv() & 6) == 6); // XMM & YMM
>> util_cpu_caps.has_f16c = (regs2[2] >> 29) & 1;
>> util_cpu_caps.has_mmx2 = util_cpu_caps.has_sse; /* SSE cpus supports mmxext too */
>>
>>
>
> Looks good to me though it's a pity detection depends on compiler.
> Granted it looks like icc currently won't work but still...
> I guess that technically the test for sse(x) isn't correct either, as
> that too requires OS support. I don't know off-hand how to check
> for it, though (and we'd be talking ANCIENT OS here...).
Ancient indeed ;)
But with AVX the problem becomes more urgent: all SSE versions use
the same XMM registers, while AVX extends them to the wider YMM
registers, which the OS has to save and restore as well.
We recently got an AVX-enabled vSphere server, and exposing that to
XP guests doesn't go too well with llvmpipe without this patch.
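
For reference, the check can be sketched as a standalone helper outside
of Mesa. This is only a minimal sketch, assuming GCC or Clang on x86 with
<cpuid.h> available; the names avx_usable_from() and avx_usable() are
made up for illustration and are not part of u_cpu_detect.c. The decision
logic is split from the hardware access so it can be exercised with
arbitrary register values:

```c
#include <assert.h>
#include <stdint.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#define HAVE_X86_GCC 1
#endif

#define CPUID1_ECX_OSXSAVE (1u << 27)  /* OS has enabled XSAVE/XGETBV */
#define CPUID1_ECX_AVX     (1u << 28)  /* CPU implements AVX */
#define XCR0_XMM_AND_YMM   0x6u        /* bit 1: XMM state, bit 2: YMM state */

/* Hypothetical helper: pure decision logic on CPUID.1:ECX and XCR0. */
static int avx_usable_from(uint32_t cpuid1_ecx, uint64_t xcr0)
{
    if (!(cpuid1_ecx & CPUID1_ECX_AVX))
        return 0;
    if (!(cpuid1_ecx & CPUID1_ECX_OSXSAVE))
        return 0;
    /* Both state components must be enabled by the OS. */
    return (xcr0 & XCR0_XMM_AND_YMM) == XCR0_XMM_AND_YMM;
}

/* Hypothetical helper: reads the real registers, returns 0 if unsupported. */
static int avx_usable(void)
{
#ifdef HAVE_X86_GCC
    unsigned int eax, ebx, ecx, edx;
    uint32_t lo, hi;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    /* XGETBV raises #UD unless OSXSAVE is set, so test that bit first. */
    if (!(ecx & CPUID1_ECX_OSXSAVE))
        return 0;
    /* Raw opcode bytes for xgetbv, same as in the patch; ECX = 0 selects XCR0. */
    __asm__ __volatile__(".byte 0x0f, 0x01, 0xd0"
                         : "=a"(lo), "=d"(hi)
                         : "c"(0));
    return avx_usable_from(ecx, ((uint64_t)hi << 32) | lo);
#else
    return 0; /* unknown compiler or architecture: report AVX as unavailable */
#endif
}
```

The order matters: the OSXSAVE bit has to be tested before XGETBV is
executed. The patch gets the same effect from the short-circuit && in
the has_avx expression.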
Thanks,
Andre