[Mesa-dev] [PATCH] gallivm: force sse instructions for llvm 3.5+
Maarten Lankhorst
maarten.lankhorst at canonical.com
Thu Oct 2 00:31:04 PDT 2014
Op 02-10-14 om 09:10 schreef Maarten Lankhorst:
> Hey,
>
> Op 02-10-14 om 04:22 schreef Roland Scheidegger:
>> Am 01.10.2014 16:56, schrieb Maarten Lankhorst:
>>> This fixes a crash when llvmpipe tries to use sse instructions,
>>> but llvm detects a cpu that doesn't support them.
>>>
>>> Fixes for example piglit/bin/amd_seamless_cubemap_per_texture -fbo -auto
>>> on i386 when run inside "qemu -cpu qemu32", which would otherwise error with:
>>> "LLVM ERROR: Do not know how to split the result of this operator!"
>>>
>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst at canonical.com>
>>> ---
>>>
>>> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
>>> index 55aa8b9..f2f8906 100644
>>> --- a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
>>> +++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
>>> @@ -479,10 +479,38 @@ lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
>>> if (util_cpu_caps.has_f16c) {
>>> MAttrs.push_back("+f16c");
>>> }
>>> - builder.setMAttrs(MAttrs);
>>> }
>>>
>>> #if HAVE_LLVM >= 0x0305
>>> + /*
>>> + * llvm 3.5 no longer supports cpuid based autodetect.
>>> + * This breaks on "qemu -cpu qemu32" which is detected as pentium2 by llvm's
>>> + * sys::getHostCPUName(), but does support sse2.
>>> + *
>>> + * For this reason force the use of sse extensions when available, so our
>>> + * understanding of the cpu is in sync with llvm's.
>>> + */
>>> +
>>> + else if (util_cpu_caps.has_sse4_2)
>>> + MAttrs.push_back("+sse42");
>>> + else if (util_cpu_caps.has_sse4_1)
>>> + MAttrs.push_back("+sse41");
>>> + else if (util_cpu_caps.has_ssse3)
>>> + MAttrs.push_back("+ssse3");
>>> + else if (util_cpu_caps.has_sse3)
>>> + MAttrs.push_back("+sse3");
>>> + else if (util_cpu_caps.has_sse2)
>>> + MAttrs.push_back("+sse2");
>>> + else if (util_cpu_caps.has_sse)
>>> + MAttrs.push_back("+sse");
>>> + else if (util_cpu_caps.has_mmx)
>>> + MAttrs.push_back("+sse");
>>> +
>>> + if (util_cpu_caps.has_3dnow_ext)
>>> + MAttrs.push_back("+3dnowa");
>>> + else if (util_cpu_caps.has_3dnow)
>>> + MAttrs.push_back("+3dnow");
>>> +
>>> StringRef MCPU = llvm::sys::getHostCPUName();
>>> /*
>>> * The cpu bits are no longer set automatically, so need to set mcpu manually.
>>> @@ -498,6 +526,7 @@ lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
>>> */
>>> builder.setMCPU(MCPU);
>>> #endif
>>> + builder.setMAttrs(MAttrs);
>>>
>>> ShaderMemoryManager *MM = new ShaderMemoryManager();
>>> *OutCode = MM->getGeneratedCode();
>>>
>> I'm not sure if that's really the right idea, it also misses half the
>> potential features (avx, f16c, ...).
>> Maybe using llvm::sys::getHostCPUFeatures() and passing that to mattrs
>> instead of setting the host cpu name would work?
>>
> I was unaware that getHostCPUFeatures existed, but I think passing both getHostCPUName and getHostCPUFeatures would work.
> Unfortunately it seems getHostCPUFeatures only exists on arm right now, but that should be fixable by re-introducing the cpuid stuff in llvm..
Hm no, it would seem it takes a stringmap, probably mapping whether certain features are available or not, and it doesn't create a nice string for passing along.
Regarding imirkin's comment, yes the mmx thing is a copy paste error, and newer versions of SSE/AVX imply the previous ones are supported.
I also don't enable all possible cpu extensions, just the ones used by llvmpipe and gallivm. This is to make sure there is no mismatch between
what llvmpipe expects and what llvm can make of it from the cpu string. I don't really care about the optimal code generation, I would just prefer there are no crashes. :-)
~Maarten
More information about the mesa-dev
mailing list