[Mesa-dev] [PATCH] gallivm: force sse instructions for llvm 3.5+

Roland Scheidegger sroland at vmware.com
Thu Oct 2 07:04:02 PDT 2014


Am 02.10.2014 um 09:31 schrieb Maarten Lankhorst:
> Op 02-10-14 om 09:10 schreef Maarten Lankhorst:
>> Hey,
>>
>> Op 02-10-14 om 04:22 schreef Roland Scheidegger:
>>> Am 01.10.2014 16:56, schrieb Maarten Lankhorst:
>>>> This fixes a crash when llvmpipe tries to use sse instructions,
>>>> but llvm detects a cpu that doesn't support them.
>>>>
>>>> Fixes for example piglit/bin/amd_seamless_cubemap_per_texture -fbo -auto
>>>> on i386 when run inside "qemu -cpu qemu32", which would otherwise error with:
>>>> "LLVM ERROR: Do not know how to split the result of this operator!"
>>>>
>>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst at canonical.com>
>>>> ---
>>>>
>>>> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
>>>> index 55aa8b9..f2f8906 100644
>>>> --- a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
>>>> +++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
>>>> @@ -479,10 +479,38 @@ lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
>>>>        if (util_cpu_caps.has_f16c) {
>>>>           MAttrs.push_back("+f16c");
>>>>        }
>>>> -      builder.setMAttrs(MAttrs);
>>>>     }
>>>>  
>>>>  #if HAVE_LLVM >= 0x0305
>>>> +   /*
>>>> +    * llvm 3.5 no longer supports cpuid based autodetect.
>>>> +    * This breaks on "qemu -cpu qemu32" which is detected as pentium2 by llvm's
>>>> +    * sys::getHostCPUName(), but does support sse2.
>>>> +    *
>>>> +    * For this reason force the use of sse extensions when available, so our
>>>> +    * understanding of the cpu is in sync with llvm's.
>>>> +    */
>>>> +
>>>> +   else if (util_cpu_caps.has_sse4_2)
>>>> +      MAttrs.push_back("+sse42");
>>>> +   else if (util_cpu_caps.has_sse4_1)
>>>> +      MAttrs.push_back("+sse41");
>>>> +   else if (util_cpu_caps.has_ssse3)
>>>> +      MAttrs.push_back("+ssse3");
>>>> +   else if (util_cpu_caps.has_sse3)
>>>> +      MAttrs.push_back("+sse3");
>>>> +   else if (util_cpu_caps.has_sse2)
>>>> +      MAttrs.push_back("+sse2");
>>>> +   else if (util_cpu_caps.has_sse)
>>>> +      MAttrs.push_back("+sse");
>>>> +   else if (util_cpu_caps.has_mmx)
>>>> +      MAttrs.push_back("+sse");
>>>> +
>>>> +   if (util_cpu_caps.has_3dnow_ext)
>>>> +      MAttrs.push_back("+3dnowa");
>>>> +   else if (util_cpu_caps.has_3dnow)
>>>> +      MAttrs.push_back("+3dnow");
>>>> +
>>>>     StringRef MCPU = llvm::sys::getHostCPUName();
>>>>     /*
>>>>      * The cpu bits are no longer set automatically, so need to set mcpu manually.
>>>> @@ -498,6 +526,7 @@ lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
>>>>      */
>>>>     builder.setMCPU(MCPU);
>>>>  #endif
>>>> +   builder.setMAttrs(MAttrs);
>>>>  
>>>>     ShaderMemoryManager *MM = new ShaderMemoryManager();
>>>>     *OutCode = MM->getGeneratedCode();
>>>>
>>> I'm not sure if that's really the right idea, it also misses half the
>>> potential features (avx, f16c, ...).
>>> Maybe using llvm::sys::getHostCPUFeatures() and passing that to mattrs
>>> instead of setting the host cpu name would work?
>>>
>> I was unaware that getHostCPUFeatures existed, but I think passing both getHostCPUName and getHostCPUFeatures would work.
>> Unfortunately it seems getHostCPUFeatures only exists on arm right now, but that should be fixable by re-introducing the cpuid stuff in llvm..
So llvm does no longer have any ability to determine the features on its
own at all? All it can is deduce that from the cpu string (which is
quite fishy, IIRC it actually at some point used a "reduced" cpu string
in case of avx not being available despite the cpu supporting it)? What
a PITA. When this came up last time I was told one should use
llvm::sys::getHostCPUName() and llvm::sys::getHostCPUFeatures() to
determine what llvm thinks it can do, and that was only just back in
April (http://llvm.org/bugs/show_bug.cgi?id=19429).
>From a quick look though it seems you are right and
llvm::sys::getHostCPUFeatures() only ever does anything on arm (that's
what I call inconsistent design...).

> Hm no, it would seem it takes a stringmap, probably mapping whether certain features are available or not, and it doesn't create a nice string for passing along.
> 
> Regarding imirkin's comment, yes the mmx thing is a copy paste error, and newer versions of SSE/AVX imply the previous ones are supported.
> I also don't enable all possible cpu extensions, just the ones used by llvmpipe and gallivm.
llvmpipe/gallivm will require both avx and f16c - it will definitely
crash if it thinks they are available but llvm thinks they are not.
Plus, specifying just half the features is a sure recipe for disaster
later when someone decides to emit some more intrinsics...


> This is to make sure there is no mismatch between
> what llvmpipe expects and what llvm can make of it from the cpu string. I don't really care about the optimal code generation, I would just prefer there are no crashes. :-)
I guess specifying all the MAttrs manually is fine then. Like I said
though, I really think we should specify all.

Roland


> 
> ~Maarten
> 



More information about the mesa-dev mailing list