[Mesa-dev] [PATCH] gallivm: force sse instructions for llvm 3.5+
Roland Scheidegger
sroland at vmware.com
Thu Oct 2 09:19:48 PDT 2014
Am 02.10.2014 um 17:36 schrieb Jose Fonseca:
> On 02/10/14 15:04, Roland Scheidegger wrote:
>> Am 02.10.2014 um 09:31 schrieb Maarten Lankhorst:
>>> Op 02-10-14 om 09:10 schreef Maarten Lankhorst:
>>>> Hey,
>>>>
>>>> Op 02-10-14 om 04:22 schreef Roland Scheidegger:
>>>>> Am 01.10.2014 16:56, schrieb Maarten Lankhorst:
>>>>>> This fixes a crash when llvmpipe tries to use sse instructions,
>>>>>> but llvm detects a cpu that doesn't support them.
>>>>>>
>>>>>> Fixes for example piglit/bin/amd_seamless_cubemap_per_texture -fbo
>>>>>> -auto
>>>>>> on i386 when run inside "qemu -cpu qemu32", which would otherwise
>>>>>> error with:
>>>>>> "LLVM ERROR: Do not know how to split the result of this operator!"
>>>>>>
>>>>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst at canonical.com>
>>>>>> ---
>>>>>>
>>>>>> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
>>>>>> b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
>>>>>> index 55aa8b9..f2f8906 100644
>>>>>> --- a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
>>>>>> +++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
>>>>>> @@ -479,10 +479,38 @@
>>>>>> lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef
>>>>>> *OutJIT,
>>>>>> if (util_cpu_caps.has_f16c) {
>>>>>> MAttrs.push_back("+f16c");
>>>>>> }
>>>>>> - builder.setMAttrs(MAttrs);
>>>>>> }
>>>>>>
>>>>>> #if HAVE_LLVM >= 0x0305
>>>>>> + /*
>>>>>> + * llvm 3.5 no longer supports cpuid based autodetect.
>>>>>> + * This breaks on "qemu -cpu qemu32" which is detected as
>>>>>> pentium2 by llvm's
>>>>>> + * sys::getHostCPUName(), but does support sse2.
>>>>>> + *
>>>>>> + * For this reason force the use of sse extensions when
>>>>>> available, so our
>>>>>> + * understanding of the cpu is in sync with llvm's.
>>>>>> + */
>>>>>> +
>>>>>> + else if (util_cpu_caps.has_sse4_2)
>>>>>> + MAttrs.push_back("+sse42");
>>>>>> + else if (util_cpu_caps.has_sse4_1)
>>>>>> + MAttrs.push_back("+sse41");
>>>>>> + else if (util_cpu_caps.has_ssse3)
>>>>>> + MAttrs.push_back("+ssse3");
>>>>>> + else if (util_cpu_caps.has_sse3)
>>>>>> + MAttrs.push_back("+sse3");
>>>>>> + else if (util_cpu_caps.has_sse2)
>>>>>> + MAttrs.push_back("+sse2");
>>>>>> + else if (util_cpu_caps.has_sse)
>>>>>> + MAttrs.push_back("+sse");
>>>>>> + else if (util_cpu_caps.has_mmx)
>>>>>> + MAttrs.push_back("+sse");
>>>>>> +
>>>>>> + if (util_cpu_caps.has_3dnow_ext)
>>>>>> + MAttrs.push_back("+3dnowa");
>>>>>> + else if (util_cpu_caps.has_3dnow)
>>>>>> + MAttrs.push_back("+3dnow");
>>>>>> +
>>>>>> StringRef MCPU = llvm::sys::getHostCPUName();
>>>>>> /*
>>>>>> * The cpu bits are no longer set automatically, so need to
>>>>>> set mcpu manually.
>>>>>> @@ -498,6 +526,7 @@
>>>>>> lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef
>>>>>> *OutJIT,
>>>>>> */
>>>>>> builder.setMCPU(MCPU);
>>>>>> #endif
>>>>>> + builder.setMAttrs(MAttrs);
>>>>>>
>>>>>> ShaderMemoryManager *MM = new ShaderMemoryManager();
>>>>>> *OutCode = MM->getGeneratedCode();
>>>>>>
>>>>> I'm not sure if that's really the right idea, it also misses half the
>>>>> potential features (avx, f16c, ...).
>>>>> Maybe using llvm::sys::getHostCPUFeatures() and passing that to mattrs
>>>>> instead of setting the host cpu name would work?
>>>>>
>>>> I was unaware that getHostCPUFeatures existed, but I think passing
>>>> both getHostCPUName and getHostCPUFeatures would work.
>>>> Unfortunately it seems getHostCPUFeatures only exists on arm right
>>>> now, but that should be fixable by re-introducing the cpuid stuff in
>>>> llvm..
>> So llvm does no longer have any ability to determine the features on its
>> own at all? All it can is deduce that from the cpu string (which is
>> quite fishy, IIRC it actually at some point used a "reduced" cpu string
>> in case of avx not being available despite the cpu supporting it)? What
>> a PITA. When this came up last time I was told one should use
>> llvm::sys::getHostCPUName() and llvm::sys::getHostCPUFeatures() to
>> determine what llvm thinks it can do, and that was only just back in
>> April
>> (https://urldefense.proofpoint.com/v1/url?u=http://llvm.org/bugs/show_bug.cgi?id%3D19429&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=NMr9uy2iTjWVixC0wOcYCWEIYhfo80qKwRgdodpoDzA%3D%0A&m=R4H96AFXJHWiMF4WgsKxMokaANWv60welwfnJM53KjQ%3D%0A&s=54e942b2ecaf682213b54be9706a4c7e30d5986f4aaf51008822f47e7df72e6c).
>>
>> From a quick look though it seems you are right and
>> llvm::sys::getHostCPUFeatures() only ever does anything on arm (that's
>> what I call inconsistent design...).
>
> I think that we should avoid relying on cpu name for anything, because
> one can easily run into wierd or unexpected behavior when running old
> LLVM code on new intel CPU architectures.
Well I believe getHostCPUName() actually returned a string always
understood by llvm, based on the actual features. At some point at
least... But it seems this is indeed not really all that useful any
more. That said, cpu name isn't just for features, llvm will do some
other choices based on it, so we probably shouldn't avoid it. Though I
don't know if it's possible to get some string via that mechanism
nowadays which it won't understand then at all...
I was actually missing that at least avx and f16c are still set with
this patch. So this is ok, though I'd still say we should emit the
remaining features too (popcnt, avx2, xop at least).
As for mmx I'm not sure it really makes a difference. Because these
extensions were enabled (at least with some llvm versions) in any case,
even if we didn't explicitly enable them. I think though you are right
there's no point of explicitly enabling them since we don't really want
llvm to use them in any case.
Roland
More information about the mesa-dev
mailing list