[Mesa-dev] [PATCH 1/5] configure.ac: Detect if running on POWER8 arch

Oded Gabbay oded.gabbay at gmail.com
Sun Jan 3 10:20:30 PST 2016


On Sun, Jan 3, 2016 at 8:14 PM, Roland Scheidegger <sroland at vmware.com> wrote:
> Am 03.01.2016 um 13:50 schrieb Oded Gabbay:
>> On Thu, Dec 31, 2015 at 8:48 PM, Roland Scheidegger <sroland at vmware.com> wrote:
>>> Am 31.12.2015 um 10:30 schrieb Oded Gabbay:
>>>> On Wed, Dec 30, 2015 at 5:41 PM, Roland Scheidegger <sroland at vmware.com> wrote:
>>>>> Am 30.12.2015 um 10:46 schrieb Oded Gabbay:
>>>>>> On Wed, Dec 30, 2015 at 1:11 AM, Roland Scheidegger <sroland at vmware.com> wrote:
>>>>>>>
>>>>>>> So, if I see that right, you will automatically generate binaries using
>>>>>>> power8 instructions if compiled on power8 capable box, which then won't
>>>>>>> run on boxes not supporting power8? Is that really what you want?
>>>>>>> Maybe some runtime detection would be a good idea (though I don't know
>>>>>>> if anyone cares about power7)?
>>>>>>
>>>>>> The problem is I don't think I can eliminate the build time check
>>>>>> (although I would very much like to) because I need:
>>>>>> 1. To pass a special flag to the GCC compiler: -mpower8-vector
>>>>>> 2. To define _ARCH_PWR8 so GCC will include the newer intrinsic
>>>>>>
>>>>>> Without those two things, I won't be able to use vec_vgbbd which I
>>>>>> need to implement the _mm_movemask_epi8 efficiently, and without that,
>>>>>> all this patch series can be thrown out the window. The emulation of
>>>>>> _mm_movemask_epi8 using regular instructions is just horrible.
>>>>>>
>>>>>> You are correct that once you build a binary with this flag on power8
>>>>>> machine, that binary won't run on power7 machine. You get "cannot
>>>>>> execute binary file"
>>>>>>
>>>>>> Unfortunately, I don't see a way around this because even if I
>>>>>> condition the use of vec_vgbbd on a runtime check/define, the library
>>>>>> still won't be executable because it was built with -mpower8-vector.
>>>>>>
>>>>>> Having said that, because I *assume* IBM right now mostly cares about
>>>>>> Linux running on POWER8 with little-endian, I think it is a fair
>>>>>> compromise.
>>>>>
>>>>> Note I don't have anything against a build time check. My concern here
>>>>> is something along the lines of unsuspecting distros shipping binaries
>>>>> which won't work, as it looks to me like this will get picked up
>>>>> automatically. That is different to how for instance sse41 is handled.
>>>>> That is I believe this should only get enabled if someone has specified
>>>>> some -mcpu=power8 or whatever flag explicitly somewhere already.
>>>>>
>>>>> Roland
>>>>
>>>> I understand and I share your concern. Maybe we should add
>>>> "--disable-pwr8-inst" to mesa's configure ? if that flag is given to
>>>> configure, it would disable the optimization code (won't add
>>>> _ARCH_PWR8 to defines and won't add -mpower8-vector to gcc flags).
>>>>
>>>> What do you think ?
>>> If the generated code with all automatically picked up compile options
>>> really doesn't run on power7 just because of this, I think it would be
>>> nicer if this were an explicit enable.
>>>
>>> Roland
>>>
>>
>> So the problem is a bit worse then that and requires a harsher solution.
>> Apparently, when that compiler flag (power8-vector) is given to GCC,
>> GCC uses POWER8-only instructions in other places as well! What I have
>> seen so far, is that it uses such instructions in the implementation
>> of exp2() and/or log2() (in f.cpp) and also saw it in
>> __glXInitializeVisualConfigFromTags(). The instructions used are not
>> vector instructions, but floating point instructions, which were added
>> only in PowerISA 2.07
>>
>> Therefore, I think that for now, I will limit the entire optimization
>> code to POWER8 *and* Little-Endian. Because ppc64le packages can
>> *only* run on POWER8 systems, and because you can't transfer binaries
>> between LE and BE machines, this workaround eliminates the danger of
>> crashing on "illegal instruction". In addition, there is no more need
>> for runtime checks.
>>
>> I hope you agree that with this change, it is better to enable the
>> power8-vector by default when building on POWER8 machine installed
>> with Linux LE. For all other archs it will be disabled by default.
>
> Yes, that looks reasonable.

Thanks.

>
>> I will try to contact IBM GCC devs to see how we can overcome this
>> problem (or if they even care) so I can expand these optimizations to
>> BE as well.
> IIRC this is problematic for things like sse41 etc. as well, it is just
> how gcc works. I think the typical workaround is to move code using
> intrinsics which need special compile flags to their own file (or rather
> compile unit), which you then can compile with those flags. (This
> implies of course separate functions, that is you can't have any
> run-time check plus the assembly in the same file or function.)
> This is how mesa handles _mesa_streaming_load_memcpy for the i965 driver.
>
> Roland
>
Yeah, I imagined as much, but I didn't know this technique was already
in use in mesa.

I hate this fragmentation of code, and I think a better solution, at
the compiler level, is to have some kind of flag which tells the
compiler he can accept power8/sse41 inline assembly and/or intrinsics
but can't generate those on his own (I'm not a compiler expert but I
think that's doable).

As I said, I'll try to dig around and see what can be done.

Oded

>
>>
>> I will send the revised patches shortly.
>>
>>    Oded
>>
>>>
>>>
>>>
>>>>
>>>> Oded
>>>>
>>>>>
>>>>>>
>>>>>> Oded
>>>>>>
>>>>>>> So far we didn't bother with that for SSE
>>>>>>> but it has to be said SSE2 is a really low bar (and the manual assembly
>>>>>>> stuff doesn't use anything more advanced, even though clearly things
>>>>>>> like the emulated mm_mullo_epi32 are suboptimal if your cpu supports
>>>>>>> sse41). And even then on non-x86 you actually might not get
>>>>>>> PIPE_ARCH_SSE if you didn't set gcc's compile flags accordingly.
>>>>>>>
>>>>>>> Roland
>>>>>>>
>>>>>>>
>>>>>>> Am 29.12.2015 um 17:12 schrieb Oded Gabbay:
>>>>>>>> To determine if we could use special POWER8 assembly directives, we first
>>>>>>>> need to detect whether we are running on POWER8 architecture. This patch
>>>>>>>> adds this detection to configure.ac and adds the necessary compilation
>>>>>>>> flags accordingly.
>>>>>>>>
>>>>>>>> Signed-off-by: Oded Gabbay <oded.gabbay at gmail.com>
>>>>>>>> ---
>>>>>>>>  configure.ac | 30 ++++++++++++++++++++++++++++++
>>>>>>>>  1 file changed, 30 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/configure.ac b/configure.ac
>>>>>>>> index f8a70be..1acd47e 100644
>>>>>>>> --- a/configure.ac
>>>>>>>> +++ b/configure.ac
>>>>>>>> @@ -396,6 +396,36 @@ fi
>>>>>>>>  AM_CONDITIONAL([SSE41_SUPPORTED], [test x$SSE41_SUPPORTED = x1])
>>>>>>>>  AC_SUBST([SSE41_CFLAGS], $SSE41_CFLAGS)
>>>>>>>>
>>>>>>>> +dnl Check for POWER8 Architecture
>>>>>>>> +PWR8_CFLAGS="-mpower8-vector"
>>>>>>>> +have_pwr8_intrinsics=no
>>>>>>>> +AC_MSG_CHECKING(whether we are running on POWER8 Architecture)
>>>>>>>> +save_CFLAGS=$CFLAGS
>>>>>>>> +CFLAGS="$PWR8_CFLAGS $CFLAGS"
>>>>>>>> +AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
>>>>>>>> +#if defined(__GNUC__) && (__GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8))
>>>>>>>> +#error "Need GCC >= 4.8 for sane POWER8 support"
>>>>>>>> +#endif
>>>>>>>> +#include <altivec.h>
>>>>>>>> +int main () {
>>>>>>>> +    vector unsigned char r;
>>>>>>>> +    vector unsigned int v = vec_splat_u32 (1);
>>>>>>>> +    r = __builtin_vec_vgbbd ((vector unsigned char) v);
>>>>>>>> +    return 0;
>>>>>>>> +}]])], have_pwr8_intrinsics=yes)
>>>>>>>> +CFLAGS=$save_CFLAGS
>>>>>>>> +
>>>>>>>> +if test $have_pwr8_intrinsics = yes ; then
>>>>>>>> +   DEFINES="$DEFINES -D_ARCH_PWR8"
>>>>>>>> +   CXXFLAGS="$CXXFLAGS $PWR8_CFLAGS"
>>>>>>>> +   CFLAGS="$CFLAGS $PWR8_CFLAGS"
>>>>>>>> +else
>>>>>>>> +   PWR8_CFLAGS=
>>>>>>>> +fi
>>>>>>>> +
>>>>>>>> +AC_MSG_RESULT($have_pwr8_intrinsics)
>>>>>>>> +AC_SUBST([PWR8_CFLAGS], $PWR8_CFLAGS)
>>>>>>>> +
>>>>>>>>  dnl Can't have static and shared libraries, default to static if user
>>>>>>>>  dnl explicitly requested. If both disabled, set to static since shared
>>>>>>>>  dnl was explicitly requested.
>>>>>>>>
>>>>>>>
>>>>>
>>>
>


More information about the mesa-dev mailing list