[Mesa-dev] [PATCH] Use signbit() in IS_NEGATIVE and DIFFERENT_SIGNS

Mon Sep 17 19:15:24 PDT 2012

Am 18.09.2012 01:34, schrieb Matt Turner:
> On Mon, Sep 17, 2012 at 1:20 PM, Jose Fonseca <jfonseca at vmware.com> wrote:
>>
>>
>> ----- Original Message -----
>>> signbit() appears to be available everywhere (even MSVC according to
>>> MSDN), so let's use it instead of open-coding some messy and
>>> confusing
>>> bit twiddling macros.
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54805
>>> Cc: Alan Coopersmith <alan.coopersmith at oracle.com>
>>> Suggested-by: Ian Romanick <ian.d.romanick at intel.com>
>>> ---
>>> I'd prefer to see if we can use this everywhere before we decide to
>>> keep
>>> the open-coded macros around for other platforms.
>>>
>>> Alan, does this work for you on Solaris?
>>>
>>>  configure.ac           |    7 +++++++
>>>  src/mesa/main/macros.h |   21 ++-------------------
>>>  2 files changed, 9 insertions(+), 19 deletions(-)
>>>
>>> diff --git a/configure.ac b/configure.ac
>>> index 4193496..f23bfb8 100644
>>> --- a/configure.ac
>>> +++ b/configure.ac
>>> @@ -499,6 +499,13 @@ AC_SUBST([DLOPEN_LIBS])
>>>  dnl See if posix_memalign is available
>>>  AC_CHECK_FUNC([posix_memalign], [DEFINES="$DEFINES
>>>  -DHAVE_POSIX_MEMALIGN"])
>>>
>>> +dnl signbit() is a macro in glibc's math.h, so AC_CHECK_FUNC fails.
>>> To handle
>>> +dnl this, use AC_CHECK_DECLS and fallback to AC_CHECK_FUNC in case
>>> it fails.
>>> +AC_CHECK_DECLS([signbit],[],
>>> +               AC_CHECK_FUNC([signbit],[],
>>> +                             AC_MSG_ERROR([could not find
>>> signbit()])),
>>> +               [#include <math.h>])
>>> +
>>>  dnl SELinux awareness.
>>>  AC_ARG_ENABLE([selinux],
>>>      [AS_HELP_STRING([--enable-selinux],
>>> diff --git a/src/mesa/main/macros.h b/src/mesa/main/macros.h
>>> index 04d59d7..7b7fd1b 100644
>>> --- a/src/mesa/main/macros.h
>>> +++ b/src/mesa/main/macros.h
>>> @@ -693,31 +693,14 @@ NORMALIZE_3FV(GLfloat v[3])
>>>  static inline GLboolean
>>>  IS_NEGATIVE(float x)
>>>  {
>>> -#if defined(USE_IEEE)
>>> -   fi_type fi;
>>> -   fi.f = x;
>>> -   return fi.i < 0;
>>> -#else
>>> -   return x < 0.0F;
>>> -#endif
>>> +   return signbit(x) != 0;
>>>  }
>>
>> I'd vote by simply using "x < 0.0F". A modern compiler should generate the best code for that instruction already.
> 
> So it turns out that comparing with < 0.0F is much worse than the
> current get-the-sign-bit and compare algorithm. I haven't pasted the
> code from it below because it's significantly longer.
> 
> Here is the assembly output of the other two (IEEE being the fi_type
> union hack, and SIGNBIT using signbit().
> 
> <IS_NEGATIVE_IEEE>:
>     movd   %xmm0,%eax
>     shr    $0x1f,%eax
> 
> <IS_NEGATIVE_SIGNBIT>:
>     pmovmskb %xmm0,%eax
>     shr    $0x3,%eax
>     and    $0x1,%eax
What optimization level was that?
I find it interesting that it used pmovmskb/shr/and. Would be much more
natural imho (and better) to juse use movmskps/and if it wants to
extract the sign bit "directly" (which would probably be roughly as good
as the union hack well there might always be some odd cpus out there
which don't have the same performance for pmovmskb and movmskps but
generally it should be the same).
The result might look less pretty though when compiling on plain 32-bit
x86 (that is, using fp stack) in any case.
Is there actually a platform where we wouldn't have USE_IEEE?

> 
> <DIFFERENT_SIGNS_IEEE>:
>     movd   %xmm0,%eax
>     movd   %xmm1,%edx
>     xor    %edx,%eax
>     shr    $0x1f,%eax
> 
> <DIFFERENT_SIGNS_SIGNBIT>:
>     pmovmskb %xmm0,%edx
>     pmovmskb %xmm1,%eax
>     and    $0x8,%edx
>     and    $0x8,%eax
>     cmp    %eax,%edx
>     setne  %al
> 
> So the fi_type union hack is actually better, although it is a
> strict-aliasing violation if we ever decide to care about that..
Yeah but
xor %xmm1, %xmm0
movmskps %xmm0, %eax
and $0x1, %eax
would be even better :-).
I think use of such unions though is pretty much guaranteed and
compilers will have to support this for forever.

> 
>> Furthermore signbit(-0.0f) != signbit(0.0f)
>>
>>>  /** Test two floats have opposite signs */
>>>  static inline GLboolean
>>>  DIFFERENT_SIGNS(GLfloat x, GLfloat y)
>>>  {
>>> -#if defined(USE_IEEE)
>>> -   fi_type xfi, yfi;
>>> -   xfi.f = x;
>>> -   yfi.f = y;
>>> -   return !!((xfi.i ^ yfi.i) & (1u << 31));
>>> -#else
>>> -   /* Could just use (x*y<0) except for the flatshading
>>> requirements.
>>> -    * Maybe there's a better way?
>>> -    */
>>> -   return ((x) * (y) <= 0.0F && (x) - (y) != 0.0F);
>>> -#endif
>>> +   return signbit(x) != signbit(y);
>>
>> This is not the same thing: DIFFERENT_SIGNS(-0.0f, 0.0f) was returning false, and now it will return TRUE.
>>
>> For me, replacing the current code with signbit seems a waste of time. I'd much prefer leave it alone (given it is known to work as is), or just use normal floating point operators.
> 
> It's not known to work as-is. See the bug report. Truthfully, I have
> not tested my patch on i915 (where the bug is relevant), so mine may
> be broken as well.
> 
> I understand Ian's concern given the subtle nature of this code.
> Brian's commit (c8a86f717 - converting from macros to inline
> functions) caused bug 54365, and then his fix (23cd6c43da) caused bug
> 54805. Neither commit looked like it should cause any problems.
> 
> I think we'd probably trade slightly fewer instructions for more
> understandable and maintainable code.

I agree whatever looks understandable and still generates at least
somewhat reasonable code is probably best.

Roland