[PATCH xf86-video-ati] Replace loop with clz to calculate log base 2 on non-x86 platforms in radeon.h

Michel Dänzer michel at daenzer.net
Tue Nov 29 07:32:54 UTC 2016


On 29/11/16 03:18 AM, Jochen Rollwagen wrote:
> This commit replaces the loop that calculates log base 2 on non-x86
> platforms in radeon.h with a clz (count leading zeros) based version,
> which simplifies the code and eliminates the loop.
> Note: there is no check for the val=0 case. The x86 bsr instruction is
> undefined for that input too, so that should be okay.
> ---
>  src/radeon.h |    7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/src/radeon.h b/src/radeon.h
> index cbc7866..b1a1ce0 100644
> --- a/src/radeon.h
> +++ b/src/radeon.h
> @@ -933,17 +933,16 @@ enum {
>  static __inline__ int
>  RADEONLog2(int val)
>  {
> -    int bits;
>  #if (defined __i386__ || defined __x86_64__) && (defined __GNUC__)
> +    int bits;
> +
>      __asm volatile("bsrl    %1, %0"
>          : "=r" (bits)
>          : "c" (val)
>      );
>      return bits;
>  #else
> -    for (bits = 0; val != 0; val >>= 1, ++bits)
> -        ;
> -    return bits - 1;
> +    return (31 - __builtin_clz(val));
>  #endif
>  }

Is there any reason not to use __builtin_clz on x86 as well? AFAICT both
gcc and clang generate more or less the same code for it as for the
inline assembly.
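
For illustration, a minimal sketch of what a single code path for all
architectures might look like. This is not part of the posted patch; it
assumes a GCC- or Clang-compatible compiler and a 32-bit int, and the
cast to unsigned is an addition made here because __builtin_clz takes an
unsigned int.

```c
/*
 * Hypothetical unified variant: no architecture-specific branch.
 * Assumes GCC or Clang (for __builtin_clz) and a 32-bit int.
 * As in the patch, val == 0 is not handled: __builtin_clz(0) is
 * undefined, just like the result of bsr for a zero input.
 */
static inline int
RADEONLog2(int val)
{
    /* Index of the highest set bit, i.e. floor(log2(val)) for val > 0. */
    return 31 - __builtin_clz((unsigned int)val);
}
```

With this version RADEONLog2(1) yields 0, RADEONLog2(64) yields 6 and
RADEONLog2(1000) yields 9, matching what the removed loop returned for
positive inputs.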


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

