[PATCH xf86-video-ati] Replace loop with clz to calculate log base 2 on non-x86 platforms in radeon.h
Michel Dänzer
michel at daenzer.net
Tue Nov 29 07:32:54 UTC 2016
On 29/11/16 03:18 AM, Jochen Rollwagen wrote:
> This commit replaces the loop for calculating log base 2 on
> non-x86 platforms in radeon.h with a clz (count leading zeros) based
> version to simplify the code and, well, eliminate the loop.
> Note: There is no check for the val=0 case. The result of x86 bsr is
> undefined for that case too, so that should be okay.
> ---
> src/radeon.h | 7 +++----
> 1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/src/radeon.h b/src/radeon.h
> index cbc7866..b1a1ce0 100644
> --- a/src/radeon.h
> +++ b/src/radeon.h
> @@ -933,17 +933,16 @@ enum {
>  static __inline__ int
>  RADEONLog2(int val)
>  {
> -    int bits;
>  #if (defined __i386__ || defined __x86_64__) && (defined __GNUC__)
> +    int bits;
> +
>      __asm volatile("bsrl %1, %0"
>                     : "=r" (bits)
>                     : "c" (val)
>                     );
>      return bits;
>  #else
> -    for (bits = 0; val != 0; val >>= 1, ++bits)
> -        ;
> -    return bits - 1;
> +    return (31 - __builtin_clz(val));
>  #endif
>  }
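
For reference, the old loop and the new expression can be compared
directly for positive inputs. The following is only a sketch: the names
log2_loop, log2_clz and check_log2_equivalence are made up for
illustration, and it assumes GCC or clang with a 32-bit int. It does not
call either version of the clz path with 0, because __builtin_clz(0) is
undefined, whereas the old loop returned -1 for that input.

```c
#include <assert.h>

/* Old non-x86 path: count how many shifts it takes to reach zero. */
static inline int log2_loop(int val)
{
    int bits;

    for (bits = 0; val != 0; val >>= 1, ++bits)
        ;
    return bits - 1;
}

/* New path from the patch; assumes a 32-bit int and val > 0. */
static inline int log2_clz(int val)
{
    return 31 - __builtin_clz(val);
}

/* Compare both versions on each power of two and its neighbours. */
static void check_log2_equivalence(void)
{
    int shift;

    for (shift = 0; shift < 31; ++shift) {
        int p = 1 << shift;

        assert(log2_loop(p) == log2_clz(p));
        assert(log2_clz(p) == shift);
        if (p > 1) /* skip p - 1 == 0, undefined for clz */
            assert(log2_loop(p - 1) == log2_clz(p - 1));
        assert(log2_loop(p + 1) == log2_clz(p + 1));
    }
}
```
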
Is there any reason not to use __builtin_clz on x86 as well? AFAICT both
gcc and clang generate more or less the same code for it as for the
inline assembly.
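
To illustrate, a single-path version without the #if might look roughly
like the sketch below. This is not the committed code: the name
radeon_log2_sketch is hypothetical, the assert is an addition that the
patch deliberately leaves out, and the bit width is derived from
sizeof/CHAR_BIT instead of the hardcoded 31. It assumes a compiler that
provides __builtin_clz (gcc or clang).

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical single-path variant: use the builtin on every
 * architecture instead of special-casing x86 with inline assembly.
 */
static inline int radeon_log2_sketch(int val)
{
    /* Index of the top bit of an unsigned int; 31 when int is 32 bits. */
    const int top_bit = (int)(sizeof(unsigned int) * CHAR_BIT) - 1;

    /* __builtin_clz(0) is undefined, so document the precondition. */
    assert(val > 0);

    return top_bit - __builtin_clz((unsigned int)val);
}
```

Building with -DNDEBUG compiles the assert away, which leaves the same
unchecked behaviour as the patch.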
--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer