[PATCH] use magic wl_fixed_t to/from double only on x86-64

Wed May 16 06:09:35 PDT 2012

> on x86:
> benchmarked magic:      14.048889885s
> benchmarked div:        5.426952392s
> benchmarked mul:        4.034106976s
>
> on x86-64:
> benchmarked magic:      2.467789582s
> benchmarked div:        9.748067755s
> benchmarked mul:        8.665307997s
>
> Did you compile your 32-bit code with -mfpmath=sse? If not, could you try and
> post the results again? I'd be quite surprised if it turned out that the x87
> operations are faster than the SSE ones, but that's what your numbers show.
It was compiled with the following flags:
on x86: -march=i686 -mtune=generic -O2
on x86-64: -march=x86-64 -mtune=generic -O2

As you asked, here is the benchmark on 32-bit with -march=native -mfpmath=sse -O2:

benchmarked magic:      16.204160542s
benchmarked div:        9.719736771s
benchmarked mul:        8.638401181s

The slow SSE math is probably gcc's fault. With clang on 32-bit I got
these numbers:

benchmarked magic:      19.441825239s
benchmarked div:        5.493691053s
benchmarked mul:        3.238189342s

But at first glance the code does not contain any x87 opcodes (clang
doesn't understand -mfpmath=sse).
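
For context, here is a minimal sketch of what the three benchmarked
variants presumably look like. It assumes wl_fixed_t is a signed 24.8
fixed-point value held in a 32-bit integer. The names fixed_t and
fixed_*_magic/div/mul are hypothetical and chosen for illustration only.
This is not the benchmark source, and the per-architecture selection at
the end is just one way to express what the subject line describes.

```c
#include <stdint.h>

/* Assumption: signed 24.8 fixed point stored in a 32-bit integer. */
typedef int32_t fixed_t;

/*
 * "magic": build the bit pattern of the double 3 * 2^43 + f / 256 by
 * placing f in the low mantissa bits, then subtract the bias again.
 * At that magnitude one mantissa step is exactly 1/256.
 */
static inline double fixed_to_double_magic(fixed_t f)
{
	union { double d; int64_t i; } u;

	u.i = ((1023LL + 44LL) << 52) + (1LL << 51) + f;
	return u.d - (double)(3LL << 43);
}

/*
 * Inverse trick: adding 3 * 2^43 rounds d to a multiple of 1/256 and
 * leaves the result in the low mantissa bits. The narrowing cast keeps
 * the low 32 bits on gcc and clang (implementation-defined in ISO C).
 */
static inline fixed_t fixed_from_double_magic(double d)
{
	union { double d; int64_t i; } u;

	u.d = d + (double)(3LL << 43);
	return (fixed_t)u.i;
}

/* "div": plain floating-point division. */
static inline double fixed_to_double_div(fixed_t f)
{
	return f / 256.0;
}

/* "mul": multiply by the exactly representable reciprocal. */
static inline double fixed_to_double_mul(fixed_t f)
{
	return f * (1.0 / 256.0);
}

/* Plain conversion back; truncates toward zero instead of rounding. */
static inline fixed_t fixed_from_double_mul(double d)
{
	return (fixed_t)(d * 256.0);
}

/* Hypothetical selection: magic only where it measured faster. */
#if defined(__x86_64__)
#define fixed_to_double   fixed_to_double_magic
#define fixed_from_double fixed_from_double_magic
#else
#define fixed_to_double   fixed_to_double_mul
#define fixed_from_double fixed_from_double_mul
#endif
```

Note that the magic and plain conversions from double differ for inputs
that are not a multiple of 1/256: the magic version rounds according to
the current floating-point rounding mode, while the cast truncates
toward zero.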