[Mesa-dev] Optimize logbase2() function

Matt Turner mattst88 at gmail.com
Fri Jun 3 09:30:08 PDT 2011


On Fri, Jun 3, 2011 at 10:44 AM, Roland Scheidegger <sroland at vmware.com> wrote:
> Am 02.06.2011 14:43, schrieb Benjamin Bellec:
>> Hello,
>>
>> I performed several tests of the logbase2() function.
>> This function is defined and used in these files:
>
> btw you could probably make it faster if you'd just use the x86 BSR
> instruction - at least newer intel cpus handle that with a throughput of
> 1 per clock... (though you'd need special case for 0 since it's
> undefined otherwise).
> I don't think there's any portable way to take advantage of that
> instruction however.
> It shouldn't be in a performance critical path however, so any decent
> portable implementation should do (FWIW you could replace the +=
> with |= but for newer cpus it most likely doesn't make a difference).
>
> Roland

With gcc you can do

1 << (32 - __builtin_clz(n - 1))

which will use BSR on x86 and the equivalent instruction on other
architectures. I'd suppose using this would give gcc better semantic
information even on architectures that don't have a single instruction
for this.
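
For reference, here is a minimal sketch of how the builtin maps onto the two related quantities. The helper names floor_log2_u32 and next_pow2_u32 are hypothetical and not taken from Mesa. Note that the expression above, as written, yields the smallest power of two >= n rather than the logarithm itself; the floor of log2(n) would be 31 - __builtin_clz(n). The sketch assumes a 32-bit unsigned int and uses an unsigned 1 for the shift so that a result of 2^31 does not overflow a signed int.

```c
#include <stdint.h>

/* Hypothetical helper: floor(log2(n)) for n >= 1.
 * __builtin_clz counts leading zero bits of an unsigned int;
 * its result is undefined when the argument is 0. */
static inline unsigned floor_log2_u32(uint32_t n)
{
    return 31u - (unsigned)__builtin_clz(n);
}

/* The expression from the message, with an unsigned shift:
 * smallest power of two >= n. Only valid for 2 <= n <= 2^31,
 * because n == 1 would pass 0 to __builtin_clz. */
static inline uint32_t next_pow2_u32(uint32_t n)
{
    return UINT32_C(1) << (32 - __builtin_clz(n - 1));
}
```

For example, floor_log2_u32(1000) is 9, while next_pow2_u32(1000) is 1024.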

The only thing you have to worry about is, as you say, that BSR
doesn't have a defined result for 0, but that's probably not valid
input anyway.
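
If 0 can reach the function after all, a guard is cheap. The following is only a sketch under assumptions: the name logbase2_guarded is hypothetical, returning 0 for an input of 0 is an arbitrary convention chosen here, and the non-gcc branch is a plain shift loop rather than anything from the Mesa sources.

```c
#include <stdint.h>

/* Hypothetical wrapper: floor(log2(n)), with n == 0 mapped to 0.
 * The zero case is an assumed convention; BSR and __builtin_clz
 * both leave the result undefined for a zero input. */
static inline unsigned logbase2_guarded(uint32_t n)
{
    if (n == 0)
        return 0;
#if defined(__GNUC__)
    /* Index of the highest set bit, via count-leading-zeros. */
    return 31u - (unsigned)__builtin_clz(n);
#else
    /* Portable fallback: count how many times n can be halved. */
    unsigned log = 0;
    while (n >>= 1)
        log++;
    return log;
#endif
}
```

Callers that never pass 0 pay one predictable branch for the check.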

Matt

