[Pixman] FPU-based implementation of the core pixel pipeline

Tue Oct 5 10:44:39 PDT 2010

If you are using lookup tables to convert floating point to integers, I 
have found that you can index a much smaller table with the upper bits 
of the float and use the lower bits to linearly interpolate between 
adjacent entries.
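
A rough sketch of what I mean, not pixman code. The names, the 16-bit
split and the 16-bit output are all assumptions picked for illustration.
It indexes on the top 16 bits of the IEEE-754 bit pattern and
interpolates with the low 16 bits:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SHIFT      16                          /* low bits used for interpolation (assumed) */
#define ONE_BITS   0x3F800000u                 /* bit pattern of 1.0f */
#define TABLE_SIZE ((ONE_BITS >> SHIFT) + 1u)  /* covers 0.0 .. 1.0 inclusive */

static uint16_t table[TABLE_SIZE];

static float bits_to_float(uint32_t b) { float f; memcpy(&f, &b, sizeof f); return f; }
static uint32_t float_to_bits(float f) { uint32_t b; memcpy(&b, &f, sizeof b); return b; }

/* Each entry holds the 16-bit result for the float whose low bits are zero. */
static void init_table(void)
{
    for (uint32_t i = 0; i < TABLE_SIZE; i++)
        table[i] = (uint16_t)(bits_to_float(i << SHIFT) * 65535.0f + 0.5f);
}

static uint16_t float_to_u16(float f)
{
    uint32_t b = float_to_bits(f);

    if (b & 0x80000000u) return 0;       /* sign bit set: negative values and -0.0 */
    if (b >= ONE_BITS)   return 65535;   /* 1.0 and above, +inf, positive NaN */

    uint32_t i    = b >> SHIFT;          /* upper bits select the entry */
    uint32_t frac = b & 0xFFFFu;         /* lower bits are the interpolation weight */
    assert(i + 1 < TABLE_SIZE);

    uint32_t lo = table[i], hi = table[i + 1];
    /* Rounded linear interpolation between the two neighbouring entries. */
    return (uint16_t)(lo + (((hi - lo) * frac + 0x8000u) >> 16));
}
```

Call init_table() once before the first float_to_u16(). Within one 
exponent the float value is linear in its bit pattern, and exponent 
changes always fall on entry boundaries, so the interpolation costs 
very little accuracy.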

You can also eliminate all the negative numbers, all numbers greater 
than 1 and all NaNs, making the table 1/4 the size.

Eliminating all "too small" numbers and treating them as zero can halve 
the table.
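
To put numbers on that, here is a sketch assuming a table indexed by the
top 16 bits of the float. The 2^-64 cutoff and every name below are
hypothetical, and where NaNs end up is an arbitrary choice:

```c
#include <stdint.h>
#include <string.h>

#define SHIFT     16
#define ONE_BITS  0x3F800000u   /* bit pattern of 1.0f */
#define MIN_BITS  0x1F800000u   /* bit pattern of 2^-64, an assumed "too small" cutoff */

enum {
    FULL_ENTRIES = 1 << 16,                          /* every 16-bit prefix: 65536 */
    UNIT_ENTRIES = ONE_BITS >> SHIFT,                /* only 0 <= f < 1: 16256, about 1/4 */
    TRIM_ENTRIES = (ONE_BITS - MIN_BITS) >> SHIFT    /* also drop f < 2^-64: 8192 */
};

enum { SLOT_ZERO = -1, SLOT_ONE = -2 };  /* results that need no table entry */

/* Map a float to a slot in the trimmed table, or to a fixed result. */
static int table_slot(float f)
{
    uint32_t b;
    memcpy(&b, &f, sizeof b);

    if (b & 0x80000000u) return SLOT_ZERO;  /* negatives, -0.0, NaNs with the sign bit set */
    if (b >= ONE_BITS)   return SLOT_ONE;   /* 1.0 and above, +inf, remaining NaNs */
    if (b < MIN_BITS)    return SLOT_ZERO;  /* too small to matter: treat as zero */
    return (int)((b - MIN_BITS) >> SHIFT);
}
```

The three range checks are plain integer compares on the bit pattern, 
so no FP compare is needed before the lookup.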

You should be able to use the same table for every result bit depth. You 
actually want those extra bits because fp gradients will look a *lot* 
better if you can do some sort of error diffusion.
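
The simplest form I can think of is a 1-D diffusion along the scanline.
This is only a sketch under my own assumptions (16-bit intermediate
values, 8-bit destination, hypothetical function name), not a claim
about what pixman should do:

```c
#include <stddef.h>
#include <stdint.h>

/* Reduce 16-bit channel values to 8 bits, carrying each pixel's
 * rounding error into the next pixel of the row. */
static void quantize_row(const uint16_t *src, uint8_t *dst, size_t n)
{
    int32_t err = 0;                     /* error carried from the previous pixel */

    for (size_t i = 0; i < n; i++) {
        int32_t v = (int32_t)src[i] + err;
        int32_t q = (v + 128) / 257;     /* 65535 / 255 == 257, round to nearest */

        if (q < 0)   q = 0;              /* defensive clamp */
        if (q > 255) q = 255;

        dst[i] = (uint8_t)q;
        err = v - q * 257;               /* what this pixel failed to represent */
    }
}
```

A flat run of 16-bit value 128 (about half an 8-bit step) comes out as 
alternating 0 and 1 instead of all zeros, so the average survives.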

I am unsure if this is faster if you are just doing a linear conversion, 
but a big win of a lookup table is that you can also do sRGB conversion 
at the same time.

Jonathan Morton wrote:
> On Tue, 2010-10-05 at 09:38 +0200, Soeren Sandmann wrote:
>> How much of a performance improvement is that table really over just
>> using regular conversions?
> 
> It's substantial on at least PowerPC and ARM using GCC, for differing
> reasons.  I can't remember whether I saw much of a difference on x86 as
> well.  Generally, moving things between integer and FP worlds is not
> well optimised at the hardware level.
> 
> It strikes me that initialising only the channel bit-depths that are
> actually used, rather than the whole table, would be a substantial
> saving.  The largest depth in the predefined-formats table (10) would
> consume 4KB, the much more common 8bpc table takes just 1KB.
> 
> So when I get some time, I'll have another look at this.
>
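
The sizes quoted above are consistent with one 4-byte float per possible
channel value. A sketch of per-depth lazy initialisation under that
assumption (all names are hypothetical, and it is not thread-safe as
written):

```c
#include <stdint.h>
#include <stdlib.h>

#define MAX_DEPTH 16

static float *unorm_to_float[MAX_DEPTH + 1];   /* NULL until a depth is first used */

/* One float per representable channel value at this depth. */
static size_t table_bytes(int depth)
{
    return ((size_t)1 << depth) * sizeof(float);
}

/* Build the table for one channel depth on first request only. */
static const float *get_table(int depth)
{
    if (depth < 1 || depth > MAX_DEPTH)
        return NULL;

    if (!unorm_to_float[depth]) {
        uint32_t n = 1u << depth;
        float *t = malloc(table_bytes(depth));
        if (!t)
            return NULL;
        for (uint32_t i = 0; i < n; i++)
            t[i] = (float)i / (float)(n - 1);   /* map 0..n-1 onto 0.0..1.0 */
        unorm_to_float[depth] = t;
    }
    return unorm_to_float[depth];
}
```

With 4-byte floats, table_bytes(8) is 1024 and table_bytes(10) is 4096.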