[Pixman] [PATCH 0/4] Meet the FPU-based implementation of the core pixel pipeline

Thu Sep 23 14:45:21 PDT 2010

Jonathan Morton <jonathan.morton at movial.com> writes:

> Actually, I took the trouble to optimise some of the PDF operators
> while converting them, with the intended side-effect of also making
> them more readable.  While doing so, I realised that component-alpha
> implementation of even the HSL filters was quite reasonable - just
> calculate the resultant colour independently of the alpha, and mask
> off the result in the same way as the other PDF operators.  This
> neatly fills in a generality hole.

Right, leaving those combinations out was probably a mistake.

> > In any case, we'll almost certainly want to accelerate this pipeline
> > not only with NEON, but also SSE and AVX, so regardless of how it
> > eventually gets integrated, that's worth keeping in mind. To do this
> > properly, we'll need to solve the problem of how to install CPU
> > specific fetchers.
> 
> This may be a question of whether, on a specific CPU, using a
> vector-int-to-float conversion is faster than the three or four table
> lookups as implemented here.  A scalar int-to-float conversion is
> almost certainly slower.

I certainly hope that the solution in the patch we've seen so far is
not the final one, because it looks to me like it allocates half a
megabyte *per process*, just for conversion tables.

Even half a megabyte statically allocated in the binary is probably
too much.
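
For a sense of scale, here is a minimal sketch of the two approaches being compared: table lookups versus scalar int-to-float conversion for one a8r8g8b8 pixel. This is not the code from the patch. Every name in it (unorm8_to_float_table, fetch_a8r8g8b8_lut, fetch_a8r8g8b8_cvt, table_bytes) is made up for illustration, and it assumes a 4-byte float.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not the code from the patch. */

/* One float per possible 8-bit channel value: 256 entries. */
static float unorm8_to_float_table[256];

static void
init_unorm8_table (void)
{
    int i;

    for (i = 0; i < 256; i++)
        unorm8_to_float_table[i] = (float) i / 255.0f;
}

/* Table-based fetch: four lookups per pixel, no int-to-float conversion. */
static void
fetch_a8r8g8b8_lut (uint32_t p, float argb[4])
{
    argb[0] = unorm8_to_float_table[(p >> 24) & 0xff];
    argb[1] = unorm8_to_float_table[(p >> 16) & 0xff];
    argb[2] = unorm8_to_float_table[(p >> 8) & 0xff];
    argb[3] = unorm8_to_float_table[p & 0xff];
}

/* Direct fetch: four scalar int-to-float conversions and four multiplies. */
static void
fetch_a8r8g8b8_cvt (uint32_t p, float argb[4])
{
    const float scale = 1.0f / 255.0f;

    argb[0] = (float) ((p >> 24) & 0xff) * scale;
    argb[1] = (float) ((p >> 16) & 0xff) * scale;
    argb[2] = (float) ((p >> 8) & 0xff) * scale;
    argb[3] = (float) (p & 0xff) * scale;
}

/* Bytes needed for a table of floats indexed by an index_bits-wide value. */
static size_t
table_bytes (unsigned index_bits, unsigned floats_per_entry)
{
    assert (index_bits < 8 * sizeof (size_t));
    return ((size_t) 1 << index_bits) * floats_per_entry * sizeof (float);
}
```

A table indexed by a single 8-bit channel costs table_bytes (8, 1), which is 1 KiB with 4-byte floats, and the cost doubles with every extra index bit: a 16-bit index with one float per entry is already 256 KiB. I am not claiming the patch's tables have either of these shapes; the point is only that small per-channel tables are cheap and wide ones are not.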

Soren