[Pixman] [PATCH 0/4] Meet the FPU-based implementation of the core pixel pipeline

Tue Sep 21 10:51:32 PDT 2010

Dmitri Vorobiev <dmitri.vorobiev at movial.com> writes:

> > Recently, there was a debate about using floating-point hardware
> > for a Pixman implementation instead of the fixed point code. In
> > our company, we have developed an FPU-based implementation of the
> > core pixel pipeline, including support for all pixel formats and
> > combiners, and I am now working on the resulting code to make
> > it upstreamable. The patch series that follows is the first
> > outcome of this work.
> 
> I think it's worth mentioning explicitly that this patch series isn't
> the whole implementation, which we have here. I'm going to post more
> code in the coming days. However, following the "release early,
> release often"-motto, I decided to share now that part that is ready
> at this moment.

First, thanks for doing this work. I think a floating point pixel
pipeline is a very useful thing to have. I also appreciate the patches
going to the list early.

As a practical matter, it would be useful, at least for me, if you
could publish a git repository containing this work. Realistically,
this floating point pipeline likely won't be ready for 0.20, so some
rebasing will be necessary.

A couple of initial comments:

There is already a 64bit pipeline that uses a16r16g16b16 intermediate
pixels; it is used whenever the 10bpc formats are involved. However,
it is also somewhat neglected in that transformations and gradients
don't use it, and it is somewhat slow. If we are going to have a
floating point pipeline, then it's pretty tempting to get rid of the
64bit one and just use the floating point one instead.
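
To illustrate why one floating point intermediate could stand in for
both the 8 bit and the 10bpc cases, here is a rough sketch of widening
and narrowing pixels. The type and function names are made up for
this example and are not taken from the patch series or from pixman;
the a2r10g10b10 bit layout is assumed from the format name.

```c
#include <stdint.h>

/* Hypothetical intermediate pixel: one float per channel, nominally in [0, 1]. */
typedef struct { float a, r, g, b; } argb_float_t;

/* Widen an n_bits-wide channel value to a float in [0, 1]. */
static float
channel_to_float (uint32_t v, int n_bits)
{
    return (float) v / (float) ((1u << n_bits) - 1);
}

/* Narrow a float back to n_bits: clamp (NaN maps to 0), then round to nearest. */
static uint32_t
float_to_channel (float c, int n_bits)
{
    uint32_t max = (1u << n_bits) - 1;

    if (!(c > 0.0f))
        return 0;
    if (c >= 1.0f)
        return max;
    return (uint32_t) (c * (float) max + 0.5f);
}

/* 8 bits per channel, alpha in the top byte. */
static argb_float_t
expand_a8r8g8b8 (uint32_t p)
{
    argb_float_t f;

    f.a = channel_to_float ((p >> 24) & 0xff, 8);
    f.r = channel_to_float ((p >> 16) & 0xff, 8);
    f.g = channel_to_float ((p >>  8) & 0xff, 8);
    f.b = channel_to_float ( p        & 0xff, 8);
    return f;
}

/* 2 bit alpha in the top bits, then 10 bits each of r, g, b (assumed layout). */
static argb_float_t
expand_a2r10g10b10 (uint32_t p)
{
    argb_float_t f;

    f.a = channel_to_float ((p >> 30) & 0x3,   2);
    f.r = channel_to_float ((p >> 20) & 0x3ff, 10);
    f.g = channel_to_float ((p >> 10) & 0x3ff, 10);
    f.b = channel_to_float ( p        & 0x3ff, 10);
    return f;
}

static uint32_t
contract_a8r8g8b8 (argb_float_t f)
{
    return (float_to_channel (f.a, 8) << 24) |
           (float_to_channel (f.r, 8) << 16) |
           (float_to_channel (f.g, 8) <<  8) |
            float_to_channel (f.b, 8);
}
```

Both formats end up in the same argb_float_t, so the combiners
would only need to exist once; only the expand/contract steps are
format specific.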

You could argue that the general implementation should simply
fall back to the floating point one for formats that the general
one can't handle. Also, the more cracktastic operators, such as the
conjoint/disjoint ones, could be done in floating point, and the 32
bit versions could be deleted.
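
As an example of how compact such an operator becomes in floating
point, here is a sketch of DisjointOver on premultiplied pixels,
using the factors from the Render specification: Fa = 1 and
Fb = min (1, (1 - Aa) / Ab). The names are invented for illustration,
and treating Fb as 1 when the destination alpha is 0 is a choice made
here to avoid dividing by zero.

```c
/* Hypothetical premultiplied float pixel, channels nominally in [0, 1]. */
typedef struct { float a, r, g, b; } argb_float_t;

static float
clamp01 (float v)
{
    if (v < 0.0f)
        return 0.0f;
    if (v > 1.0f)
        return 1.0f;
    return v;
}

/* Fb for DisjointOver: min (1, (1 - Aa) / Ab), taken to be 1 when Ab is 0. */
static float
disjoint_over_fb (float sa, float da)
{
    if (da == 0.0f)
        return 1.0f;
    return clamp01 ((1.0f - sa) / da);
}

/* result = min (1, s * Fa + d * Fb), with Fa = 1, applied to every channel. */
static argb_float_t
combine_disjoint_over (argb_float_t s, argb_float_t d)
{
    float fb = disjoint_over_fb (s.a, d.a);
    argb_float_t out;

    out.a = clamp01 (s.a + d.a * fb);
    out.r = clamp01 (s.r + d.r * fb);
    out.g = clamp01 (s.g + d.g * fb);
    out.b = clamp01 (s.b + d.b * fb);
    return out;
}
```

There is no per-channel fixed point division or rounding to worry
about here, which is most of what makes the 32 bit versions of
these operators awkward.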

So basically, I think it would be interesting to think of making the
floating point pipe the new 'canonical' one, deleting the 64 bit
one, and considering the 32 bit one a 'fast path' that can be taken in
some cases. This is just what I happen to be thinking; please feel 

In any case, we'll almost certainly want to accelerate this pipeline
not only with NEON, but also SSE and AVX, so regardless of how it
eventually gets integrated, that's worth keeping in mind. To do this
properly, we'll need to solve the problem of how to install CPU
specific fetchers. 
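
One possible shape for this, purely as a sketch: keep a table of
fetchers ordered from most to least specialized and pick the first
one whose requirements the running CPU meets. The feature bits, the
table, and all function names below are hypothetical and do not
describe how pixman does or should do it; the "SSE2" fetcher is a
stand-in that just calls the generic one.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU feature bits; a real build would fill these in from
 * whatever runtime detection it already has. */
#define CPU_FEATURE_SSE2 (1u << 0)
#define CPU_FEATURE_NEON (1u << 1)

/* A fetcher reads `width` a8r8g8b8 pixels and writes four floats
 * (a, r, g, b) per pixel. */
typedef void (*fetch_func_t) (const uint32_t *src, float *dst, int width);

typedef struct
{
    unsigned     required_features;
    fetch_func_t fetch;
} fetcher_entry_t;

static void
fetch_a8r8g8b8_generic (const uint32_t *src, float *dst, int width)
{
    int i;

    for (i = 0; i < width; i++)
    {
        uint32_t p = src[i];

        dst[4 * i + 0] = (float) ((p >> 24) & 0xff) / 255.0f;
        dst[4 * i + 1] = (float) ((p >> 16) & 0xff) / 255.0f;
        dst[4 * i + 2] = (float) ((p >>  8) & 0xff) / 255.0f;
        dst[4 * i + 3] = (float) ( p        & 0xff) / 255.0f;
    }
}

/* Stand-in for a SIMD fetcher: same results as the generic one; a real
 * version would use intrinsics or assembly. */
static void
fetch_a8r8g8b8_sse2_stub (const uint32_t *src, float *dst, int width)
{
    fetch_a8r8g8b8_generic (src, dst, width);
}

/* Most specialized first; the last entry requires nothing and always matches. */
static const fetcher_entry_t fetchers[] =
{
    { CPU_FEATURE_SSE2, fetch_a8r8g8b8_sse2_stub },
    { 0,                fetch_a8r8g8b8_generic   },
};

static fetch_func_t
choose_fetcher (unsigned cpu_features)
{
    size_t i;

    for (i = 0; i < sizeof (fetchers) / sizeof (fetchers[0]); i++)
    {
        unsigned need = fetchers[i].required_features;

        if ((need & cpu_features) == need)
            return fetchers[i].fetch;
    }
    return NULL; /* not reached while the table ends with a zero-requirement entry */
}
```

The lookup would be done once, when the image or implementation is
set up, so the per-scanline cost is a single indirect call.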

I hope we can have some discussion on this subject. Don't take the
above as gospel; it's just what I happen to think at the moment.

(Also, a while back, I did some work on a floating point
pipeline:

        http://cgit.freedesktop.org/~sandmann/pixman/log/?h=floatpipe

It may or may not be useful.)

Thanks,
Soren