[Pixman] [PATCH 1/3] Add CLEAR and SRC linear interpolation operators
Søren Sandmann
sandmann at cs.au.dk
Tue Sep 27 03:51:51 PDT 2011
Chris Wilson <chris at chris-wilson.co.uk> writes:
> Cairo, for instance, has a subtly different interpretation of how to use
> the mask in combination with the Porter-Duff operators. In particular,
> it has the notion of a clip mask, for which pixman has no parallel.
A question I have is, how much of a speedup is this really?
As mentioned earlier,
http://lists.freedesktop.org/archives/cairo/2011-February/021686.html
I have some concerns about adding two special operators that operate in
a totally different way than all the other operators. There are various
places in pixman where we assume that source and mask are never used
independently, for example in the operator optimization table, and in
the general code, where this comment:
"If it doesn't matter what the source is, then it doesn't matter
what the mask is",
would no longer really be true if the LERP operators are added. Nothing
in the patches actually causes these places to malfunction, but they do
make the code base less regular and therefore more difficult to
maintain. If the speedup on cairo traces is big enough, maybe that is
enough to justify it though.
So basically, I'd like to see some performance measurements. With the
patches as posted, the C fast paths will be selected ahead of the
general path, which is not necessarily the fastest:
http://www.mail-archive.com/pixman@lists.freedesktop.org/msg00887.html
Interesting benchmarks include:
- Performance with just the C fast paths
- Performance with just the SSE2 combiner (a fixed one, see below)
- Performance with both
- How much faster the fastest of the above is than running without LERP
The SSE2 combiner looks to me like it is missing an expand_alpha(), so
if you are going to do these measurements, it would be useful to add
support for the LERP operators to blitters-test as a commit before
adding the fast paths, to verify that they actually work. It may be
useful to add them to some of the other tests as well.
I also suspect that the SSE2 combiner would benefit from the same
optimization that the C fast paths have, where the source/destination
are not read whenever the mask is fully transparent or opaque. There are
some inline functions in pixman-sse2.c (is_opaque() and
is_transparent()) that could be used for this.
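
To make the mask shortcut concrete, here is a rough scalar sketch. It is
not code from the patches or from pixman. It assumes that the SRC
interpolation operator computes dest = src * mask + dest * (1 - mask) per
channel, with a8r8g8b8 pixels and an a8 mask. The names mul_un8(),
lerp_pixel() and lerp_src_scanline() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply two 8-bit values treated as fractions of 255, rounding to nearest. */
static uint8_t mul_un8(uint8_t a, uint8_t b)
{
    uint32_t t = (uint32_t)a * b + 0x80;
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* Assumed operator definition: src * m + dst * (1 - m), channel by channel. */
static uint32_t lerp_pixel(uint32_t src, uint32_t dst, uint8_t m)
{
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint8_t s = (uint8_t)(src >> shift);
        uint8_t d = (uint8_t)(dst >> shift);
        uint32_t c = (uint32_t)mul_un8(s, m) + mul_un8(d, (uint8_t)(255 - m));
        out |= c << shift;
    }
    return out;
}

/* Hypothetical scanline loop showing the two shortcuts. */
static void lerp_src_scanline(uint32_t *dst, const uint32_t *src,
                              const uint8_t *mask, size_t width)
{
    assert(dst != NULL && src != NULL && mask != NULL);
    for (size_t i = 0; i < width; i++) {
        uint8_t m = mask[i];
        if (m == 0x00)
            continue;                /* transparent: src and dst are not touched */
        if (m == 0xff)
            dst[i] = src[i];         /* opaque: dst is written but never read */
        else
            dst[i] = lerp_pixel(src[i], dst[i], m);
    }
}
```

An SSE2 version would presumably make the same two checks on a group of
pixels at a time, which is where is_opaque() and is_transparent() come in.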
Soren