[Pixman] [PATCH 3/4] sse2: affine bilinear fetcher

Thu Jan 31 10:41:47 PST 2013

Siarhei Siamashka <siarhei.siamashka at gmail.com> writes:

> As for the affine transforms, they really depend on accessing memory
> in a cache-friendly way.

A simple experiment that could be done would be to just switch to a
tiled access pattern in pixman-general.c and see what the performance
impact of that would be.
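
As a rough sketch of what such an experiment could look like (this is
not the actual pixman-general.c code; affine_t, fetch_nearest_clamped
and composite_src_tiled are hypothetical names, and nearest-neighbour
sampling with edge clamping stands in for the real fetchers), the
destination is walked in small tiles so that the source pixels touched
by one tile stay close together in memory:

```c
#include <stdint.h>

/* Hypothetical 2x3 affine matrix mapping destination to source coords. */
typedef struct { double m[2][3]; } affine_t;

/* Nearest-neighbour fetch with edge clamping (stand-in for a real fetcher). */
static uint32_t
fetch_nearest_clamped (const uint32_t *src, int sw, int sh, int stride,
                       double x, double y)
{
    if (x < 0) x = 0;
    if (y < 0) y = 0;
    if (x > sw - 1) x = sw - 1;
    if (y > sh - 1) y = sh - 1;
    return src[(int) y * stride + (int) x];
}

/* SRC composite that visits the destination tile by tile
 * instead of one full scanline at a time. Strides are in pixels. */
static void
composite_src_tiled (uint32_t *dst, int dw, int dh, int dst_stride,
                     const uint32_t *src, int sw, int sh, int src_stride,
                     const affine_t *t, int tile_w, int tile_h)
{
    for (int ty = 0; ty < dh; ty += tile_h)
    {
        int y_end = (ty + tile_h < dh) ? ty + tile_h : dh;

        for (int tx = 0; tx < dw; tx += tile_w)
        {
            int x_end = (tx + tile_w < dw) ? tx + tile_w : dw;

            for (int y = ty; y < y_end; y++)
            {
                for (int x = tx; x < x_end; x++)
                {
                    /* Sample at the pixel centre. */
                    double cx = x + 0.5, cy = y + 0.5;
                    double sx = t->m[0][0] * cx + t->m[0][1] * cy + t->m[0][2];
                    double sy = t->m[1][0] * cx + t->m[1][1] * cy + t->m[1][2];

                    dst[y * dst_stride + x] =
                        fetch_nearest_clamped (src, sw, sh, src_stride, sx, sy);
                }
            }
        }
    }
}
```

With tile_w equal to the destination width and tile_h equal to 1 this
degenerates to the current scanline order, so the same loop can be used
to compare the two access patterns.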

> I would say that we probably need to generalize iterators to work with
> arbitrary rectangles in the destination space instead of the
> scanlines. The optimal width/height of these rectangles can be
> heuristically selected based on the transformation matrix and the
> sizes of L1/L2 caches and TLB. If the height of the rectangle is
> 1 pixel and the width is the same as the destination width, we get
> the current scanline-based behaviour. The code that would fit the
> rectangle-based model perfectly and benefit from it immediately is
> the fast paths for simple 90/180/270 rotation:

Regarding generalizing to arbitrary rectangles, an interesting thought
is that all the standard blitters are really nothing more than combiners
that support arbitrary rectangles/strides. If the iterators were further
generalized so that they could fetch more formats than just a8r8g8b8 and
floating point, the standard fast paths could be considered just "noop
iterators plus rectangular combiner" and folded in under the iterator
system.
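
To make the "noop iterator plus rectangular combiner" idea concrete,
here is a minimal sketch. None of these names exist in pixman;
noop_iter_t, noop_get_row and combine_add_rect are made up for
illustration, and a saturating ADD stands in for whichever operator a
real fast path would implement. The iterator does no conversion at all
and just hands back a pointer into the source, while the combiner works
on a rectangle with independent strides:

```c
#include <stdint.h>

/* Hypothetical "noop" iterator: the source is already a8r8g8b8,
 * so fetching a row is just pointer arithmetic. Stride is in pixels. */
typedef struct
{
    const uint32_t *bits;
    int             stride;
} noop_iter_t;

static const uint32_t *
noop_get_row (const noop_iter_t *it, int x, int y)
{
    return it->bits + y * it->stride + x;
}

/* Per-channel saturating add of two a8r8g8b8 pixels. */
static uint32_t
add_sat_8888 (uint32_t a, uint32_t b)
{
    uint32_t r = 0;

    for (int shift = 0; shift < 32; shift += 8)
    {
        uint32_t s = ((a >> shift) & 0xff) + ((b >> shift) & 0xff);

        if (s > 0xff)
            s = 0xff;
        r |= s << shift;
    }
    return r;
}

/* Rectangular combiner: applies ADD over the w x h rectangle at (x, y),
 * pulling source rows through the iterator. */
static void
combine_add_rect (uint32_t *dst, int dst_stride, const noop_iter_t *src,
                  int x, int y, int w, int h)
{
    for (int j = 0; j < h; j++)
    {
        const uint32_t *s = noop_get_row (src, x, y + j);
        uint32_t       *d = dst + (y + j) * dst_stride + x;

        for (int i = 0; i < w; i++)
            d[i] = add_sat_8888 (d[i], s[i]);
    }
}
```

Under this view a converting iterator would differ only in what
get_row does, for example filling a temporary a8r8g8b8 buffer, and the
combiner would not need to know which kind it was given.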

If the iterator system and the fast path system could be unified, that
might help with the problem discussed here:

     http://permalink.gmane.org/gmane.comp.graphics.pixman/900

which, with the recent iterator activity, is now turning into a more
serious issue.

> One more thing that I wonder about is whether we can already drop 8-bit
> interpolation precision for bilinear scaling or should keep it for a
> while?

I'm fine with dropping support for 8-bit precision if that's useful.

Søren