[Pixman] [PATCH 0/2] A couple of ARM improvements
Søren Sandmann Pedersen
sandmann at cs.au.dk
Mon Apr 4 08:33:06 PDT 2011
I got myself an ARM netbook which by default is using subpixel
antialiased text in 565 mode, so fast_composite_over_n_8888_0565_ca()
is showing up on profiles. The following is an implementation of that
operation in NEON assembly along with a tiny improvement to the
8888_8888_ca version, where a single quad-word instruction could be
used instead of two double-word instructions.
- The combine_mask_ca replacement could be shared between 8888_ca and
0565_ca, except that the 565 version can ignore the mask alpha
because it doesn't need to compute a destination alpha.
It may make sense to split this code into a new macro that takes an
"opaque_destination" argument. This would allow the same
optimization to be used for x8r8g8b8 destinations.
- Further optimization is possible if we know that the solid source is
opaque, which is very often the case. When it is, there is no need
to multiply it onto the mask. An "opaque_source" argument could be
added to the macro as well and then separate x888 versions of the
function could be generated.
- It might be a win to optimize out destination access whenever the
src * mask is 0 or 1. However, this might require some more involved
changes to the code generation framework.
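
To make the three points above concrete, here is a minimal scalar sketch in C of a component-alpha OVER onto a 565 destination. It is not the NEON code from the patches and not pixman's actual implementation. The function name sketch_over_n_8888_0565_ca() and its helpers are hypothetical, and the per-channel formula d = s*m + d*(1 - sa*m) is assumed from the usual definition of component-alpha OVER.

```c
#include <stdint.h>

/* Rounded (x * y) / 255 for 8-bit values. */
static uint8_t mul_un8(uint8_t x, uint8_t y)
{
    uint32_t t = (uint32_t)x * y + 0x80;
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* Saturating 8-bit add. */
static uint8_t add_sat_un8(uint8_t x, uint8_t y)
{
    uint32_t t = (uint32_t)x + y;
    return (uint8_t)(t > 255 ? 255 : t);
}

/* One color channel: s*m + d*(1 - sa*m).
 * With an opaque source (sa == 0xff), sa*m is just m, so the
 * multiplication can be skipped (the "opaque_source" idea). */
static uint8_t over_ca_channel(uint8_t s, uint8_t sa, uint8_t m, uint8_t d)
{
    uint8_t am = (sa == 0xff) ? m : mul_un8(sa, m);
    return add_sat_un8(mul_un8(s, m), mul_un8(d, (uint8_t)(255 - am)));
}

/* Hypothetical per-pixel sketch: a8r8g8b8 solid source, a8r8g8b8
 * component-alpha mask, r5g6b5 destination. */
static uint16_t sketch_over_n_8888_0565_ca(uint32_t src, uint32_t mask,
                                           uint16_t dst)
{
    uint8_t sa = (uint8_t)(src >> 24);
    uint8_t r, g, b;

    /* The mask's alpha byte is never read: a 565 destination has no
     * alpha channel to update (the "opaque_destination" idea). */

    /* If all three color channels of the mask are zero, the
     * destination is unchanged, so skip the read-modify-write. */
    if ((mask & 0x00ffffff) == 0)
        return dst;

    /* Expand 565 to 888 by replicating the top bits. */
    r = (uint8_t)((dst >> 11) & 0x1f); r = (uint8_t)((r << 3) | (r >> 2));
    g = (uint8_t)((dst >> 5) & 0x3f);  g = (uint8_t)((g << 2) | (g >> 4));
    b = (uint8_t)(dst & 0x1f);         b = (uint8_t)((b << 3) | (b >> 2));

    r = over_ca_channel((uint8_t)(src >> 16), sa, (uint8_t)(mask >> 16), r);
    g = over_ca_channel((uint8_t)(src >> 8),  sa, (uint8_t)(mask >> 8),  g);
    b = over_ca_channel((uint8_t)src,         sa, (uint8_t)mask,         b);

    /* Pack back to 565 by truncation. */
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}
```

In this sketch, two masks that differ only in their alpha byte produce the same 565 result, which is why the 565 path can drop that part of the mask computation.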
This is my first attempt at writing NEON assembly, so review is