[cairo] [PATCH/RFC][pixman] More ARM NEON performance updates

Siarhei Siamashka siarhei.siamashka at gmail.com
Thu Dec 10 07:45:59 PST 2009


1. Addition of ARM optimized combiners (OVER and ADD for a start; more
can be added as needed)


This introduces a simplified template for generating a function that handles
just a single scanline. Call overhead is a bit lower than for a full 2D image
processing function called with the 'height' argument set to 1. It is not
clear whether memory prefetch helps here, so it was dropped for this case:
combiners may work either with a temporary scratch buffer or with real
memory, so the benefits of prefetch are mostly lost.
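
For readers unfamiliar with combiners, here is a rough scalar C sketch of what
single-scanline OVER and ADD combiners compute on premultiplied a8r8g8b8
pixels. It is not the NEON code from the patch; the function names
(combine_over_scanline, combine_add_scanline) and helpers are hypothetical and
only illustrate the "one scanline, no 'height' argument" shape described above.

```c
#include <stdint.h>

/* Multiply two 8-bit values treated as fractions of 255, with rounding. */
static inline uint8_t mul_un8(uint8_t x, uint8_t a)
{
    uint32_t t = (uint32_t)x * a + 0x80;
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* Saturating 8-bit addition. */
static inline uint8_t add_sat_un8(uint8_t x, uint8_t y)
{
    uint32_t t = (uint32_t)x + y;
    return (uint8_t)(t > 255 ? 255 : t);
}

/* OVER on one scanline: dst = src + dst * (255 - src_alpha) / 255.
 * Hypothetical name; pixels are premultiplied a8r8g8b8. */
static void combine_over_scanline(uint32_t *dst, const uint32_t *src, int width)
{
    for (int i = 0; i < width; i++) {
        uint32_t s = src[i], d = dst[i], r = 0;
        uint8_t ia = (uint8_t)(255 - (s >> 24));   /* inverse source alpha */
        for (int shift = 0; shift < 32; shift += 8) {
            uint8_t sc = (uint8_t)(s >> shift);
            uint8_t dc = (uint8_t)(d >> shift);
            r |= (uint32_t)add_sat_un8(sc, mul_un8(dc, ia)) << shift;
        }
        dst[i] = r;
    }
}

/* ADD on one scanline: per-channel saturating sum. Hypothetical name. */
static void combine_add_scanline(uint32_t *dst, const uint32_t *src, int width)
{
    for (int i = 0; i < width; i++) {
        uint32_t s = src[i], d = dst[i], r = 0;
        for (int shift = 0; shift < 32; shift += 8) {
            r |= (uint32_t)add_sat_un8((uint8_t)(s >> shift),
                                       (uint8_t)(d >> shift)) << shift;
        }
        dst[i] = r;
    }
}
```

A SIMD version would process several pixels per iteration instead of one
channel at a time, but the per-pixel arithmetic is the same.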

2. Some fetch/store functions (r5g6b5 format is the most interesting) benefit
from SIMD optimizations a lot, at least for ARM NEON:


This is a little inconsistent with the other SIMD optimizations, which are
handled via pixman_implementation_t, so I'm open to any suggestions about
how to do it the right way.
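
To make the fetch/store work concrete, here is a minimal scalar C sketch of
r5g6b5 <-> a8r8g8b8 scanline conversion, which is the per-pixel operation a
SIMD version would vectorize. It is only a reference illustration, not the
code from the patch; the function names below are hypothetical.

```c
#include <stdint.h>

/* Expand one r5g6b5 pixel to opaque a8r8g8b8, replicating the top bits
 * into the low bits so that full intensity maps to 0xff. */
static inline uint32_t convert_0565_to_8888(uint16_t p)
{
    uint32_t r = (p >> 11) & 0x1f;
    uint32_t g = (p >> 5) & 0x3f;
    uint32_t b = p & 0x1f;
    r = (r << 3) | (r >> 2);
    g = (g << 2) | (g >> 4);
    b = (b << 3) | (b >> 2);
    return 0xff000000u | (r << 16) | (g << 8) | b;
}

/* Pack one a8r8g8b8 pixel to r5g6b5 by truncation; alpha is discarded. */
static inline uint16_t convert_8888_to_0565(uint32_t p)
{
    return (uint16_t)(((p >> 8) & 0xf800) |
                      ((p >> 5) & 0x07e0) |
                      ((p >> 3) & 0x001f));
}

/* Fetch: read 'width' r5g6b5 pixels into a 32bpp buffer. Hypothetical name. */
static void fetch_scanline_0565(uint32_t *dst, const uint16_t *src, int width)
{
    for (int i = 0; i < width; i++)
        dst[i] = convert_0565_to_8888(src[i]);
}

/* Store: write 'width' 32bpp pixels back as r5g6b5. Hypothetical name. */
static void store_scanline_0565(uint16_t *dst, const uint32_t *src, int width)
{
    for (int i = 0; i < width; i++)
        dst[i] = convert_8888_to_0565(src[i]);
}
```

The conversion is pure bit shuffling with no data-dependent branches, which
is why this format is a natural candidate for SIMD.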

On ARM Cortex-A8, all these optimizations result in a ~1.3x performance
improvement for OVER compositing with bilinear transform and a 32bpp
destination. For an r5g6b5 destination, the improvement is ~1.5x.

Best regards,
Siarhei Siamashka