[cairo] [PATCH/RFC] pixman ARM NEON optimizations (now about performance)

Soeren Sandmann sandmann at daimi.au.dk
Tue Jul 28 00:51:03 PDT 2009


Siarhei Siamashka <siarhei.siamashka at gmail.com> writes:

> Some quick ARM related notes:
> 1. Pixman dispatch logic overhead is quite high. I added an ugly hack
> to the benchmark program to look up NEON fastpath tables and also make calls
> to the blitters directly. For the L1 test (running completely in L1 cache
> to benchmark inner loops), the overhead of pixman code varies between 30%
> and more than 2x slowdown. Surely pixman does some extra necessary stuff like
> clipping boundary checks. But parts like the linear search in fastpath tables
> are probably not the best for the performance (and this is going to get worse
> when more optimized functions get added to the tables). This needs better
> investigation though to see where most of the time is spent.

Micro benchmarks like these are useful because they tell us how close
the fast paths come to the memory bandwidth limit and therefore give a
sense of how close to optimal they are.

However, it is important to realize that speedups on a micro benchmark
do not translate directly into user-visible performance
improvements. The dispatch cost is roughly fixed per call, while the
blitting cost grows with the number of pixels, so by making the images
small enough, one can make the relative dispatch overhead arbitrarily big.

I do think that the total overhead of glyph rendering, including
everything that goes on in the X server and in pixman, is too large,
but we'll need a separate benchmark to evaluate exactly how much that
overhead is.



Soren

