[Pixman] fast-scale branch performance improvements
siarhei.siamashka at gmail.com
Tue Mar 16 08:45:13 PDT 2010
On Tuesday 16 March 2010, Alexander Larsson wrote:
> On Tue, 2010-03-16 at 16:51 +0200, Siarhei Siamashka wrote:
> > Regarding Alex's branch and performance, I already mentioned that it was
> > much slower for the over_8888_0565 case in my benchmark when compared
> > against my branch on ARM Cortex-A8 (the other scaling cases are OK). I'm
> > using the following test program for benchmarking these optimizations:
> > http://cgit.freedesktop.org/~siamashka/pixman/commit/?h=test-n-bench&id=93ec60149cb3535f70a9e285de0b359ff444f26e
> > The test program benchmarks scaling when the source and destination
> > image sizes are approximately the same (so the performance can be more or
> > less directly compared to a simple nonscaled blit).
> > The results are (variance is only in the last digit):
> > op=3, src_fmt=20028888, dst_fmt=10020565, speed=5.06 MPix/s (1.21 FPS)
> > vs.
> > op=3, src_fmt=20028888, dst_fmt=10020565, speed=8.72 MPix/s (2.08 FPS)
> > which is quite a lot.
> Can you retry with my new branch:
Now it is:
op=3, src_fmt=20028888, dst_fmt=10020565, speed=5.16 MPix/s (1.23 FPS)
A little bit better, but still not good.
But it does not actually matter much; I can check what needs to be fixed
later. The most important thing is to get this stuff committed and to make
sure that there are no bugs/overflows and that the flags are evaluated correctly.
> #define CONVERT_0565_TO_8888(s) (CONVERT_0565_TO_0888(s) | 0xff000000)
The '| 0xff000000' part here is useless (for the set of operations that are
supported), but it does not have any measurable impact on performance. That
was the first thing I checked.
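
To illustrate the point about the alpha bits, here is a minimal sketch, not
the actual pixman source. The function names are hypothetical stand-ins for
the macros, and the bit arithmetic is the usual r5g6b5 expansion with the top
bits replicated. It assumes the result is eventually packed back into a 0565
destination, in which case whatever sits in the top byte is dropped:

```c
#include <stdint.h>

/* Hypothetical helper: expand r5g6b5 to x8r8g8b8, replicating the top
 * bits of each channel into the low bits so that full intensity maps
 * to 0xff. */
static uint32_t convert_0565_to_0888(uint16_t p)
{
    uint32_t s = p;
    return (((s << 3) & 0xf8)     | ((s >> 2) & 0x7))   |   /* blue  */
           (((s << 5) & 0xfc00)   | ((s >> 1) & 0x300)) |   /* green */
           (((s << 8) & 0xf80000) | ((s << 3) & 0x70000));  /* red   */
}

/* Same expansion with an opaque alpha byte ORed in. */
static uint32_t convert_0565_to_8888(uint16_t p)
{
    return convert_0565_to_0888(p) | 0xff000000u;
}

/* Hypothetical helper: pack a8r8g8b8 (or x8r8g8b8) back to r5g6b5.
 * The masks keep only colour bits, so the alpha byte never reaches
 * the result. */
static uint16_t convert_8888_to_0565(uint32_t s)
{
    return (uint16_t)(((s >> 3) & 0x001f) |
                      ((s >> 5) & 0x07e0) |
                      ((s >> 8) & 0xf800));
}
```

Under that assumption, convert_8888_to_0565(convert_0565_to_8888(p)) and
convert_8888_to_0565(convert_0565_to_0888(p)) give the same value for any p,
which is one way to see why the extra OR is redundant when the destination is
0565.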