[Pixman] fast-scale branch performance improvements
Alexander Larsson
alexl at redhat.com
Tue Mar 16 11:22:17 PDT 2010
On Tue, 2010-03-16 at 20:17 +0200, Siarhei Siamashka wrote:
> On Tuesday 16 March 2010, Siarhei Siamashka wrote:
> > On Tuesday 16 March 2010, Alexander Larsson wrote:
> > > On Tue, 2010-03-16 at 16:51 +0200, Siarhei Siamashka wrote:
> > > > Regarding Alex's branch and performance, I already mentioned that it was
> > > > much slower for the over_8888_0565 case in my benchmark when compared
> > > > against my branch on ARM Cortex-A8 (the other cases of scaling are ok).
> > > > I'm using the following test program for benchmarking these optimizations:
> > > >
> > > > http://cgit.freedesktop.org/~siamashka/pixman/commit/?h=test-n-bench&id=93ec60149cb3535f70a9e285de0b359ff444f26e
> > > >
> > > > The test program benchmarks scaling when the source and destination
> > > > image sizes are approximately the same (so the performance can be more
> > > > or less directly compared to a simple non-scaled blit).
> > > >
> > > > The results are (variance is only in the last digit):
> > > >
> > > > op=3, src_fmt=20028888, dst_fmt=10020565, speed=5.06 MPix/s (1.21 FPS)
> > > > vs.
> > > > op=3, src_fmt=20028888, dst_fmt=10020565, speed=8.72 MPix/s (2.08 FPS)
> > > >
> > > > which is quite a lot.
> > >
> > > Can you retry with my new branch:
> > > http://cgit.freedesktop.org/~alexl/pixman/log/?h=alex-scaler2
> >
> > Now it is:
> > op=3, src_fmt=20028888, dst_fmt=10020565, speed=5.16 MPix/s (1.23 FPS)
> >
> > A little bit better, but still not good.
>
> Found the problem, it's here:
> > + SIMPLE_NEAREST_FAST_PATH (OVER, a8b8g8r8, r5g6b5, 8888_565),
> This should have a8r8g8b8 instead of a8b8g8r8, so this fast path was simply
> never run. Once fixed, it shows the expected performance.
>
>
> Also, the 'alex-scaler2' branch is substantially slower than 'alex-scaler'
> for normal repeat:
>
> == nearest tiled SRC (alex-scaler) ==
> op=1, src_fmt=20028888, dst_fmt=20028888, speed=90.91 MPix/s (21.67 FPS)
> op=1, src_fmt=20028888, dst_fmt=10020565, speed=63.82 MPix/s (15.22 FPS)
> op=1, src_fmt=10020565, dst_fmt=10020565, speed=92.16 MPix/s (21.97 FPS)
>
> == nearest tiled SRC (alex-scaler2) ==
> op=1, src_fmt=20028888, dst_fmt=20028888, speed=76.54 MPix/s (18.25 FPS)
> op=1, src_fmt=20028888, dst_fmt=10020565, speed=50.44 MPix/s (12.03 FPS)
> op=1, src_fmt=10020565, dst_fmt=10020565, speed=67.14 MPix/s (16.01 FPS)
This may well be due to the change from open-coding the repeat to calling
the repeat() inline function.
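
To illustrate the kind of difference I mean, here is a rough sketch. It is
not taken from either branch: wrap_coord(), scanline_helper() and
scanline_open_coded() are hypothetical names, and the real code may well
differ. It assumes 16.16 fixed-point coordinates and a non-negative step.
The first variant runs a fully general wrap (modulo plus sign fix-up) for
every pixel; the second normalises once and then only needs a compare and
subtract inside the loop.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical generic helper (not pixman's repeat()): wrap any
 * coordinate into [0, size), including negative inputs. */
static inline int32_t
wrap_coord (int32_t c, int32_t size)
{
    assert (size > 0);
    c %= size;              /* C99: remainder has the sign of c */
    if (c < 0)
        c += size;
    return c;
}

/* Variant A: call the generic helper for every pixel.
 * vx and unit_x are 16.16 fixed point; src_width must be < 32768. */
static void
scanline_helper (uint32_t *dst, const uint32_t *src, int src_width,
                 int n, int32_t vx, int32_t unit_x)
{
    int32_t max_vx = (int32_t) src_width << 16;
    int i;

    for (i = 0; i < n; i++)
    {
        vx = wrap_coord (vx, max_vx);   /* modulo + branch per pixel */
        dst[i] = src[vx >> 16];
        vx += unit_x;
    }
}

/* Variant B: open-coded repeat. Normalise once, then rely on
 * unit_x >= 0 so a compare-and-subtract is enough in the loop. */
static void
scanline_open_coded (uint32_t *dst, const uint32_t *src, int src_width,
                     int n, int32_t vx, int32_t unit_x)
{
    int32_t max_vx = (int32_t) src_width << 16;
    int i;

    assert (unit_x >= 0);
    vx = wrap_coord (vx, max_vx);       /* done once, outside the loop */

    for (i = 0; i < n; i++)
    {
        dst[i] = src[vx >> 16];
        vx += unit_x;
        while (vx >= max_vx)            /* usually zero or one iteration */
            vx -= max_vx;
    }
}
```

Both variants should produce the same pixels for a non-negative step; only
the per-pixel cost differs. Whether that actually accounts for the numbers
above would need to be measured.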
--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Alexander Larsson Red Hat, Inc
alexl at redhat.com alexander.larsson at gmail.com
He's an oversexed amnesiac cop in a wheelchair. She's a virginal Bolivian
fairy princess with the power to see death. They fight crime!