[Pixman] [PATCH 0/4] ARM: REPEAT_NORMAL support for standard fast paths
Taekyun Kim
podain77 at gmail.com
Thu Jul 7 22:18:00 PDT 2011
Hi all,
The current standard fast paths do not support REPEAT_NORMAL, so
compositing with REPEAT_NORMAL sources ends up at the top of the profile
for some popular web sites. Those sites usually use REPEAT_NORMAL to draw
tiled backgrounds or gradients.
The ARM composite functions are wrapped with common macro templates, so we
can easily add REPEAT_NORMAL support by passing a height of 1 and using
them as scanline functions.
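As a rough illustration of the height-1 idea, here is a minimal sketch in
plain C. It is not the code from the patches: the names composite_fn_t,
copy_src_8888, composite_repeat_normal and wrap are hypothetical, strides
are assumed to be in pixels, and a memcpy-based routine stands in for an
optimized fast path. Each destination row is split at the source wrap
points, so every call sees one contiguous run of source pixels.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical signature of a non-repeating composite routine.
 * Strides are counted in pixels, not bytes. */
typedef void (*composite_fn_t)(int width, int height,
                               uint32_t *dst, int dst_stride,
                               const uint32_t *src, int src_stride);

/* Plain C stand-in for an optimized SRC a8r8g8b8 -> a8r8g8b8 path. */
static void copy_src_8888(int width, int height,
                          uint32_t *dst, int dst_stride,
                          const uint32_t *src, int src_stride)
{
    for (int y = 0; y < height; y++)
        memcpy(dst + y * dst_stride, src + y * src_stride,
               (size_t)width * sizeof(uint32_t));
}

/* Positive modulo, so negative offsets also wrap into [0, m). */
static int wrap(int v, int m)
{
    int r = v % m;
    return r < 0 ? r + m : r;
}

/* Tiles src over dst by calling fn with height 1 on contiguous runs. */
static void composite_repeat_normal(composite_fn_t fn,
                                    uint32_t *dst, int dst_stride,
                                    int dst_w, int dst_h,
                                    const uint32_t *src, int src_stride,
                                    int src_w, int src_h,
                                    int src_x, int src_y)
{
    assert(src_w > 0 && src_h > 0);

    for (int y = 0; y < dst_h; y++) {
        const uint32_t *src_row = src + wrap(src_y + y, src_h) * src_stride;
        uint32_t *dst_row = dst + y * dst_stride;
        int sx = wrap(src_x, src_w);  /* start column inside the tile */
        int x = 0;

        while (x < dst_w) {
            int run = src_w - sx;     /* pixels left before the wrap */
            if (run > dst_w - x)
                run = dst_w - x;      /* clip to the destination row */
            fn(run, 1, dst_row + x, dst_stride, src_row + sx, src_stride);
            x += run;
            sx = 0;                   /* later runs start at column 0 */
        }
    }
}
```

In this sketch a narrow source produces many short calls per row, so the
per-call overhead dominates when the source is only a few pixels wide.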
STD fast paths usually load multiple pixels at once with something like...
    vld1.32 {d0-d3}, [SRC]!
So it is not easy to handle REPEAT_NORMAL inside them, whereas it
was relatively simple for nearest scaling.
For SSE2, I couldn't find a good, simple way to do this. Maybe we
can modify the SSE2 code to use macro templates, just as the ARM code does.
Previously, the STD paths supported only SAMPLES_COVER_CLIP cases.
My patch does not take advantage of the pre-computed flags, and I'm a bit
worried about this. Maybe flags can be added to the macro templates so
that the desired path is selected at compile time.
Here are the benchmark results on the S5PC110.
I got the best numbers with REPEAT_NORMAL_MIN_WIDTH = 32.
(By the way, I had to pass CFLAGS=-O2 explicitly.)
Benchmark for various repeat mode image composition
///////////////////////////////////////////////////////////////////
// op=SRC, src=a8r8g8b8, mask=None, dst=a8r8g8b8
///////////////////////////////////////////////////////////////////
<< Reference Compositing Performance 2000x2000 to 2000x2000 >>
Non-scaled : 154.64 Mpix/s
<< src = 1 x 512 dst = 512 x 512 >>
- Non-scaled -
NORMAL : 15.94 Mpix/s (before)
NORMAL : 277.05 Mpix/s (after)
<< src = 15 x 15 dst = 512 x 512 >>
- Non-scaled -
NORMAL : 105.78 Mpix/s (before)
NORMAL : 281.32 Mpix/s (after)
<< src = 63 x 63 dst = 512 x 512 >>
- Non-scaled -
NORMAL : 180.84 Mpix/s (before)
NORMAL : 337.55 Mpix/s (after)
//////////////////////////////////////////////////////////////////
// op=OVER, src=a8r8g8b8, mask=None, dst=a8r8g8b8
//////////////////////////////////////////////////////////////////
<< Reference Compositing Performance 2000x2000 to 2000x2000 >>
Non-scaled : 89.71 Mpix/s
<< src = 1 x 512 dst = 512 x 512 >>
- Non-scaled -
NORMAL : 14.39 Mpix/s (before)
NORMAL : 115.86 Mpix/s (after)
<< src = 15 x 15 dst = 512 x 512 >>
- Non-scaled -
NORMAL : 64.69 Mpix/s (before)
NORMAL : 113.66 Mpix/s (after)
<< src = 63 x 63 dst = 512 x 512 >>
- Non-scaled -
NORMAL : 84.77 Mpix/s (before)
NORMAL : 122.88 Mpix/s (after)
Taekyun Kim (4):
  ARM: REPEAT_NORMAL support for identity transform
  ARM: NEON Enable REPEAT_NORMAL support for non-scaled fast paths
  ARM: SIMD Enable REPEAT_NORMAL support for non-scaled fast paths
  ARM: Source scanline extension for non-scaled REPEAT_NORMAL
pixman/pixman-arm-common.h | 339 ++++++++++++++++++++++++++++++++++++++++----
pixman/pixman-arm-neon.c | 166 +++++++++++-----------
pixman/pixman-arm-simd.c | 30 ++--
3 files changed, 412 insertions(+), 123 deletions(-)
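
One way a threshold such as REPEAT_NORMAL_MIN_WIDTH can be combined with a
source scanline extension is sketched below in plain C. This is an
assumption about the general technique, not the code from the patch:
MIN_WIDTH, extend_scanline, tile_row and tile_row_extended are hypothetical
names, memcpy stands in for a height-1 fast path call, and the source
offset is assumed to be 0. A source row narrower than the threshold is
replicated into a temporary buffer until it is at least MIN_WIDTH pixels
wide. The extended width stays a multiple of the source width, so the
tiling phase is preserved.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative threshold; 32 is the value reported above for
 * REPEAT_NORMAL_MIN_WIDTH. */
#define MIN_WIDTH 32

/* Replicates a src_w-pixel row into ext until it holds at least
 * MIN_WIDTH pixels. ext must have room for 2 * MIN_WIDTH pixels.
 * Returns the extended width, always a multiple of src_w. */
static int extend_scanline(const uint32_t *src_row, int src_w, uint32_t *ext)
{
    assert(src_w > 0 && src_w < MIN_WIDTH);

    int ext_w = 0;
    while (ext_w < MIN_WIDTH) {
        memcpy(ext + ext_w, src_row, (size_t)src_w * sizeof(uint32_t));
        ext_w += src_w;
    }
    return ext_w;
}

/* Tiles one source row across one destination row.
 * Returns the number of (stand-in) fast path calls that were made. */
static int tile_row(uint32_t *dst, int dst_w,
                    const uint32_t *src_row, int src_w)
{
    int calls = 0;
    for (int x = 0; x < dst_w; x += src_w) {
        int run = dst_w - x < src_w ? dst_w - x : src_w;
        /* Stand-in for calling a fast path with height = 1. */
        memcpy(dst + x, src_row, (size_t)run * sizeof(uint32_t));
        calls++;
    }
    return calls;
}

/* Same result as tile_row, but narrow sources are extended first. */
static int tile_row_extended(uint32_t *dst, int dst_w,
                             const uint32_t *src_row, int src_w)
{
    uint32_t ext[2 * MIN_WIDTH];

    if (src_w < MIN_WIDTH) {
        int ext_w = extend_scanline(src_row, src_w, ext);
        return tile_row(dst, dst_w, ext, ext_w);
    }
    return tile_row(dst, dst_w, src_row, src_w);
}
```

With a 15-pixel-wide source and a 512-pixel destination row, this sketch
makes 12 calls instead of 35, because the extended row is 45 pixels wide.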