[Pixman] [PATCH 0/4] ARM: REPEAT_NORMAL support for standard fast paths

Taekyun Kim podain77 at gmail.com
Thu Jul 7 22:18:00 PDT 2011


Hi all,

The current standard fast paths do not support REPEAT_NORMAL, so
REPEAT_NORMAL compositing ends up at the top of the profile results for
some popular web sites. These sites usually use REPEAT_NORMAL to draw a
tiled background or gradient.

The ARM composite functions are wrapped with common macro templates, so
we can easily add REPEAT_NORMAL support by passing a height of 1 and
using them as scanline functions.
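To illustrate the idea, here is a minimal sketch in plain C, not the
actual macro template from the patches. The names composite_src_8888_8888,
repeat_normal and composite_src_repeat_normal are hypothetical, and a
memcpy-based SRC copy stands in for the assembly fast path. Each
destination scanline is cut into segments that never cross the right
edge of the source, and the fast path is called with height 1 for each
segment.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for an existing fast path (hypothetical name): SRC copy of a
 * w x h block. Strides are counted in uint32_t units. */
static void composite_src_8888_8888(int w, int h,
                                    uint32_t *dst, int dst_stride,
                                    const uint32_t *src, int src_stride)
{
    for (int y = 0; y < h; y++)
        memcpy(dst + (size_t)y * dst_stride,
               src + (size_t)y * src_stride,
               (size_t)w * sizeof(uint32_t));
}

/* Wrap a coordinate into [0, size), also for negative values. */
static int repeat_normal(int v, int size)
{
    int m = v % size;
    return m < 0 ? m + size : m;
}

/* Tile src over dst by reusing the fast path as a scanline function. */
static void composite_src_repeat_normal(uint32_t *dst, int dst_stride,
                                        int dst_w, int dst_h,
                                        const uint32_t *src, int src_stride,
                                        int src_w, int src_h,
                                        int src_x, int src_y)
{
    assert(src_w > 0 && src_h > 0);

    for (int y = 0; y < dst_h; y++) {
        const uint32_t *src_line =
            src + (size_t)repeat_normal(src_y + y, src_h) * src_stride;
        uint32_t *dst_line = dst + (size_t)y * dst_stride;
        int sx = repeat_normal(src_x, src_w);
        int remaining = dst_w;

        while (remaining > 0) {
            /* Pixels left before the source scanline wraps around. */
            int n = src_w - sx;
            if (n > remaining)
                n = remaining;

            /* Height of 1: the composite function acts on one scanline. */
            composite_src_8888_8888(n, 1, dst_line, dst_stride,
                                    src_line + sx, src_stride);
            dst_line += n;
            remaining -= n;
            sx = 0; /* later segments start at the left edge */
        }
    }
}
```

With a 1-pixel-wide source this degenerates into one call per pixel,
which is presumably why narrow sources need extra care.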

STD fast paths usually load multiple pixels at once with something like:

vld1.32	{d0-d4}, [SRC]!

So it is not easy to handle REPEAT_NORMAL inside them, although that
was relatively simple for nearest scaling.

For sse2, I couldn't find a good and simple way to do this. Maybe we
can modify sse2 to use macro templates just like the way ARM did.

The previous STD paths supported only SAMPLES_COVER_CLIP cases.
My patch does not take advantage of the pre-computed flags, and I'm a
bit worried about this. Maybe flags can be added to the macro templates
so that the code can take the desired path at compile time.
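One way the compile-time selection could look is sketched below in plain
C. This is an assumption about a possible design, not code from the
patches: BIND_SCANLINE, copy_line, scanline_cover and scanline_normal are
hypothetical names. The flag is passed as a constant to an inline
worker, so an optimizing compiler can drop the branch that is not taken
in each generated function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Inline worker: covers_clip is a constant at every call site below, so
 * the compiler can remove the unused branch. */
static inline void copy_line(uint32_t *dst, const uint32_t *src,
                             int src_w, int src_x, int w, int covers_clip)
{
    if (covers_clip) {
        /* Source covers the whole span: one straight copy. */
        memcpy(dst, src + src_x, (size_t)w * sizeof(uint32_t));
    } else {
        /* REPEAT_NORMAL: wrap around the source scanline. */
        assert(src_w > 0);
        int sx = src_x % src_w;
        if (sx < 0)
            sx += src_w;
        while (w > 0) {
            int n = src_w - sx;
            if (n > w)
                n = w;
            memcpy(dst, src + sx, (size_t)n * sizeof(uint32_t));
            dst += n;
            w -= n;
            sx = 0;
        }
    }
}

/* Hypothetical template: generates one function per flag value. */
#define BIND_SCANLINE(name, covers_clip)                               \
    static void name(uint32_t *dst, const uint32_t *src,               \
                     int src_w, int src_x, int w)                      \
    {                                                                  \
        copy_line(dst, src, src_w, src_x, w, covers_clip);             \
    }

BIND_SCANLINE(scanline_cover, 1)
BIND_SCANLINE(scanline_normal, 0)
```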

Here are the benchmark results on S5PC110.
I got the best numbers with REPEAT_NORMAL_MIN_WIDTH = 32.
(By the way, I had to pass CFLAGS=-O2 explicitly.)
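As a rough illustration of what a minimum width could mean for the
source scanline extension, the sketch below replicates a narrow source
scanline into a temporary buffer until it holds at least
REPEAT_NORMAL_MIN_WIDTH pixels. This is my reading of the idea written
in plain C, not the code from the patches; extend_scanline and its
parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REPEAT_NORMAL_MIN_WIDTH 32 /* the value reported as best above */

/* Replicate a narrow source scanline into ext until it holds at least
 * REPEAT_NORMAL_MIN_WIDTH pixels. Returns the extended width, which is
 * always a whole multiple of src_w, so tiling the extended scanline
 * produces the same pixels as tiling the original one. */
static int extend_scanline(uint32_t *ext, int ext_capacity,
                           const uint32_t *src, int src_w)
{
    assert(src_w > 0 && src_w < REPEAT_NORMAL_MIN_WIDTH);

    int ext_w = 0;
    while (ext_w < REPEAT_NORMAL_MIN_WIDTH) {
        assert(ext_w + src_w <= ext_capacity);
        for (int i = 0; i < src_w; i++)
            ext[ext_w + i] = src[i];
        ext_w += src_w;
    }
    return ext_w;
}
```

A buffer of 2 * REPEAT_NORMAL_MIN_WIDTH pixels is always large enough
here, because the loop stops as soon as the width reaches the minimum
and each step adds fewer than REPEAT_NORMAL_MIN_WIDTH pixels.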

Benchmark for various repeat mode image composition
///////////////////////////////////////////////////////////////////
// op=SRC, src=a8r8g8b8, mask=None, dst=a8r8g8b8
///////////////////////////////////////////////////////////////////
<< Reference Compositing Performance 2000x2000 to 2000x2000 >>
Non-scaled      : 154.64 Mpix/s

<< src = 1 x 512  dst = 512 x 512 >>
- Non-scaled      -   
NORMAL  : 15.94 Mpix/s (before)
NORMAL  : 277.05 Mpix/s (after)

<< src = 15 x 15  dst = 512 x 512 >>
- Non-scaled      -   
NORMAL  : 105.78 Mpix/s (before)
NORMAL  : 281.32 Mpix/s (after)

<< src = 63 x 63  dst = 512 x 512 >>
- Non-scaled      -   
NORMAL  : 180.84 Mpix/s (before)
NORMAL  : 337.55 Mpix/s (after)

//////////////////////////////////////////////////////////////////
// op=OVER, src=a8r8g8b8, mask=None, dst=a8r8g8b8
//////////////////////////////////////////////////////////////////
<< Reference Compositing Performance 2000x2000 to 2000x2000 >>
Non-scaled      : 89.71 Mpix/s

<< src = 1 x 512  dst = 512 x 512 >>
- Non-scaled      -   
NORMAL  : 14.39 Mpix/s (before)
NORMAL  : 115.86 Mpix/s (after)

<< src = 15 x 15  dst = 512 x 512 >>
- Non-scaled      -   
NORMAL  : 64.69 Mpix/s (before)
NORMAL  : 113.66 Mpix/s (after)

<< src = 63 x 63  dst = 512 x 512 >>
- Non-scaled      -   
NORMAL  : 84.77 Mpix/s (before)
NORMAL  : 122.88 Mpix/s (after)


Taekyun Kim (4):
  ARM: REPEAT_NORMAL support for identity transform
  ARM: NEON Enable REPEAT_NORMAL support for non-scaled fast paths
  ARM: SIMD Enable REPEAT_NORMAL support for non-scaled fast paths
  ARM: Source scanline extension for non-scaled REPEAT_NORMAL

 pixman/pixman-arm-common.h |  339 ++++++++++++++++++++++++++++++++++++++++----
 pixman/pixman-arm-neon.c   |  166 +++++++++++-----------
 pixman/pixman-arm-simd.c   |   30 ++--
 3 files changed, 412 insertions(+), 123 deletions(-)


