[Pixman] [PATCH 7/7] ARM: NEON optimization for bilinear scaled 'src_8888_8888'
siarhei.siamashka at gmail.com
Thu Feb 24 16:11:46 PST 2011
On Tuesday 22 February 2011 23:23:48 you wrote:
> From: Siarhei Siamashka <siarhei.siamashka at nokia.com>
> Initial NEON optimization for bilinear scaling. It can probably be
> improved further.
> Benchmark on ARM Cortex-A8:
> Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
> before: op=1, src=20028888, dst=20028888, speed=10.72 MPix/s
> after: op=1, src=20028888, dst=20028888, speed=44.27 MPix/s
And indeed, just adding prefetch to the bilinear scaling code provides
roughly 1.5x better performance on top of that. I'll try to make a
separate patch adding prefetch after testing how well it performs for
different scale factors.
It's interesting that prefetch did not help in the nearest scaling case,
probably because the LSU (load/store unit) was already overloaded with
handling many scattered memory accesses (or maybe because I did something
wrong at that time). In any case, because bilinear scaling also has a
number crunching part, the prefetches can overlap with computation, so
adding prefetch really improves memory bandwidth utilization and provides
a nice performance boost.
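
To illustrate the idea (this is only a rough scalar C sketch, not the NEON
assembly from the patch), here is a bilinear scanline loop with software
prefetch added. The function names, the 8-bit weight format, the 16.16
fixed-point coordinates and the PREFETCH_AHEAD_PIXELS distance are all
assumptions made for this example, and __builtin_prefetch is the GCC/Clang
builtin standing in for the ARM PLD instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tuning value: how far ahead (in source pixels) to prefetch. */
#define PREFETCH_AHEAD_PIXELS 64

/* Blend one 8-bit channel from four neighbours; wx, wy are weights 0..256. */
static uint32_t blend_channel(uint32_t tl, uint32_t tr,
                              uint32_t bl, uint32_t br,
                              uint32_t wx, uint32_t wy)
{
    uint32_t top = tl * (256 - wx) + tr * wx;   /* result scaled by 256   */
    uint32_t bot = bl * (256 - wx) + br * wx;
    return (top * (256 - wy) + bot * wy) >> 16; /* undo the 65536 scaling */
}

/* Interpolate all four channels of a packed 32-bit pixel. */
static uint32_t bilinear_pixel(uint32_t tl, uint32_t tr,
                               uint32_t bl, uint32_t br,
                               uint32_t wx, uint32_t wy)
{
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        out |= blend_channel((tl >> shift) & 0xff, (tr >> shift) & 0xff,
                             (bl >> shift) & 0xff, (br >> shift) & 0xff,
                             wx, wy) << shift;
    }
    return out;
}

/*
 * Scale one scanline from two adjacent source rows.
 * x is the starting source coordinate and ux the per-pixel step, both in
 * 16.16 fixed point; wy is the vertical weight of the bottom row (0..256).
 */
void scale_scanline_bilinear(uint32_t *dst,
                             const uint32_t *top, const uint32_t *bottom,
                             int src_width, int dst_width,
                             uint32_t wy, uint32_t x, uint32_t ux)
{
    assert(wy <= 256);
    assert(src_width > 0);

    for (int i = 0; i < dst_width; i++) {
        int x0 = (int)(x >> 16);
        if (x0 >= src_width)
            x0 = src_width - 1;                     /* clamp at right edge */
        int x1 = (x0 + 1 < src_width) ? x0 + 1 : x0;
        uint32_t wx = (x >> 8) & 0xff;              /* top 8 fraction bits */

        /*
         * Request source data that will be needed a few iterations later,
         * so the loads overlap with the arithmetic below. A real
         * implementation would issue this once per cache line rather
         * than once per pixel.
         */
        int ahead = x0 + PREFETCH_AHEAD_PIXELS;
        if (ahead < src_width) {
            __builtin_prefetch(&top[ahead]);
            __builtin_prefetch(&bottom[ahead]);
        }

        dst[i] = bilinear_pixel(top[x0], top[x1],
                                bottom[x0], bottom[x1], wx, wy);
        x += ux;
    }
}
```

Calling scale_scanline_bilinear with x = 0 and ux = 1 << 16 corresponds to
a 1x scale factor. The best prefetch distance would have to be found by
benchmarking, and it likely depends on the scale factor, since that
determines how fast the loop walks through the source rows.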