[Pixman] [PATCH] ARM: NEON: optimization for bilinear scaled 'over 8888 8888'

Sun Apr 10 13:54:27 PDT 2011

On Mon, Apr 4, 2011 at 7:12 PM, Taekyun Kim <podain77 at gmail.com> wrote:
> I've done some additional work on overlapped blit functions and bilinear
> filter with A8 mask for operator OVER and ADD.
> (tight scheduled work here...)

About these things: I would really appreciate it if you could send the
latest variants of your patches, if you don't mind contributing them
to pixman.

I understand that you need this functionality right now, and any
improvement over the slow general C code is a great help for many
pixman users. I'm just approaching the problem in a slightly different
way: first try to tweak the code to make it as fast as possible, and
only then extend and generalize it to support more operations. For
example, it would really suck to implement a few dozen optimized NEON
fast path functions for various operations, and only then realize that
doing it in a completely different way would have provided maybe
something like 10% more performance. Anyway, here we have a conflict
of interest between us, which would be really nice to resolve somehow.

I wonder if the best solution for everyone would be to just add the
first generation of NEON fast paths for these OVER and ADD operations
developed by you if they don't cause any regressions. Maybe in such a
way that they use their own set of helper macros. And at the same time
continue tweaking one more experimental implementation, trying to get
a 'perfect' bilinear scaling code (with another set of helper macros
in order to avoid any clash). That would help us to get reasonable
performance for many scaled compositing operations right now, just in
time for the pixman 0.22.0 release. A drawback is that there will be
some extra code duplication, but eventually we would be able to clean
it up and ensure that only the fastest code remains in the future.

What do you think? Soeren's opinion is surely important here too.

-- 
Best regards,
Siarhei Siamashka