[Pixman] [PATCH] Add support for aarch64 neon optimization
Siarhei Siamashka
siarhei.siamashka at gmail.com
Tue Apr 5 14:12:13 UTC 2016
On Tue, 5 Apr 2016 08:26:38 -0400
"Lennart Sorensen" <lsorense at csclub.uwaterloo.ca> wrote:
> On Tue, Apr 05, 2016 at 08:20:54PM +0900, Mizuki Asakura wrote:
> > > This code is not just there for prefetching. It is an example of
> > > using software pipelining:
> >
> > OK. I understand.
> > But the code is very hard to maintain... I've run into too many register
> > conflicts.
> > # q2 and d2 were used in the same sequence. That cannot exist in aarch64-neon.
> >
> > Anyway, I'll try to remove unnecessary register copies as you've suggested.
> > After that, I'll also try to make benchmarks that
> > * advance vs none
> > * L1 / L2 / L3 (Cortex-A53 doesn't have), keep / strm
> > to find the better configuration.
> >
> > But that would only be a result for the Cortex-A53 (which you and I have). Can
> > anyone test on other (expensive :) aarch64 environments?
> > (Cortex-Axx, Apple Ax, NVidia Denver, etc, etc...)
>
> If someone can list what to run for a test I can probably run it on an A57.
Hi Lennart,
This is great, thanks. Could you please clone the following branch?
https://cgit.freedesktop.org/~siamashka/pixman/log/?h=20160405-separable-neon-bilinear-test
And then try to compile static 32-bit pixman test programs using an
ARM cross-compiler?
    ./autogen.sh
    ./configure --host=arm-linux-gnueabihf --enable-static-testprogs \
                --disable-libpng --disable-gtk
    make
Then run the "scaling-bench" program from the "test" directory on your
A57 device?
    PIXMAN_DISABLE="" ./scaling-bench > cortex-a57-neon-single-pass.txt
    PIXMAN_DISABLE="wholeops" ./scaling-bench > cortex-a57-neon-separable.txt
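If it is more convenient, the two runs above could be wrapped in a small
script. This is only a sketch: the BENCH and CPU variables and the
run_bench helper are hypothetical names used for illustration, not part
of pixman. It assumes the script is started from the "test" directory.

```shell
#!/bin/sh
# Hypothetical wrapper around the two benchmark runs.
# BENCH and CPU are illustrative variables, not pixman settings.
BENCH="${BENCH:-./scaling-bench}"
CPU="${CPU:-cortex-a57}"

run_bench() {
    # $1 = value passed in PIXMAN_DISABLE, $2 = suffix of the output file
    PIXMAN_DISABLE="$1" "$BENCH" > "${CPU}-neon-$2.txt"
}

if [ -x "$BENCH" ]; then
    run_bench ""         single-pass   # nothing disabled
    run_bench "wholeops" separable     # PIXMAN_DISABLE="wholeops"
else
    echo "benchmark binary not found: $BENCH" >&2
fi
```

Setting CPU to another name (for example CPU=cortex-a53) only changes the
names of the output files.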
This information can be used to see whether the Cortex-A57 fits a
common pattern observed with other ARM processors:
https://people.freedesktop.org/~siamashka/files/20160405-arm-bilinear/
I suspect that it will show results similar to Cortex-A15, but we will
never know until we try.
This can help to identify an optimal bilinear scaling strategy and
also to decide which parts of the existing 32-bit ARM assembly code are
worth converting to AArch64.
--
Best regards,
Siarhei Siamashka