[Pixman] [PATCH] Add support for aarch64 neon optimization
Lennart Sorensen
lsorense at csclub.uwaterloo.ca
Tue Apr 5 12:26:38 UTC 2016
On Tue, Apr 05, 2016 at 08:20:54PM +0900, Mizuki Asakura wrote:
> > This code is not just there for prefetching. It is an example of
> > using software pipelining:
>
> OK. I understand.
> But the code is very hard to maintain... I've run into too many register
> conflicts.
> # q2 and d2 were used in the same sequence. That cannot work on aarch64-neon.
>
> Anyway, I'll try to remove unnecessary register copies as you've suggested.
> After that, I'll also try to make benchmarks comparing
> * advance vs none
> * L1 / L2 / L3 (Cortex-A53 doesn't have L3), keep / strm
> to find the better configuration.
>
> But that would only be a result for Cortex-A53 (which you and I have). Can anyone
> test on other (expensive :) aarch64 environments?
> (Cortex-Axx, Apple Ax, NVidia Denver, etc, etc...)
If someone can list what to run for a test I can probably run it on an A57.
--
Len Sorensen