[Pixman] [PATCH] Add support for aarch64 neon optimization
Siarhei Siamashka
siarhei.siamashka at gmail.com
Tue Apr 5 07:53:29 UTC 2016
On Sun, 3 Apr 2016 20:17:45 +0900
Mizuki Asakura <ed6e117f at gmail.com> wrote:
> > The 'advanced' prefetch type is implemented by having some branchless ARM code
>
> If the prefetch code assumes that "branch-less", it cannot be done in aarch64
> since aarch64 doesn't support conditional alythmetics such as subge, subges.
>
> If so, we could / should remove all prefetch-related codes because it
> might cause
> performance regression (by branching) rather than benefit of prefetching.
Yes, I'm fine and actually in favour of removing the prefetch related
AArch64 code (assuming that it does not do anything good for us).
Something similar happened to the pixman x86 prefetch code in the past:
https://lists.freedesktop.org/archives/pixman/2010-June/000231.html
But I'm going to run some additional higher level benchmarks to be
sure.
> And also, we could remove all "tail-head" optimizatoins that is only
> for highly utilizing prefetching.
This code is not just there for prefetching. It is an example of
using software pipelining:
https://cgit.freedesktop.org/pixman/tree/pixman/pixman-arm-neon-asm.S?id=pixman-0.34.0#n191
https://en.wikipedia.org/wiki/Software_pipelining
> "tail-head" codes are very complicated, hard to understand and hard to maintain.
> If we could remove these codes, asm code could be more slimmer and
> easy-to-maintain.
If we were favouring ease of maintenance over performance, then we
would have used intrinsics instead of assembly in the first place.
> Ofcource, the modification shouldn't be applied for original
> aarch32-neon codes. It may cause
> performance regression on some architecture.
> But for aarch64, it would be a considerable changes ?
Well, just to make sure that there is no misunderstanding between
us. I would like to keep and do AArch64 conversion for all the
parts of code, which are well optimized and not planned to be
replaced in the near future. And I suggested not to bother with
the 'pixman-arm-neon-asm-bilinear.S' file, because this code
is not the best way to do the job and it had to be eventually
replaced with iterators:
https://lists.freedesktop.org/archives/pixman/2013-September/002889.html
https://lists.freedesktop.org/archives/pixman/2013-September/002892.html
Now it looks like Ben Avison has NEON patches for doing
separable bilinear scaling and, so this makes the
'pixman-arm-neon-asm-bilinear.S' file really obsolete.
The nearest scaling and rgb565 format support code is still useful.
The bilinear scaling code from pixman-arm-neon-asm.S is useful too, at
least the 'pixman_scaled_bilinear_scanline_8888_8888_SRC_asm_neon' and
'pixman_scaled_bilinear_scanline_8888_0565_SRC_asm_neon' functions.
But 'pixman_scaled_bilinear_scanline_0565_0565_SRC_asm_neon' and
'pixman_scaled_bilinear_scanline_0565_x888_SRC_asm_neon' are bad.
Anyway, your first patch was already usable. I only see that just a
few minor tweaks are needed and it will be good enough for pushing
to git. But if I'm mistaken and something is actually difficult,
then you don't need to spend too much time on it. Thanks.
--
Best regards,
Siarhei Siamashka
More information about the Pixman
mailing list