[Pixman] [PATCH] Add support for aarch64 neon optimization

Thu Apr 7 11:34:49 UTC 2016

On Thu, 7 Apr 2016 19:45:03 +0900
Mizuki Asakura <ed6e117f at gmail.com> wrote:

> > Do you have a more specific example of a code fragment that needs
> > conversion?  
> 
> In original pixman-arm-neon-asm.S:
> 
> .macro pixman_composite_over_8888_8_0565_process_pixblock_head
> ...
> vsli.u16    q2,  q2, #5
> ...
> vraddhn.u16 d2,  q6,  q10

This 'd2' register is in fact the lower 64-bit half of the
128-bit 'q1' register and does not clash with 'q2'.

> ...
> vshrn.u16   d30, q2, #2
> 
> 
> If all registers just converted to Vn, it would be as follows:
> 
> .macro pixman_composite_over_8888_8_0565_process_pixblock_head
> ...
> sli    v2.8h,  v2.8h, #5
> ...
> raddhn v2.8b,  v6.8h,  v10.8h

Hence the destination here has to be converted to 'v1.8b', not 'v2.8b'.

And if, for example, we had to convert the "vraddhn.u16 d3, q6, q10"
instruction ('d3' instead of 'd2'), then the conversion result
would change to "raddhn2 v1.16b, v6.8h, v10.8h".
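
To make the rule concrete, here is a minimal Python sketch of the
destination mapping for narrowing instructions. The helper name
'convert_narrowing_dest' is made up for this illustration and is not
part of pixman or of any existing converter; it only covers the 8-bit
destination case used in the examples above.

```python
def convert_narrowing_dest(mnemonic, d_index):
    """Map an AArch32 narrowing instruction's Dn destination to AArch64.

    Hypothetical helper, for illustration only. It assumes an 8-bit
    destination element size (e.g. vraddhn.u16 / vshrn.u16), so the
    arrangement is either .8b or .16b.
    """
    if not 0 <= d_index <= 31:
        raise ValueError("AArch32 NEON has d0..d31")
    v_index = d_index // 2  # d(2k) and d(2k+1) are the two halves of q(k)
    if d_index % 2 == 0:
        # Lower half: plain mnemonic with a 64-bit arrangement.
        # Caveat: on AArch64 this form also zeroes the upper 64 bits of
        # the destination, which a real converter would have to consider.
        return mnemonic, "v%d.8b" % v_index
    # Upper half: the '2' variant writes the top 64 bits and leaves the
    # lower 64 bits untouched.
    return mnemonic + "2", "v%d.16b" % v_index


print(convert_narrowing_dest("raddhn", 2))   # d2  is the lower half of q1
print(convert_narrowing_dest("raddhn", 3))   # d3  is the upper half of q1
print(convert_narrowing_dest("shrn", 30))    # d30 is the lower half of q15
```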

> ...
> shrn   v30.8b, v2.8h, #2

By the same rule ('d30' is the lower half of 'q15'), here we need
'v15.8b' instead of 'v30.8b' too.

> 
> 
> The second raddhn corrupts v2, then the next shrn v30.8b, v2.8h #2
> would not be correct.
> 
> There are many other conflicts I've met.
> I didn't find any specification on the ARM's document that
> Dn can be a lower part of V(n/2).

I guess the whole source of confusion is that the AArch64 syntax
has 'Dn' registers too, but each of them is mapped to the lower half
of the 'Vn' register with the same number. This differs from the
AArch32 naming convention, where 'd(2n)' and 'd(2n+1)' are the lower
and upper halves of 'qn'.
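
The difference between the two conventions can be written down as a
pair of tiny Python functions. This is only a sketch; both function
names are invented for the illustration. Each one returns the index of
the 128-bit register and which half of it (0 = lower, 1 = upper) a
given 'Dn' name refers to.

```python
def aarch32_d_location(n):
    """AArch32: d(n) is half (n % 2) of q(n // 2)."""
    return n // 2, n % 2


def aarch64_d_location(n):
    """AArch64: Dn is always the lower 64 bits of Vn."""
    return n, 0


# The same name lands in a different 128-bit register on each side.
for n in (2, 3, 30):
    print("d%d: AArch32 -> q%d half %d, AArch64 -> v%d half %d"
          % ((n,) + aarch32_d_location(n) + aarch64_d_location(n)))
```

So an AArch32 'd2' lives in the register that AArch64 calls 'v1',
while an AArch64 'D2' lives in 'v2'.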

But in order to see through the deception, we really need to pay
attention to what exactly the instruction *does* instead of how it
*looks* in the AArch64 assembler syntax. Just because:

    https://en.wikipedia.org/wiki/A_rose_by_any_other_name_would_smell_as_sweet

And as I mentioned earlier, I hope to roll out a full-fledged automatic
converter soon.

-- 
Best regards,
Siarhei Siamashka