[Pixman] [PATCH 1/9 repost] armv7: Coalesce scalar accesses where possible
Ben Avison
bavison at riscosopen.org
Mon Apr 11 12:26:22 UTC 2016
Where the alignment of a block of elements is known to equal the size of the
block, but the block is smaller than 8 bytes, it is safe to use a larger
element size in a scalar VLD or VST without risking an alignment exception.
This typically applies when accessing leading or trailing halfwords or words
in the destination buffer for long scanlines.
Sadly, the benefit is too small to be measured, but it seems like a
good idea anyway.
Signed-off-by: Ben Avison <bavison at riscosopen.org>
---
pixman/pixman-arm-neon-asm.h | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/pixman/pixman-arm-neon-asm.h b/pixman/pixman-arm-neon-asm.h
index bdcf6a9..76b3985 100644
--- a/pixman/pixman-arm-neon-asm.h
+++ b/pixman/pixman-arm-neon-asm.h
@@ -183,6 +183,10 @@
     pixldst30 vst3, 8, %(basereg+0), %(basereg+1), %(basereg+2), 3, mem_operand
 .elseif (bpp == 24) && (numpix == 1)
     pixldst30 vst3, 8, %(basereg+0), %(basereg+1), %(basereg+2), 1, mem_operand
+.elseif numpix * bpp == 32 && abits == 32
+    pixldst 4, vst1, 32, basereg, mem_operand, abits
+.elseif numpix * bpp == 16 && abits == 16
+    pixldst 2, vst1, 16, basereg, mem_operand, abits
 .else
     pixldst %(numpix * bpp / 8), vst1, %(bpp), basereg, mem_operand, abits
 .endif
--
1.7.5.4
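
For readers who find the assembler conditionals hard to follow, the
element-size choice made by the last three branches of the hunk above can be
modelled roughly in C. This is only a sketch: the function name
store_element_bits and its signature are hypothetical and do not exist in
Pixman, and the earlier branches of the macro (such as the vst3 cases shown
as context) are deliberately left out.

```c
/* Hypothetical model of the last three branches in the patched hunk.
 * Returns the element size, in bits, passed to the scalar vst1.
 * numpix: pixels in the block, bpp: bits per pixel,
 * abits: known alignment in bits (0 when unknown). */
static unsigned store_element_bits(unsigned numpix, unsigned bpp,
                                   unsigned abits)
{
    unsigned block_bits = numpix * bpp;

    /* A 32-bit block known to be 32-bit aligned: one 32-bit element. */
    if (block_bits == 32 && abits == 32)
        return 32;

    /* A 16-bit block known to be 16-bit aligned: one 16-bit element. */
    if (block_bits == 16 && abits == 16)
        return 16;

    /* Otherwise keep the pre-patch behaviour: one element per pixel. */
    return bpp;
}
```

Under this model, two 16bpp pixels with a known 32-bit alignment are stored
as one 32-bit element, whereas the same block with unknown alignment
(abits == 0) still uses 16-bit elements.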