[Pixman] [PATCH 2/3] mmx: fix unaligned accesses
Matt Turner
mattst88 at gmail.com
Sun Jul 31 20:29:58 PDT 2011
On Sat, Jul 23, 2011 at 10:28 PM, Siarhei Siamashka
<siarhei.siamashka at gmail.com> wrote:
> The 'test1' function does not look good because it uses ARM
> instructions to read data one byte at a time and combine it. Function
> 'test2' looks a bit better because it now uses WALIGNR, but this is
> still not an optimal solution. Ideally, if we need to read N
> contiguous unaligned 64-bit values, this requires (N + 1) loads via
> WLDRD instructions and N fixups via WALIGNR, also shift argument for
> WALIGNR has to be calculated only once.
Do you think I should do separate while loops for unaligned and aligned?
i.e., in pixman_blt_mmx,
if ((unsigned long)s & 7) {
    while (w >= 64) {
        /* src is unaligned. N+1 loads and N fix-ups needed */
    }
} else {
    while (w >= 64) {
        /* src is aligned. no walign fix-ups needed */
    }
}
walign{i,r} have 1-cycle latency and throughput, and back-to-back
w{ld,st}rd instructions seem to cause a stall. So it seems to me that
I can put walign instructions between wldrd instructions and not lose
any performance, even when the loads are aligned.
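To make the "(N + 1) loads and N fix-ups" scheme concrete, here is a rough portable C sketch. It is not pixman code: the helper name copy_words_from_unaligned is made up for illustration, the shift-and-OR pair only stands in for what a single WALIGNR would do, and it assumes a little-endian target. The shift amounts are computed once, before the loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of pixman: copy n 64-bit words from a
 * possibly unaligned src into dst using only aligned 64-bit loads.
 * Assumes little-endian byte order. In the unaligned case the last
 * aligned load reads up to 7 bytes past src + 8*n, but never beyond
 * the aligned 8-byte word that holds the final source byte. */
static void
copy_words_from_unaligned (uint64_t *dst, const uint8_t *src, size_t n)
{
    unsigned byte_off = (unsigned) ((uintptr_t) src & 7);
    const uint64_t *s = (const uint64_t *) (src - byte_off);
    size_t i;

    if (n == 0)
        return;

    if (byte_off == 0)
    {
        /* src is aligned: N loads, no fix-ups */
        for (i = 0; i < n; i++)
            dst[i] = s[i];
        return;
    }

    /* src is unaligned: shift amounts are calculated only once */
    {
        unsigned rshift = 8 * byte_off;  /* 8..56 bits */
        unsigned lshift = 64 - rshift;   /* 8..56 bits */
        uint64_t lo = s[0];              /* the one extra load */

        for (i = 0; i < n; i++)
        {
            uint64_t hi = s[i + 1];                    /* one load per word */
            dst[i] = (lo >> rshift) | (hi << lshift);  /* one fix-up per word */
            lo = hi;                                   /* reuse for next word */
        }
    }
}
```

The aligned case is kept as a separate branch here because a 64-bit shift by 64 is undefined in C. Whether that split is also worthwhile on iwMMXt, where the fix-up may fit between the loads for free, is exactly the open question above.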
Thanks,
Matt