[Spice-devel] [PATCH spice-common v2] codegen: Make the compiler work out better way to write unaligned memory
Frediano Ziglio
fziglio at redhat.com
Tue Sep 19 11:55:10 UTC 2017
>
> On Mon, Sep 18, 2017 at 10:27:01AM -0400, Frediano Ziglio wrote:
> > >
> > > Has this been tested on actual problematic hardware, and improved
> > > performance? I remember last time this was discussed, the outcome was
> > > not always what was expected.
> > >
> > > Christophe
> > >
> >
> > I don't remember old discussions, there were actually 2 different ones
> > which were confusing as one was referring an old embedded ARM version
> > and another a more recent.
> > I remember a memcpy version was proposed.
> > I don't own an ARM to test.
> > Was tested with x64 with no regressions introduced.
> > Knowing the machine instruction, which one fault is causing the issue
> > and the different platforms I can try and verify the assembly output
> > cross compiling.
>
> Given there were past attempts which should have been working, but did
> not when testing on real hardware, I'd prefer that we postpone this
> patch until someone could run tests on real hardware. This is in an area
> of the code which does not change much, so it should not need painful
> rebases.
>
> Christophe
>
I don't fully agree. I agree code should be tested. On the other side
we had 2 reports. One (from Anton D. Kachalov, November 2015) stating an
old ARMv5 have problems with unaligned access. Another stating that a more
recent 32 bit ARM version has problems with unaligned 64 bit integers.
We attempted some experiment asking to Anton to do some tests. Is true
we didn't succeeded but is also true that after a couple of attempts
Anton stopped doing testing so now we don't know if last proposal (or
this one) works. On the other side everybody that can read assembly
code and ARM manual can prove that generated code cause unaligned access
while with this patch the generated code does not present this issue.
Tried to compile the demarshal code with a cross compiler for ARMv5,
the code became:
000080d4 <parse_msg_disconnecting>:
80d4: e280300c add r3, r0, #12
80d8: e1530001 cmp r3, r1
80dc: 8a000025 bhi 8178 <parse_msg_disconnecting+0xa4>
80e0: e92d41f0 push {r4, r5, r6, r7, r8, lr}
80e4: e1a04000 mov r4, r0
80e8: e3a00010 mov r0, #16
80ec: e1a05002 mov r5, r2
80f0: ebfffffe bl 0 <malloc>
80f4: e3500000 cmp r0, #0
80f8: 0a00001c beq 8170 <parse_msg_disconnecting+0x9c>
80fc: e3a0c010 mov ip, #16
8100: e5d4e001 ldrb lr, [r4, #1]
8104: e5d43005 ldrb r3, [r4, #5]
8108: e5d42000 ldrb r2, [r4]
810c: e5d41004 ldrb r1, [r4, #4]
8110: e5d46002 ldrb r6, [r4, #2]
8114: e182240e orr r2, r2, lr, lsl #8
8118: e1811403 orr r1, r1, r3, lsl #8
811c: e5d4e009 ldrb lr, [r4, #9]
8120: e5d43008 ldrb r3, [r4, #8]
8124: e1822c16 orr r2, r2, r6, lsl ip
8128: e183340e orr r3, r3, lr, lsl #8
812c: e5d46003 ldrb r6, [r4, #3]
8130: e5d4e006 ldrb lr, [r4, #6]
8134: e5d4700a ldrb r7, [r4, #10]
8138: e1822c06 orr r2, r2, r6, lsl #24
813c: e1811c1e orr r1, r1, lr, lsl ip
8140: e5d46007 ldrb r6, [r4, #7]
8144: e5d4e00b ldrb lr, [r4, #11]
8148: e1833c17 orr r3, r3, r7, lsl ip
814c: e1833c0e orr r3, r3, lr, lsl #24
8150: e1811c06 orr r1, r1, r6, lsl #24
8154: e5802000 str r2, [r0]
8158: e980000a stmib r0, {r1, r3}
815c: e59f201c ldr r2, [pc, #28] ; 8180
<parse_msg_disconnecting+0xac>
8160: e59d3018 ldr r3, [sp, #24]
8164: e583c000 str ip, [r3]
8168: e5852000 str r2, [r5]
816c: e8bd81f0 pop {r4, r5, r6, r7, r8, pc}
8170: e3a00000 mov r0, #0
8174: e8bd81f0 pop {r4, r5, r6, r7, r8, pc}
8178: e3a00000 mov r0, #0
817c: e12fff1e bx lr
8180: 00000000 .word 0x00000000
which hardly will trigger an unaligned access (all the read accesses
are byte one).
So the end results is that as we cannot test on a real ARMv5 we
don't support ARMv6 and ARMv7 properly either.
I would prefer to have the patch tested with a 32 bit ARM and
integrated. If somebody came with an ARMv5 (now they are pretty
old) we'll fix it.
Frediano
More information about the Spice-devel
mailing list