[Pixman] [PATCH 1/1 v2] vmx: workarounds to fix powerpc little endian particularities

Oded Gabbay oded.gabbay at gmail.com
Mon Jun 8 02:57:06 PDT 2015


On Wed, Jun 3, 2015 at 6:42 AM, Siarhei Siamashka
<siarhei.siamashka at gmail.com> wrote:
>> +                                       AVV (endian_xor.c[1]),0);
>> +    perm = vec_xor (perm,(vector unsigned char) AVV (
>> +                       0x00, 0x00, 0x00, 0x00, 0x04, 0x04, 0x04, 0x04,
>> +                       0x08, 0x08, 0x08, 0x08, 0x0C, 0x0C, 0x0C, 0x0C));
>> +    return vec_perm (pix, pix, perm);
>>  }
>
> For this part, both the original and the patched code resulted in
> identical instruction sequences:
>
> 0000000000000000 <.vmx_splat_alpha>:
>    0:   3d 22 00 00     addis   r9,r2,0
>    4:   39 29 00 00     addi    r9,r9,0
>    8:   7c 00 48 ce     lvx     v0,0,r9
>    c:   10 42 10 2b     vperm   v2,v2,v2,v0
>   10:   4e 80 00 20     blr
>
> This is actually good. I was afraid that the compiler might screw
> it up a bit and do something stupid like adding an extra VXOR instruction
> here (for the 'vec_xor' intrinsic).
>

Actually, with the patch applied I get a different disassembly:

0000000000007b10 <vmx_splat_alpha>:
    7b10:       00 00 4c 3c     addis   r2,r12,0
    7b14:       00 00 42 38     addi    r2,r2,0
    7b18:       00 00 22 3d     addis   r9,r2,0
    7b1c:       0c 03 23 10     vspltisb v1,3
    7b20:       00 00 29 39     addi    r9,r9,0
    7b24:       99 4e 00 7c     lxvd2x  vs32,0,r9
    7b28:       57 02 00 f0     xxswapd vs32,vs32
    7b2c:       d7 04 01 f0     xxlxor  vs32,vs33,vs32
    7b30:       17 05 00 f0     xxlnor  vs32,vs32,vs32
    7b34:       2b 10 42 10     vperm   v2,v2,v2,v0
    7b38:       20 00 80 4e     blr

And without the patch, I get this:

0000000000007930 <vmx_splat_alpha>:
    7930:       00 00 4c 3c     addis   r2,r12,0
    7934:       00 00 42 38     addi    r2,r2,0
    7938:       00 00 22 3d     addis   r9,r2,0
    793c:       00 00 29 39     addi    r9,r9,0
    7940:       98 4e 00 7c     lxvd2x  vs0,0,r9
    7944:       50 02 00 f0     xxswapd vs0,vs0
    7948:       11 05 00 f0     xxlnor  vs32,vs0,vs0
    794c:       2b 10 42 10     vperm   v2,v2,v2,v0
    7950:       20 00 80 4e     blr

So the patch adds two instructions here: vspltisb and xxlxor.
I used the default configure + make.
Maybe I need to pass some special flag to the compiler?
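
For reference, here is a minimal scalar sketch of what the patched
mask computation is meant to achieve. It is not pixman code. It assumes
the XOR constant is 3 on little endian (which is what "vspltisb v1,3"
in the patched disassembly suggests) and 0 on big endian, that pixels
are a8r8g8b8 with alpha in the top byte, and that vec_perm indexes
bytes in memory order at the source level. The names perm_model,
endian_xor_byte and splat_alpha_model are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Base permute mask from the patch: replicate byte 0 of each 32-bit pixel. */
static const uint8_t base_mask[16] = {
    0x00, 0x00, 0x00, 0x00, 0x04, 0x04, 0x04, 0x04,
    0x08, 0x08, 0x08, 0x08, 0x0C, 0x0C, 0x0C, 0x0C
};

/* Assumed XOR constant: 3 on little endian, 0 on big endian.
 * Endianness is probed at run time so the sketch works on any host. */
static uint8_t endian_xor_byte(void)
{
    const uint32_t probe = 1;
    uint8_t first;

    memcpy(&first, &probe, 1);
    return first ? 3 : 0;
}

/* Scalar stand-in for vec_perm (pix, pix, mask). Both inputs are the
 * same vector, so only the low 4 bits of each mask byte matter. */
static void perm_model(const uint8_t src[16], const uint8_t mask[16],
                       uint8_t dst[16])
{
    for (int i = 0; i < 16; i++)
        dst[i] = src[mask[i] & 0x0F];
}

/* Hypothetical model of the patched splat_alpha for four pixels. */
static void splat_alpha_model(const uint32_t pix[4], uint32_t out[4])
{
    uint8_t src[16], dst[16], mask[16];
    const uint8_t x = endian_xor_byte();

    /* Both operands are constants, so this XOR could in principle be
     * folded into a single precomputed mask. */
    for (int i = 0; i < 16; i++)
        mask[i] = (uint8_t)(base_mask[i] ^ x);

    memcpy(src, pix, 16);
    perm_model(src, mask, dst);
    memcpy(out, dst, 16);

    /* Every byte of each output pixel should equal that pixel's alpha. */
    for (int i = 0; i < 4; i++)
        assert(out[i] == (pix[i] >> 24) * 0x01010101u);
}
```

Calling splat_alpha_model on a pixel such as 0x80112233 should give
0x80808080 on either endianness. On little endian the effective mask is
03 03 03 03 07 07 07 07 ..., which is the constant I would have expected
the compiler to load directly instead of emitting vspltisb + xxlxor.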

This is my gcc version:
gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9)
I'm running RHEL 7.1 ppc64le on a POWER8 machine.

Oded

