[Pixman] [PATCH 1/1] pixman: Fixing endianness problems for powerpc archs
Fernando Seiti Furusato
ferseiti at linux.vnet.ibm.com
Wed May 27 09:51:21 PDT 2015
On 05/27/2015 12:05 AM, Siarhei Siamashka wrote:
> The commit summary does not need the "pixman: " prefix. It would make
> sense to use the "vmx: " prefix instead. Also explicitly stating in
> the summary line that the problems are related to little endian powerpc
> would make it more descriptive.
Indeed, I was not really sure about that.
> Regarding the code itself, please check some comments below. If they
> are addressed, I think that the fix will be good enough to be applied.
>
>> ---
>> pixman/pixman-vmx.c | 61 +++++++++++++++++++++++++++++++++++++++--------------
>> 1 file changed, 45 insertions(+), 16 deletions(-)
>>
>> diff --git a/pixman/pixman-vmx.c b/pixman/pixman-vmx.c
>> index c33631c..109104f 100644
>> --- a/pixman/pixman-vmx.c
>> +++ b/pixman/pixman-vmx.c
>> @@ -34,13 +34,25 @@
>>
>> #define AVV(x...) {x}
>>
>> +#ifndef __LITTLE_ENDIAN__
>
> For the sake of consistency with the rest of the pixman code, can
> we use "#ifdef WORDS_BIGENDIAN"?
Sure, I will try and change that.
>> +#define MERGEH(a,b) vec_mergeh(a,b)
>> +#define MERGEL(a,b) vec_mergel(a,b)
>> +#define SPLAT_VEC AVV( \
>> + 0x00, 0x00, 0x00, 0x00, 0x04, 0x04, 0x04, 0x04, \
>> + 0x08, 0x08, 0x08, 0x08, 0x0C, 0x0C, 0x0C, 0x0C)
>> +#else
>> +#define MERGEH(a,b) vec_mergeh(b,a)
>> +#define MERGEL(a,b) vec_mergel(b,a)
>
> I'm not sure if overloading the 'mergeh'/'mergel' primitives
> with a different meaning is a good idea. I looks a bit like
> a "#define TRUE FALSE" approach, which may do the job but still
> be confusing for the people trying to read the code (yes, I'm
> exaggerating here a bit).
I did the above with some struggle, but it was the only way I could think
without adding new functions.
>
>> +#define SPLAT_VEC AVV( \
>> + 0x03, 0x03, 0x03, 0x03, 0x07, 0x07, 0x07, 0x07, \
>> + 0x0B, 0x0B, 0x0B, 0x0B, 0x0F, 0x0F, 0x0F, 0x0F)
>> +#endif
>> +
>> static force_inline vector unsigned int
>> splat_alpha (vector unsigned int pix)
>> {
>> return vec_perm (pix, pix,
>> - (vector unsigned char)AVV (
>> - 0x00, 0x00, 0x00, 0x00, 0x04, 0x04, 0x04, 0x04,
>> - 0x08, 0x08, 0x08, 0x08, 0x0C, 0x0C, 0x0C, 0x0C));
>> + (vector unsigned char)SPLAT_VEC);
>> }
>>
>> static force_inline vector unsigned int
>> @@ -50,11 +62,11 @@ pix_multiply (vector unsigned int p, vector unsigned int a)
>>
>> /* unpack to short */
>> hi = (vector unsigned short)
>> - vec_mergeh ((vector unsigned char)AVV (0),
>> + MERGEH ((vector unsigned char)AVV (0),
>> (vector unsigned char)p);
>
> This part of the code takes 8-bit elements from the high half of a
> vector register, extends them to 16-bit and stores the result to
> another vector register.
>
> Can we introduce a new static force_inline function specifically for
> this primitive operation and take care of the endian dependent stuff
> there without introducing the MERGEH macro?
Yes, I think so. I will do that and see if it works. Do you think there is
anything else besides vec_merge{h,l} that should be changed?
By the way, should I do that to both splat_alpha and pix_multiply? (With this I
could drop SPLA_VEC macro as well)
>> lo = vec_mladd (lo, mod, (vector unsigned short)
>> @@ -125,25 +137,31 @@ over (vector unsigned int src,
>> }
>>
>> /* in == pix_multiply */
>> +#ifndef __LITTLE_ENDIAN__
>> +#define MASK_SHIFT(pointer) vec_lvsl(0,pointer)
>> +#else
>> +#define MASK_SHIFT(pointer) *((typeof(pointer ## _mask)*)pointer)
>
> The big endian MASK_SHIFT variant is calculating the permutation
> pattern for doing unaligned reads. And the result depends on the
> alignment of the 'pointer' argument relative to 16 byte boundaries
> in memory.
>
> But this little endian MASK_SHIFT is just performing a memory read
> instead, which does not make much sense.
>
> From what I can see, the little endian altivec just can do unaligned
> reads natively and have no need for constructing the permutation
> pattern variables at all. The COMPUTE_SHIFT_MASK macros should be
> just no-ops there because the little endian LOAD_VECTORS macros do
> not use the 'vec_perm' intrinsic.
>
> Could you please remove these redundant bits from the little endian
> code path and also add a comment explaining the unaligned vector
> reads handling differences between big endian and little endian
> powerpc variants?
You are right. I just added it as a filler. Shame on me. Let me see what I can
do about it.
Thanks a lot for the review!
--
Fernando Seiti Furusato
IBM Linux Technology Center
More information about the Pixman
mailing list