[Pixman] [PATCH 1/1] pixman: Fixing endianness problems for powerpc archs

Wed May 27 09:51:21 PDT 2015

On 05/27/2015 12:05 AM, Siarhei Siamashka wrote:

> The commit summary does not need the "pixman: " prefix. It would make
> sense to use the "vmx: " prefix instead. Also explicitly stating in
> the summary line that the problems are related to little endian powerpc
> would make it more descriptive.

Indeed, I was not really sure about that.

> Regarding the code itself, please check some comments below. If they
> are addressed, I think that the fix will be good enough to be applied.
>
>> ---
>>   pixman/pixman-vmx.c | 61 +++++++++++++++++++++++++++++++++++++++--------------
>>   1 file changed, 45 insertions(+), 16 deletions(-)
>>
>> diff --git a/pixman/pixman-vmx.c b/pixman/pixman-vmx.c
>> index c33631c..109104f 100644
>> --- a/pixman/pixman-vmx.c
>> +++ b/pixman/pixman-vmx.c
>> @@ -34,13 +34,25 @@
>>
>>   #define AVV(x...) {x}
>>
>> +#ifndef __LITTLE_ENDIAN__
>
> For the sake of consistency with the rest of the pixman code, can
> we use "#ifdef WORDS_BIGENDIAN"?

Sure, I will try and change that.

>> +#define MERGEH(a,b) vec_mergeh(a,b)
>> +#define MERGEL(a,b) vec_mergel(a,b)
>> +#define SPLAT_VEC AVV(						\
>> +		0x00, 0x00, 0x00, 0x00, 0x04, 0x04, 0x04, 0x04,	\
>> +		0x08, 0x08, 0x08, 0x08, 0x0C, 0x0C, 0x0C, 0x0C)
>> +#else
>> +#define MERGEH(a,b) vec_mergeh(b,a)
>> +#define MERGEL(a,b) vec_mergel(b,a)
>
> I'm not sure if overloading the 'mergeh'/'mergel' primitives
> with a different meaning is a good idea. I looks a bit like
> a "#define TRUE FALSE" approach, which may do the job but still
> be confusing for the people trying to read the code (yes, I'm
> exaggerating here a bit).

I did the above with some struggle, but it was the only way I could think
without adding new functions.

>
>> +#define SPLAT_VEC AVV(             				\
>> +		0x03, 0x03, 0x03, 0x03, 0x07, 0x07, 0x07, 0x07,	\
>> +		0x0B, 0x0B, 0x0B, 0x0B, 0x0F, 0x0F, 0x0F, 0x0F)
>> +#endif
>> +
>>   static force_inline vector unsigned int
>>   splat_alpha (vector unsigned int pix)
>>   {
>>       return vec_perm (pix, pix,
>> -		     (vector unsigned char)AVV (
>> -			 0x00, 0x00, 0x00, 0x00, 0x04, 0x04, 0x04, 0x04,
>> -			 0x08, 0x08, 0x08, 0x08, 0x0C, 0x0C, 0x0C, 0x0C));
>> +		     (vector unsigned char)SPLAT_VEC);
>>   }
>>
>>   static force_inline vector unsigned int
>> @@ -50,11 +62,11 @@ pix_multiply (vector unsigned int p, vector unsigned int a)
>>
>>       /* unpack to short */
>>       hi = (vector unsigned short)
>> -	vec_mergeh ((vector unsigned char)AVV (0),
>> +	MERGEH ((vector unsigned char)AVV (0),
>>   		    (vector unsigned char)p);
>
> This part of the code takes 8-bit elements from the high half of a
> vector register, extends them to 16-bit and stores the result to
> another vector register.
>
> Can we introduce a new static force_inline function specifically for
> this primitive operation and take care of the endian dependent stuff
> there without introducing the MERGEH macro?

Yes, I think so. I will do that and see if it works. Do you think there is
anything else besides vec_merge{h,l} that should be changed?
By the way, should I do that to both splat_alpha and pix_multiply? (With this I
could drop SPLA_VEC macro as well)

>>       lo = vec_mladd (lo, mod, (vector unsigned short)
>> @@ -125,25 +137,31 @@ over (vector unsigned int src,
>>   }
>>
>>   /* in == pix_multiply */
>> +#ifndef __LITTLE_ENDIAN__
>> +#define MASK_SHIFT(pointer) vec_lvsl(0,pointer)
>> +#else
>> +#define MASK_SHIFT(pointer) *((typeof(pointer ## _mask)*)pointer)
>
> The big endian MASK_SHIFT variant is calculating the permutation
> pattern for doing unaligned reads. And the result depends on the
> alignment of the 'pointer' argument relative to 16 byte boundaries
> in memory.
>
> But this little endian MASK_SHIFT is just performing a memory read
> instead, which does not make much sense.
>
>  From what I can see, the little endian altivec just can do unaligned
> reads natively and have no need for constructing the permutation
> pattern variables at all. The COMPUTE_SHIFT_MASK macros should be
> just no-ops there because the little endian LOAD_VECTORS macros do
> not use the 'vec_perm' intrinsic.
>
> Could you please remove these redundant bits from the little endian
> code path and also add a comment explaining the unaligned vector
> reads handling differences between big endian and little endian
> powerpc variants?

You are right. I just added it as a filler. Shame on me. Let me see what I can
do about it.

Thanks a lot for the review!
-- 

Fernando Seiti Furusato
IBM Linux Technology Center