[Pixman] [PATCH 2/5] vmx: adjust macros when loading vectors on ppc64le

Thu Jun 25 04:41:37 PDT 2015

On Thu, Jun 25, 2015 at 2:05 PM, Pekka Paalanen <ppaalanen at gmail.com> wrote:
> On Tue, 16 Jun 2015 18:27:59 +0300
> Oded Gabbay <oded.gabbay at gmail.com> wrote:
>
>> From: Fernando Seiti Furusato <ferseiti at linux.vnet.ibm.com>
>>
>> Replaced usage of vec_lvsl to direct unaligned assignment
>> operation (=). That is because, according to Power ABI Specification,
>> the usage of lvsl is deprecated on ppc64le.
>>
>> Changed COMPUTE_SHIFT_{MASK,MASKS,MASKC} macro usage to no-op for powerpc
>> little endian since unaligned access is supported on ppc64le.
>>
>> Signed-off-by: Fernando Seiti Furusato <ferseiti at linux.vnet.ibm.com>
>> ---
>>  pixman/pixman-vmx.c | 54 ++++++++++++++++++++++++++++++++++++++---------------
>>  1 file changed, 39 insertions(+), 15 deletions(-)
>>
>> diff --git a/pixman/pixman-vmx.c b/pixman/pixman-vmx.c
>> index 23537f6..7e5eca7 100644
>> --- a/pixman/pixman-vmx.c
>> +++ b/pixman/pixman-vmx.c
>> @@ -136,6 +136,7 @@ over (vector unsigned int src,
>>      over (pix_multiply (src, mask),                                  \
>>            pix_multiply (srca, mask), dest)
>>
>> +#ifndef _LITTLE_ENDIAN
>>
>>  #define COMPUTE_SHIFT_MASK(source)                                   \
>>      source ## _mask = vec_lvsl (0, source);
>> @@ -151,23 +152,46 @@ over (vector unsigned int src,
>>   * Note: tmp3 and tmp4 must remain untouched!
>>   */
>>
>> -#define LOAD_VECTORS(dest, source)                     \
>> -    tmp1 = (typeof(tmp1))vec_ld (0, source);           \
>> -    tmp2 = (typeof(tmp2))vec_ld (15, source);                  \
>> -    v ## source = (typeof(v ## source))                        \
>> -     vec_perm (tmp1, tmp2, source ## _mask);           \
>> +#define LOAD_VECTORS(dest, source)            \
>> +    tmp1 = (typeof(tmp1))vec_ld (0, source);          \
>> +    tmp2 = (typeof(tmp2))vec_ld (15, source);         \
>> +    v ## source = (typeof(v ## source))           \
>> +    vec_perm (tmp1, tmp2, source ## _mask);       \
>>      v ## dest = (typeof(v ## dest))vec_ld (0, dest);
>>
>> -#define LOAD_VECTORSC(dest, source, mask)              \
>> -    tmp1 = (typeof(tmp1))vec_ld (0, source);           \
>> -    tmp2 = (typeof(tmp2))vec_ld (15, source);                  \
>> -    v ## source = (typeof(v ## source))                        \
>> -     vec_perm (tmp1, tmp2, source ## _mask);           \
>> -    tmp1 = (typeof(tmp1))vec_ld (0, mask);             \
>> -    v ## dest = (typeof(v ## dest))vec_ld (0, dest);   \
>> -    tmp2 = (typeof(tmp2))vec_ld (15, mask);            \
>> -    v ## mask = (typeof(v ## mask))                    \
>> -     vec_perm (tmp1, tmp2, mask ## _mask);
>> +#define LOAD_VECTORSC(dest, source, mask)         \
>> +    tmp1 = (typeof(tmp1))vec_ld (0, source);          \
>> +    tmp2 = (typeof(tmp2))vec_ld (15, source);         \
>> +    v ## source = (typeof(v ## source))           \
>> +    vec_perm (tmp1, tmp2, source ## _mask);       \
>> +    tmp1 = (typeof(tmp1))vec_ld (0, mask);        \
>> +    v ## dest = (typeof(v ## dest))vec_ld (0, dest);      \
>> +    tmp2 = (typeof(tmp2))vec_ld (15, mask);       \
>> +    v ## mask = (typeof(v ## mask))           \
>> +    vec_perm (tmp1, tmp2, mask ## _mask);
>
> Hi Oded,
>
> are all the above changes just about breaking whitespace? At least I
> can't spot the difference between the old and the new version. You are
> converting nicely aligned tabs into unaligned spaces.
>
> I'd suggest to not touch the whitespace.
Understood, I'll modify it.

>
> Looks like you also modified Fernando's original patch, yet your name
> is nowhere to be seen. It's common courtesy to note what you changed.
>
I think I only took part of his patch, and that part I did *not*
modify, but I will double check it, and if I indeed modified
something, I will comment about it.
btw, I wrote a comment about it in the cover letter.

>> +#else
>> +
>> +/* Now the COMPUTE_SHIFT_{MASK, MASKS, MASKC} below are just no-op.
>> + * They are defined that way because little endian altivec can do unaligned
>> + * reads natively and have no need for constructing the permutation pattern
>> + * variables.
>> + */
>> +#define COMPUTE_SHIFT_MASK(source)
>> +
>> +#define COMPUTE_SHIFT_MASKS(dest, source)
>> +
>> +#define COMPUTE_SHIFT_MASKC(dest, source, mask)
>> +
>> +# define LOAD_VECTORS(dest, source)                        \
>> +    v ## source = *((typeof(v ## source)*)source);        \
>> +    v ## dest = *((typeof(v ## dest)*)dest);
>> +
>> +# define LOAD_VECTORSC(dest, source, mask)                 \
>> +    v ## source = *((typeof(v ## source)*)source);        \
>> +    v ## dest = *((typeof(v ## dest)*)dest);              \
>> +    v ## mask = *((typeof(v ## mask)*)mask);
>> +
>> +#endif /* _LITTLE_ENDIAN */
>>
>>  #define LOAD_VECTORSM(dest, source, mask)                            \
>>      LOAD_VECTORSC (dest, source, mask)                                       \
>
> The above were the comments that finally made me not just push this
> series.
Not sure I understood this sentence. Could you please elaborate ?

>
> You're also missing a space before an opening paren (except for macro
> parameter lists), but the old code is missing them too, so no big deal I
> think.
Will fix that.
>
> My other comments for the series are:
>
> I think you updated your outstanding pull request after the reminder
> about WORDS_BIGENDIAN? Would have been nice to mention that, I almost
> didn't look at it because I was expecting a new request with the new
> change.
Yeah, I thought about doing the while sequence again, but decided just
changing the define name was not worth it. In any case, I will send a
new series now so that will also include the WORDS_BIGENDIAN thing
>
> Presumably ajax's R-b stands with the changes. At least the changes
> look as asked to me.
>
> If I understood right, you also addressed Siarhei's concerns about
> performance by changing to the simple #ifdef pattern which leaves the
> existing code path the same, right?
Correct, that produces the most efficient/optimized code.
>
> When you re-send someone else's patch that, it is a good convention to
> add your own Signed-off-by.
Correct, I forget to do it.
>
> You use lots of #ifndef with both branches populated. It's usually less
> effort for the reader to understand positives than negatives, so using
> #ifdef with the branches swapped might read better.
Agreed and I will change it now.
>
>
> Thanks,
> pq