[Mesa-dev] [PATCH] mesa: Optimize SWIZZLE_CONVERT_LOOP macro.

Matt Turner mattst88 at gmail.com
Thu Aug 14 22:00:45 PDT 2014


On Thu, Aug 14, 2014 at 9:51 PM, Kenneth Graunke <kenneth at whitecape.org> wrote:
> On Thursday, August 14, 2014 08:51:24 PM Matt Turner wrote:
>> Cuts about 1.5k of text size and reduces the compile time from 23~27 to
>> 19 seconds.
>>
>>    text    data     bss     dec     hex filename
>>  243337       0       0  243337   3b689 .libs/format_utils.o
>>  241807       0       0  241807   3b08f .libs/format_utils.o
>> ---
>> Numbers from gcc-4.8.2 on an amd64 system. Hopefully this improves
>> compile time on x86 by a bunch more.
>>
>>  src/mesa/main/format_utils.c | 20 ++++++++++++--------
>>  1 file changed, 12 insertions(+), 8 deletions(-)
>>
>> diff --git a/src/mesa/main/format_utils.c b/src/mesa/main/format_utils.c
>> index 240e3bc..b24e067 100644
>> --- a/src/mesa/main/format_utils.c
>> +++ b/src/mesa/main/format_utils.c
>> @@ -318,15 +318,19 @@ swizzle_convert_try_memcpy(void *dst, GLenum dst_type, int num_dst_channels,
>>           tmp[j] = CONV;                                           \
>>        }                                                           \
>>                                                                    \
>> -      typed_dst[0] = tmp[swizzle_x];                              \
>> -      if (DST_CHANS > 1) {                                        \
>> +      switch (4 - DST_CHANS) {                                    \
>> +      case 3:                                                     \
>> +         typed_dst[0] = tmp[swizzle_x];                           \
>> +         /* fallthrough */                                        \
>> +      case 2:                                                     \
>>           typed_dst[1] = tmp[swizzle_y];                           \
>> -         if (DST_CHANS > 2) {                                     \
>> -            typed_dst[2] = tmp[swizzle_z];                        \
>> -            if (DST_CHANS > 3) {                                  \
>> -               typed_dst[3] = tmp[swizzle_w];                     \
>> -            }                                                     \
>> -         }                                                        \
>> +         /* fallthrough */                                        \
>> +      case 1:                                                     \
>> +         typed_dst[2] = tmp[swizzle_z];                           \
>> +         /* fallthrough */                                        \
>> +      case 0:                                                     \
>> +         typed_dst[3] = tmp[swizzle_w];                           \
>> +         /* fallthrough */                                        \
>>        }                                                           \
>>        typed_src += SRC_CHANS;                                     \
>>        typed_dst += DST_CHANS;                                     \
>>
>
> It doesn't seem like this does the same thing...so your new code is:
>
>       switch (4 - DST_CHANS) {
>       case 3: // DST_CHANS == 1
>          typed_dst[0] = tmp[swizzle_x];
>       case 2: // DST_CHANS == 2
>          typed_dst[1] = tmp[swizzle_y];
>       case 1: // DST_CHANS == 3
>          typed_dst[2] = tmp[swizzle_z];
>       case 0: // DST_CHANS == 4
>          typed_dst[3] = tmp[swizzle_w];
>       }
>
> So when DST_CHANS == 1...your new code would run:
>
>          typed_dst[0] = tmp[swizzle_x];
>          typed_dst[1] = tmp[swizzle_y];
>          typed_dst[2] = tmp[swizzle_z];
>          typed_dst[3] = tmp[swizzle_w];
>
> and when it's 2, it would run...
>
>          typed_dst[1] = tmp[swizzle_y];
>          typed_dst[2] = tmp[swizzle_z];
>          typed_dst[3] = tmp[swizzle_w];
>
> I think instead you want:
>
>       switch (DST_CHANS) {
>       case 4:
>          typed_dst[3] = tmp[swizzle_w];
>          /* fallthrough */
>       case 3:
>          typed_dst[2] = tmp[swizzle_z];
>          /* fallthrough */
>       case 2:
>          typed_dst[1] = tmp[swizzle_y];
>          /* fallthrough */
>       case 1:
>          typed_dst[0] = tmp[swizzle_x];
>          /* fallthrough */
>       }

Yeah, I think you're right. I was trying to arrange it so that the
writes stay in order. I'll think about it more tomorrow.
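
For reference, the equivalence Ken describes can be checked with a small
standalone sketch. It is not part of the patch. The names write_nested_if,
write_switch_desc, forms_agree, channels_written and SENTINEL are all
hypothetical, and plain int arrays stand in for the typed_dst/tmp buffers the
real macro uses.

```c
#include <string.h>

/* Hypothetical marker for "this destination slot was never written". */
enum { SENTINEL = -1 };

/* Original nested-if form: writes channels 0..dst_chans-1, ascending. */
static void write_nested_if(int *dst, const int *tmp, const int swz[4],
                            int dst_chans)
{
   dst[0] = tmp[swz[0]];
   if (dst_chans > 1) {
      dst[1] = tmp[swz[1]];
      if (dst_chans > 2) {
         dst[2] = tmp[swz[2]];
         if (dst_chans > 3) {
            dst[3] = tmp[swz[3]];
         }
      }
   }
}

/* Descending switch from the review: same channels, written high to low. */
static void write_switch_desc(int *dst, const int *tmp, const int swz[4],
                              int dst_chans)
{
   switch (dst_chans) {
   case 4:
      dst[3] = tmp[swz[3]];
      /* fallthrough */
   case 3:
      dst[2] = tmp[swz[2]];
      /* fallthrough */
   case 2:
      dst[1] = tmp[swz[1]];
      /* fallthrough */
   case 1:
      dst[0] = tmp[swz[0]];
      break;
   }
}

/* Returns 1 if both forms leave identical buffers for this channel count. */
static int forms_agree(int dst_chans)
{
   const int tmp[4] = {10, 20, 30, 40};
   const int swz[4] = {3, 2, 1, 0};   /* arbitrary example swizzle */
   int a[4], b[4], i;

   for (i = 0; i < 4; i++)
      a[i] = b[i] = SENTINEL;

   write_nested_if(a, tmp, swz, dst_chans);
   write_switch_desc(b, tmp, swz, dst_chans);
   return memcmp(a, b, sizeof(a)) == 0;
}

/* Counts the slots the descending switch actually stored to. */
static int channels_written(int dst_chans)
{
   const int tmp[4] = {10, 20, 30, 40};
   const int swz[4] = {0, 1, 2, 3};
   int b[4] = {SENTINEL, SENTINEL, SENTINEL, SENTINEL};
   int i, n = 0;

   write_switch_desc(b, tmp, swz, dst_chans);
   for (i = 0; i < 4; i++)
      n += (b[i] != SENTINEL);
   return n;
}
```

Calling forms_agree() for channel counts 1 through 4 compares the two forms
on a four-element scratch buffer. The ascending "4 - DST_CHANS" variant from
the patch is left out on purpose, since for DST_CHANS < 4 it stores past the
channels that belong to the current pixel.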

