[Mesa-dev] [PATCH] mesa: Optimize SWIZZLE_CONVERT_LOOP macro.

Kenneth Graunke kenneth at whitecape.org
Thu Aug 14 21:51:07 PDT 2014


On Thursday, August 14, 2014 08:51:24 PM Matt Turner wrote:
> Cuts about 1.5k of text size and reduces the compile time from 23~27 to
> 19 seconds.
> 
>    text    data     bss     dec     hex filename
>  243337       0       0  243337   3b689 .libs/format_utils.o
>  241807       0       0  241807   3b08f .libs/format_utils.o
> ---
> Numbers from gcc-4.8.2 on an amd64 system. Hopefully this improves
> compile time on x86 by a bunch more.
> 
>  src/mesa/main/format_utils.c | 20 ++++++++++++--------
>  1 file changed, 12 insertions(+), 8 deletions(-)
> 
> diff --git a/src/mesa/main/format_utils.c b/src/mesa/main/format_utils.c
> index 240e3bc..b24e067 100644
> --- a/src/mesa/main/format_utils.c
> +++ b/src/mesa/main/format_utils.c
> @@ -318,15 +318,19 @@ swizzle_convert_try_memcpy(void *dst, GLenum dst_type, int num_dst_channels,
>           tmp[j] = CONV;                                           \
>        }                                                           \
>                                                                    \
> -      typed_dst[0] = tmp[swizzle_x];                              \
> -      if (DST_CHANS > 1) {                                        \
> +      switch (4 - DST_CHANS) {                                    \
> +      case 3:                                                     \
> +         typed_dst[0] = tmp[swizzle_x];                           \
> +         /* fallthrough */                                        \
> +      case 2:                                                     \
>           typed_dst[1] = tmp[swizzle_y];                           \
> -         if (DST_CHANS > 2) {                                     \
> -            typed_dst[2] = tmp[swizzle_z];                        \
> -            if (DST_CHANS > 3) {                                  \
> -               typed_dst[3] = tmp[swizzle_w];                     \
> -            }                                                     \
> -         }                                                        \
> +         /* fallthrough */                                        \
> +      case 1:                                                     \
> +         typed_dst[2] = tmp[swizzle_z];                           \
> +         /* fallthrough */                                        \
> +      case 0:                                                     \
> +         typed_dst[3] = tmp[swizzle_w];                           \
> +         /* fallthrough */                                        \
>        }                                                           \
>        typed_src += SRC_CHANS;                                     \
>        typed_dst += DST_CHANS;                                     \
> 

This doesn't seem to do the same thing as the old code. With the line
continuations and fallthrough comments stripped, your new code is:

      switch (4 - DST_CHANS) {
      case 3: // DST_CHANS == 1
         typed_dst[0] = tmp[swizzle_x];
      case 2: // DST_CHANS == 2
         typed_dst[1] = tmp[swizzle_y];
      case 1: // DST_CHANS == 3
         typed_dst[2] = tmp[swizzle_z];
      case 0: // DST_CHANS == 4
         typed_dst[3] = tmp[swizzle_w];
      }

So when DST_CHANS == 1, 4 - DST_CHANS is 3, the switch enters at case 3 and
falls through every case, and your new code would run:

         typed_dst[0] = tmp[swizzle_x];
         typed_dst[1] = tmp[swizzle_y];
         typed_dst[2] = tmp[swizzle_z];
         typed_dst[3] = tmp[swizzle_w];

and when it's 2, it would skip typed_dst[0], write past the two destination
channels, and run:

         typed_dst[1] = tmp[swizzle_y];
         typed_dst[2] = tmp[swizzle_z];
         typed_dst[3] = tmp[swizzle_w];
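
The two traces above can be checked with a small standalone sketch. This is
not code from format_utils.c: the names patched_store and written_mask, the
uint8_t element type, the identity swizzle (tmp[0..3] instead of
tmp[swizzle_x] and friends) and the 0xEE sentinel are all assumptions made for
illustration. The destination is always four elements wide here so that the
extra stores stay in bounds and can be observed.

```c
#include <assert.h>
#include <stdint.h>

#define SENTINEL 0xEE  /* hypothetical marker meaning "never written" */

/* Mirrors the switch from the patch, with an identity swizzle assumed.
 * dst must have room for 4 elements regardless of dst_chans. */
static void patched_store(uint8_t dst[4], const uint8_t tmp[4], int dst_chans)
{
   assert(dst_chans >= 1 && dst_chans <= 4);

   switch (4 - dst_chans) {
   case 3:
      dst[0] = tmp[0];
      /* fallthrough */
   case 2:
      dst[1] = tmp[1];
      /* fallthrough */
   case 1:
      dst[2] = tmp[2];
      /* fallthrough */
   case 0:
      dst[3] = tmp[3];
   }
}

/* Returns a bitmask with bit i set if dst[i] was written. */
static unsigned written_mask(int dst_chans)
{
   const uint8_t tmp[4] = { 1, 2, 3, 4 };  /* none equal to SENTINEL */
   uint8_t dst[4] = { SENTINEL, SENTINEL, SENTINEL, SENTINEL };
   unsigned mask = 0;

   patched_store(dst, tmp, dst_chans);
   for (int i = 0; i < 4; i++) {
      if (dst[i] != SENTINEL)
         mask |= 1u << i;
   }
   return mask;
}
```

Under these assumptions, written_mask(1) reports all four slots written and
written_mask(2) reports slots 1 through 3, where the old nested ifs would
have written only slot 0 and slots 0 and 1 respectively.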

I think instead you want:

      switch (DST_CHANS) {
      case 4:
         typed_dst[3] = tmp[swizzle_w];
         /* fallthrough */
      case 3:
         typed_dst[2] = tmp[swizzle_z];
         /* fallthrough */
      case 2:
         typed_dst[1] = tmp[swizzle_y];
         /* fallthrough */
      case 1:
         typed_dst[0] = tmp[swizzle_x];
      }
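
As a sanity check on the switch above, here is a minimal sketch comparing it
against the original nested ifs. It is not taken from format_utils.c: the
function names, the uint8_t element type, the identity swizzle and the 0xEE
fill value are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The pre-patch logic: nested ifs, identity swizzle assumed. */
static void nested_if_store(uint8_t dst[4], const uint8_t tmp[4], int dst_chans)
{
   dst[0] = tmp[0];
   if (dst_chans > 1) {
      dst[1] = tmp[1];
      if (dst_chans > 2) {
         dst[2] = tmp[2];
         if (dst_chans > 3)
            dst[3] = tmp[3];
      }
   }
}

/* The suggested logic: switch on dst_chans, highest channel first. */
static void switch_store(uint8_t dst[4], const uint8_t tmp[4], int dst_chans)
{
   switch (dst_chans) {
   case 4:
      dst[3] = tmp[3];
      /* fallthrough */
   case 3:
      dst[2] = tmp[2];
      /* fallthrough */
   case 2:
      dst[1] = tmp[1];
      /* fallthrough */
   case 1:
      dst[0] = tmp[0];
   }
}

/* Returns 1 if both versions leave identical destination contents. */
static int stores_match(int dst_chans)
{
   const uint8_t tmp[4] = { 10, 20, 30, 40 };
   uint8_t a[4], b[4];

   assert(dst_chans >= 1 && dst_chans <= 4);
   memset(a, 0xEE, sizeof a);  /* 0xEE: hypothetical "untouched" fill */
   memset(b, 0xEE, sizeof b);

   nested_if_store(a, tmp, dst_chans);
   switch_store(b, tmp, dst_chans);
   return memcmp(a, b, sizeof a) == 0;
}
```

For channel counts 1 through 4 the two versions should write the same slots
with the same values; the only difference is the order of the stores, which
does not matter since each store targets a distinct element.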

Impressive savings, though!

--Ken