[Mesa-dev] [PATCH] mesa: Optimize SWIZZLE_CONVERT_LOOP macro.
Kenneth Graunke
kenneth at whitecape.org
Thu Aug 14 21:51:07 PDT 2014
On Thursday, August 14, 2014 08:51:24 PM Matt Turner wrote:
> Cuts about 1.5k of text size and reduces the compile time from 23~27 to
> 19 seconds.
>
> text data bss dec hex filename
> 243337 0 0 243337 3b689 .libs/format_utils.o
> 241807 0 0 241807 3b08f .libs/format_utils.o
> ---
> Numbers from gcc-4.8.2 on an amd64 system. Hopefully this improves
> compile time on x86 by a bunch more.
>
> src/mesa/main/format_utils.c | 20 ++++++++++++--------
> 1 file changed, 12 insertions(+), 8 deletions(-)
>
> diff --git a/src/mesa/main/format_utils.c b/src/mesa/main/format_utils.c
> index 240e3bc..b24e067 100644
> --- a/src/mesa/main/format_utils.c
> +++ b/src/mesa/main/format_utils.c
> @@ -318,15 +318,19 @@ swizzle_convert_try_memcpy(void *dst, GLenum dst_type, int num_dst_channels,
> tmp[j] = CONV; \
> } \
> \
> - typed_dst[0] = tmp[swizzle_x]; \
> - if (DST_CHANS > 1) { \
> + switch (4 - DST_CHANS) { \
> + case 3: \
> + typed_dst[0] = tmp[swizzle_x]; \
> + /* fallthrough */ \
> + case 2: \
> typed_dst[1] = tmp[swizzle_y]; \
> - if (DST_CHANS > 2) { \
> - typed_dst[2] = tmp[swizzle_z]; \
> - if (DST_CHANS > 3) { \
> - typed_dst[3] = tmp[swizzle_w]; \
> - } \
> - } \
> + /* fallthrough */ \
> + case 1: \
> + typed_dst[2] = tmp[swizzle_z]; \
> + /* fallthrough */ \
> + case 0: \
> + typed_dst[3] = tmp[swizzle_w]; \
> + /* fallthrough */ \
> } \
> typed_src += SRC_CHANS; \
> typed_dst += DST_CHANS; \
>
This doesn't seem to do the same thing as the old code. Your new code is:
switch (4 - DST_CHANS) {
case 3: // DST_CHANS == 1
typed_dst[0] = tmp[swizzle_x];
case 2: // DST_CHANS == 2
typed_dst[1] = tmp[swizzle_y];
case 1: // DST_CHANS == 3
typed_dst[2] = tmp[swizzle_z];
case 0: // DST_CHANS == 4
typed_dst[3] = tmp[swizzle_w];
}
So when DST_CHANS == 1, your new code would run all four stores:
typed_dst[0] = tmp[swizzle_x];
typed_dst[1] = tmp[swizzle_y];
typed_dst[2] = tmp[swizzle_z];
typed_dst[3] = tmp[swizzle_w];
and when it's 2, it would skip typed_dst[0] and run:
typed_dst[1] = tmp[swizzle_y];
typed_dst[2] = tmp[swizzle_z];
typed_dst[3] = tmp[swizzle_w];
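The trace above can be reproduced with a small model of the two versions. This is only a sketch: patched_writes, original_writes and check_write_masks are hypothetical helpers, not Mesa code, and each store is modelled as setting one bit in a mask rather than writing to typed_dst. Under that model the patched switch writes a different set of indices than the nested ifs for every channel count from 1 to 4.

```c
#include <assert.h>

/* Hypothetical helper, not part of Mesa: returns a bitmask of the
 * typed_dst[] indices that the patched switch would store to.
 * Bit i set means typed_dst[i] is written. */
static unsigned patched_writes(int dst_chans)
{
   unsigned mask = 0;

   switch (4 - dst_chans) {
   case 3:
      mask |= 1u << 0;
      /* fallthrough */
   case 2:
      mask |= 1u << 1;
      /* fallthrough */
   case 1:
      mask |= 1u << 2;
      /* fallthrough */
   case 0:
      mask |= 1u << 3;
   }
   return mask;
}

/* Same bitmask for the pre-patch nested ifs: indices 0 .. dst_chans - 1. */
static unsigned original_writes(int dst_chans)
{
   unsigned mask = 1u << 0;

   if (dst_chans > 1) {
      mask |= 1u << 1;
      if (dst_chans > 2) {
         mask |= 1u << 2;
         if (dst_chans > 3)
            mask |= 1u << 3;
      }
   }
   return mask;
}

/* Self-check of the masks for each channel count. */
static void check_write_masks(void)
{
   assert(patched_writes(1) == 0xFu);   /* indices 0, 1, 2, 3 */
   assert(patched_writes(2) == 0xEu);   /* indices 1, 2, 3; index 0 skipped */
   assert(patched_writes(3) == 0xCu);   /* indices 2, 3 */
   assert(patched_writes(4) == 0x8u);   /* index 3 only */

   assert(original_writes(1) == 0x1u);
   assert(original_writes(2) == 0x3u);
   assert(original_writes(3) == 0x7u);
   assert(original_writes(4) == 0xFu);
}
```

Calling check_write_masks() from a test program exercises all four channel counts.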
I think instead you want:
switch (DST_CHANS) {
case 4:
typed_dst[3] = tmp[swizzle_w];
/* fallthrough */
case 3:
typed_dst[2] = tmp[swizzle_z];
/* fallthrough */
case 2:
typed_dst[1] = tmp[swizzle_y];
/* fallthrough */
case 1:
typed_dst[0] = tmp[swizzle_x];
/* fallthrough */
}
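The descending switch above can be checked against the original nested ifs with a small harness. This is a sketch under stated assumptions: store_switch, store_nested_if and check_equivalence are hypothetical names, uint8_t stands in for the macro's destination type, and tmp and the swizzle are simplified to four-entry arrays.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the macro's store step, using the descending
 * switch: entering at case N stores channels N-1 down to 0. */
static void store_switch(uint8_t *typed_dst, const uint8_t tmp[4],
                         const uint8_t swizzle[4], int dst_chans)
{
   switch (dst_chans) {
   case 4:
      typed_dst[3] = tmp[swizzle[3]];
      /* fallthrough */
   case 3:
      typed_dst[2] = tmp[swizzle[2]];
      /* fallthrough */
   case 2:
      typed_dst[1] = tmp[swizzle[1]];
      /* fallthrough */
   case 1:
      typed_dst[0] = tmp[swizzle[0]];
   }
}

/* The pre-patch nested ifs, kept for comparison. */
static void store_nested_if(uint8_t *typed_dst, const uint8_t tmp[4],
                            const uint8_t swizzle[4], int dst_chans)
{
   typed_dst[0] = tmp[swizzle[0]];
   if (dst_chans > 1) {
      typed_dst[1] = tmp[swizzle[1]];
      if (dst_chans > 2) {
         typed_dst[2] = tmp[swizzle[2]];
         if (dst_chans > 3)
            typed_dst[3] = tmp[swizzle[3]];
      }
   }
}

/* Both versions must write the same values and leave channels at or
 * beyond dst_chans untouched. */
static void check_equivalence(void)
{
   const uint8_t tmp[4] = { 10, 20, 30, 40 };
   const uint8_t swizzle[4] = { 3, 2, 1, 0 };   /* arbitrary test swizzle */

   for (int n = 1; n <= 4; n++) {
      uint8_t a[4] = { 0xAA, 0xAA, 0xAA, 0xAA };
      uint8_t b[4] = { 0xAA, 0xAA, 0xAA, 0xAA };

      store_switch(a, tmp, swizzle, n);
      store_nested_if(b, tmp, swizzle, n);

      for (int i = 0; i < 4; i++) {
         assert(a[i] == b[i]);
         if (i >= n)
            assert(a[i] == 0xAA);   /* sentinel must survive */
      }
   }
}
```

The order of the stores differs between the two versions (w first versus x first), but each store targets a distinct index, so the final contents of typed_dst are the same.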
Impressive savings, though!
--Ken