[Mesa-dev] [PATCH 06/12] main/format_utils: Add a general format conversion function
Jason Ekstrand
jason at jlekstrand.net
Fri Jul 18 18:20:51 PDT 2014
Brian,
Thanks for reviewing. I'll try to get your comments incorporated and get a
v2 sent out on Monday.
On Fri, Jul 18, 2014 at 8:14 AM, Brian Paul <brianp at vmware.com> wrote:
> On 07/17/2014 12:04 PM, Jason Ekstrand wrote:
>
>> Most format conversion operations required by GL can be performed by
>> converting one channel at a time, shuffling the channels around, and
>> optionally filling missing channels with zeros and ones. This adds a
>> function to do just that in a general, yet efficient, way.
>>
>> Signed-off-by: Jason Ekstrand <jason.ekstrand at intel.com>
>> ---
>> src/mesa/main/format_utils.c | 566 ++++++++++++++++++++++++++++++
>> +++++++++++++
>> src/mesa/main/format_utils.h | 18 ++
>> 2 files changed, 584 insertions(+)
>>
>> diff --git a/src/mesa/main/format_utils.c b/src/mesa/main/format_utils.c
>> index 241c158..0cb3eae 100644
>> --- a/src/mesa/main/format_utils.c
>> +++ b/src/mesa/main/format_utils.c
>> @@ -54,3 +54,569 @@ _mesa_srgb_ubyte_to_linear_float(uint8_t cl)
>>
>> return lut[cl];
>> }
>> +
>> +static bool
>> +swizzle_convert_try_memcpy(void *dst, GLenum dst_type, int
>> num_dst_channels,
>> + const void *src, GLenum src_type, int
>> num_src_channels,
>> + const uint8_t swizzle[4], bool normalized,
>> int count)
>>
>
> Please add a comment on this function describing the parameters and what
> the return value means.
Done
>
>
>
> +{
>> + int i;
>> +
>> + if (src_type != dst_type)
>> + return false;
>> + if (num_src_channels != num_dst_channels)
>> + return false;
>> +
>> + for (i = 0; i < num_dst_channels; ++i)
>> + if (swizzle[i] != i && swizzle[i] != MESA_FORMAT_SWIZZLE_NONE)
>> + return false;
>> +
>> + memcpy(dst, src, count * num_src_channels *
>> _mesa_sizeof_type(src_type));
>> +
>> + return true;
>> +}
>> +
>> +/* Note: This loop is carefully crafted for performance. Be careful when
>> + * changing it and run some benchmarks to ensure no performance
>> regressions
>> + * if you do.
>> + */
>>
>
> Comments for the macro's parameters might be nice. And a comment saying
> what the macro actually does.
>
>
done
>
>
> +#define SWIZZLE_CONVERT_LOOP(DST_TYPE, SRC_TYPE, CONV) \
>> + do { \
>> + const SRC_TYPE *typed_src = void_src; \
>> + DST_TYPE *typed_dst = void_dst; \
>> + DST_TYPE tmp[7]; \
>> + tmp[4] = 0; \
>> + tmp[5] = one; \
>> + for (s = 0; s < count; ++s) { \
>> + for (j = 0; j < num_src_channels; ++j) { \
>> + SRC_TYPE src = typed_src[j]; \
>> + tmp[j] = CONV; \
>> + } \
>> + \
>> + typed_dst[0] = tmp[swizzle_x]; \
>> + if (num_dst_channels > 1) { \
>> + typed_dst[1] = tmp[swizzle_y]; \
>> + if (num_dst_channels > 2) { \
>> + typed_dst[2] = tmp[swizzle_z]; \
>> + if (num_dst_channels > 3) { \
>> + typed_dst[3] = tmp[swizzle_w]; \
>> + } \
>> + } \
>> + } \
>>
>
> In other places in Mesa we do that sort of thing with a switch statement
> with fall-throughs. That might be even more efficient. In the common
> case, there's 4 channels so you're always doing 3 ifs. An optimized switch
> could be one computed jump.
The primary reason I chose 3 ifs instead of a switch is to ensure that
the writes always happen in order. In my testing, 3 ifs is actually
slightly faster than the switch (and also faster than a switch with 1, 2,
3, or 4 converts in each section). That said, I just did some more
experimentation with forcing the loops to unroll, as was done before in
the uchar case, and it seems to be substantially better. (I had tried that
before but it didn't help much. I'm not sure what changed there.)
When doing the forced unroll (described above), I don't notice any
difference between the 3 ifs and the switch. Maybe the compiler reorders
the writes or maybe it doesn't matter. If you like the switch better
cosmetically, I can do that.
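For reference, a minimal sketch of the two store strategies being compared. This is not the macro from the patch: the function names store_ifs and store_switch and the SLOT_ZERO/SLOT_ONE constants are hypothetical, and only uint8_t channels are handled. Slots 4 and 5 of tmp mirror the patch's tmp[4] = 0 and tmp[5] = one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the constant slots; the patch fills tmp[4] and
 * tmp[5] with zero and "one" so a swizzle index can select them. */
enum { SLOT_ZERO = 4, SLOT_ONE = 5 };

/* Variant A: nested ifs. In the source, stores run in ascending channel
 * order: dst[0], dst[1], dst[2], dst[3]. */
static void
store_ifs(uint8_t *dst, int n, const uint8_t tmp[6], const uint8_t swz[4])
{
   assert(n >= 1 && n <= 4);
   dst[0] = tmp[swz[0]];
   if (n > 1) {
      dst[1] = tmp[swz[1]];
      if (n > 2) {
         dst[2] = tmp[swz[2]];
         if (n > 3)
            dst[3] = tmp[swz[3]];
      }
   }
}

/* Variant B: switch with fall-through. One dispatch on n, but the stores
 * appear in descending channel order: dst[3] first, dst[0] last. */
static void
store_switch(uint8_t *dst, int n, const uint8_t tmp[6], const uint8_t swz[4])
{
   assert(n >= 1 && n <= 4);
   switch (n) {
   case 4: dst[3] = tmp[swz[3]]; /* fall through */
   case 3: dst[2] = tmp[swz[2]]; /* fall through */
   case 2: dst[1] = tmp[swz[1]]; /* fall through */
   case 1: dst[0] = tmp[swz[0]];
   }
}
```

Both variants produce the same bytes; they differ only in the order the stores are written in the source, and an optimizing compiler is free to reorder them either way.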
>
>
>
> + typed_src += num_src_channels; \
>> + typed_dst += num_dst_channels; \
>> + } \
>> + } while (0);
>> +
>> +/**
>> + * Convert between array-based color formats.
>> + *
>> + * Most format conversion operations required by GL can be performed by
>> + * converting one channel at a time, shuffling the channels around, and
>> + * optionally filling missing channels with zeros and ones. This
>> function
>> + * does just that in a general, yet efficient, way.
>> + *
>> + * Most of the parameters are self-explanitory. The swizzle parameter is
>>
>
> explanatory
>
>
>
> + * an array of 4 numbers (see _mesa_get_format_swizzle) that describes
>> + * where each channel in the destination should come from in the source.
>> + *
>> + * Under most circumstances, the source and destination images must be
>> + * different as no care is taken not to clobber one with the other.
>> + * However, if they have the same number of bits per pixel, it is safe to
>> + * do an in-place conversion.
>>
>
> Please document the function parameters too.
>
>
done
>
>
> + */
>> +void
>> +_mesa_swizzle_and_convert(void *void_dst, GLenum dst_type, int
>> num_dst_channels,
>> + const void *void_src, GLenum src_type, int
>> num_src_channels,
>> + const uint8_t swizzle[4], bool normalized, int
>> count)
>> +{
>> + int s, j;
>> + register uint8_t swizzle_x, swizzle_y, swizzle_z, swizzle_w;
>> +
>> + if (swizzle_convert_try_memcpy(void_dst, dst_type, num_dst_channels,
>> + void_src, src_type, num_src_channels,
>> + swizzle, normalized, count))
>> + return;
>> +
>> + swizzle_x = swizzle[0];
>> + swizzle_y = swizzle[1];
>> + swizzle_z = swizzle[2];
>> + swizzle_w = swizzle[3];
>> +
>> + switch (dst_type) {
>> + case GL_FLOAT:
>> + {
>> + const float one = 1.0f;
>> + switch (src_type) {
>> + case GL_FLOAT:
>> + SWIZZLE_CONVERT_LOOP(float, float, src)
>> + break;
>> + case GL_HALF_FLOAT:
>> + SWIZZLE_CONVERT_LOOP(float, uint16_t, _mesa_half_to_float(src))
>>
>
> We generally use the GL GLubyte, GLushort, etc. types instead of uint8_t,
> uint16_t, etc. when dealing with OpenGL data. I realize the latter are OK,
> but the former would be more consistent with other code.
I was told by people on the mesa team at Intel that we were trying to get
away from the GL datatypes and not to use them in new code unless it is a
GL API entrypoint. Honestly, I don't care. I can regex the file and
convert it easily enough.
>
>
>
> + break;
>> + case GL_UNSIGNED_BYTE:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(float, uint8_t, UBYTE_TO_FLOAT(src))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(float, uint8_t, src)
>> + }
>> + break;
>> + case GL_BYTE:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(float, int8_t, BYTE_TO_FLOAT(src))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(float, int8_t, src)
>> + }
>> + break;
>> + case GL_UNSIGNED_SHORT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(float, uint16_t, USHORT_TO_FLOAT(src))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(float, uint16_t, src)
>> + }
>> + break;
>> + case GL_SHORT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(float, int16_t, SHORT_TO_FLOAT(src))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(float, int16_t, src)
>> + }
>> + break;
>> + case GL_UNSIGNED_INT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(float, uint32_t, UINT_TO_FLOAT(src))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(float, uint32_t, src)
>> + }
>> + break;
>> + case GL_INT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(float, int32_t, INT_TO_FLOAT(src))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(float, int32_t, src)
>> + }
>> + break;
>> + default:
>> + assert(!"Invalid channel type combination");
>> + }
>> + } break;
>>
>
> break on next line.
Sure, I can change that
>
>
>
> + case GL_HALF_FLOAT:
>> + {
>> + const uint16_t one = _mesa_float_to_half(1.0f);
>> + switch (src_type) {
>> + case GL_FLOAT:
>> + SWIZZLE_CONVERT_LOOP(uint16_t, float, _mesa_float_to_half(src))
>> + break;
>> + case GL_HALF_FLOAT:
>> + SWIZZLE_CONVERT_LOOP(uint16_t, uint16_t, src)
>> + break;
>> + case GL_UNSIGNED_BYTE:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, uint8_t,
>> _mesa_float_to_half(UBYTE_TO_FLOAT(src)))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, uint8_t,
>> _mesa_float_to_half(src))
>> + }
>> + break;
>> + case GL_BYTE:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, int8_t,
>> _mesa_float_to_half(BYTE_TO_FLOAT(src)))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, int8_t,
>> _mesa_float_to_half(src))
>> + }
>> + break;
>> + case GL_UNSIGNED_SHORT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, uint16_t,
>> _mesa_float_to_half(USHORT_TO_FLOAT(src)))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, uint16_t,
>> _mesa_float_to_half(src))
>> + }
>> + break;
>> + case GL_SHORT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, int16_t,
>> _mesa_float_to_half(SHORT_TO_FLOAT(src)))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, int16_t,
>> _mesa_float_to_half(src))
>> + }
>> + break;
>> + case GL_UNSIGNED_INT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, uint32_t,
>> _mesa_float_to_half(UINT_TO_FLOAT(src)))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, uint32_t,
>> _mesa_float_to_half(src))
>> + }
>> + break;
>> + case GL_INT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, int32_t,
>> _mesa_float_to_half(INT_TO_FLOAT(src)))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, int32_t,
>> _mesa_float_to_half(src))
>> + }
>> + break;
>> + default:
>> + assert(!"Invalid channel type combination");
>> + }
>> + } break;
>> + case GL_UNSIGNED_BYTE:
>> + {
>> + const uint8_t one = normalized ? UINT8_MAX : 1;
>> + switch (src_type) {
>> + case GL_FLOAT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint8_t, float,
>> FLOAT_TO_UBYTE(CLAMP(src, 0.0f, 1.0f)))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint8_t, float, (src < 0) ? 0 : src)
>> + }
>> + break;
>> + case GL_HALF_FLOAT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint8_t, uint16_t,
>> FLOAT_TO_UBYTE(CLAMP(_mesa_half_to_float(src), 0.0f, 1.0f)))
>>
>
> Some of these lines are kind of long. We try to use 78-char lines (or
> so). In this case, maybe the whole FLOAT_TO_UBYTE(CLAMP(...)) should go
> into a helper/inline half_to_ubyte() function.
>
>
That's a good idea
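A rough sketch of what such a helper could look like. The decoder half_to_float_sketch is a stand-in written for this example (it is not Mesa's _mesa_half_to_float), the rounding rule is an assumption rather than what FLOAT_TO_UBYTE does, and the code assumes float is IEEE 754 binary32.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in binary16 decoder for illustration only. */
static float
half_to_float_sketch(uint16_t h)
{
   const uint32_t sign = (uint32_t)(h & 0x8000) << 16;
   const uint32_t exp = (h >> 10) & 0x1f;
   const uint32_t man = h & 0x3ff;
   uint32_t bits;
   float f;

   if (exp == 0) {
      /* Zero or subnormal: the value is man * 2^-24, exact in a float. */
      f = (float)man * (1.0f / 16777216.0f);
      return sign ? -f : f;
   }
   if (exp == 0x1f)
      bits = sign | 0x7f800000u | (man << 13);          /* infinity or NaN */
   else
      bits = sign | ((exp + 112) << 23) | (man << 13);  /* rebias 15 -> 127 */

   memcpy(&f, &bits, sizeof f);
   return f;
}

/* Hypothetical helper: normalized half -> ubyte with clamping to [0, 1].
 * Rounds to nearest here; that choice is an assumption of this sketch. */
static uint8_t
half_to_ubyte(uint16_t h)
{
   const float f = half_to_float_sketch(h);

   if (!(f > 0.0f))   /* negative values, zero, and NaN all map to 0 */
      return 0;
   if (f >= 1.0f)     /* includes +infinity */
      return UINT8_MAX;
   return (uint8_t)(f * 255.0f + 0.5f);
}
```

With a helper like this, each call site shrinks to something like SWIZZLE_CONVERT_LOOP(uint8_t, uint16_t, half_to_ubyte(src)), which fits within the line limit.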
>
>
> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint8_t, uint16_t, (src & 0x8000) ? 0 :
>> _mesa_half_to_float(src))
>>
>
> src & 0x8000 is not immediately obvious. Maybe we need a negative_half()
> predicate function?
>
>
Good point. A negative_half function (or macro) would probably be a good
idea.
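Something along these lines, perhaps. The name half_is_negative is a placeholder, not an existing Mesa function; it only tests the sign bit of the 16-bit half-float encoding.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical predicate: true when the sign bit (bit 15) of a binary16
 * value is set. Note that this also reports true for -0.0. */
static inline bool
half_is_negative(uint16_t h)
{
   return (h & 0x8000) != 0;
}
```

For the clamp-to-zero use in the unsigned conversions, treating -0.0 as negative is harmless because the result is 0 either way.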
>
>
> + }
>> + break;
>> + case GL_UNSIGNED_BYTE:
>> + SWIZZLE_CONVERT_LOOP(uint8_t, uint8_t, src)
>> + break;
>> + case GL_BYTE:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint8_t, int8_t, (src < 0) ? 0 :
>> ((uint8_t)src * 2) + ((uint8_t)src >> 6))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint8_t, int8_t, (src < 0) ? 0 : src)
>> + }
>> + break;
>> + case GL_UNSIGNED_SHORT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint8_t, uint16_t, src >> 8)
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint8_t, uint16_t, src)
>> + }
>> + break;
>> + case GL_SHORT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint8_t, int16_t, (src < 0) ? 0 :
>> (uint16_t)src >> 7)
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint8_t, int16_t, (src < 0) ? 0 : src)
>> + }
>> + break;
>> + case GL_UNSIGNED_INT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint8_t, uint32_t, src >> 24)
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint8_t, uint32_t, src)
>> + }
>> + break;
>> + case GL_INT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint8_t, int32_t, (src < 0) ? 0 :
>> (uint32_t)src >> 23)
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint8_t, int32_t, (src < 0) ? 0 : src)
>> + }
>> + break;
>> + default:
>> + assert(!"Invalid channel type combination");
>> + }
>> + } break;
>> + case GL_BYTE:
>> + {
>> + const int8_t one = normalized ? INT8_MAX : 1;
>> + switch (src_type) {
>> + case GL_FLOAT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint8_t, float,
>> FLOAT_TO_BYTE(CLAMP(src, -1.0f, 1.0f)))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint8_t, float, src)
>> + }
>> + break;
>> + case GL_HALF_FLOAT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint8_t, uint16_t,
>> FLOAT_TO_BYTE(CLAMP(_mesa_half_to_float(src), -1.0f, 1.0f)))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint8_t, uint16_t,
>> _mesa_half_to_float(src))
>> + }
>> + break;
>> + case GL_UNSIGNED_BYTE:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(int8_t, uint8_t, src >> 1)
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(int8_t, uint8_t, src)
>> + }
>> + break;
>> + case GL_BYTE:
>> + SWIZZLE_CONVERT_LOOP(int8_t, int8_t, src)
>> + break;
>> + case GL_UNSIGNED_SHORT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(int8_t, uint16_t, src >> 9)
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(int8_t, uint16_t, src)
>> + }
>> + break;
>> + case GL_SHORT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(int8_t, int16_t, src >> 8)
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(int8_t, int16_t, src)
>> + }
>> + break;
>> + case GL_UNSIGNED_INT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(int8_t, uint32_t, src >> 25)
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(int8_t, uint32_t, src)
>> + }
>> + break;
>> + case GL_INT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(int8_t, int32_t, src >> 24)
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(int8_t, int32_t, src)
>> + }
>> + break;
>> + default:
>> + assert(!"Invalid channel type combination");
>> + }
>> + } break;
>> + case GL_UNSIGNED_SHORT:
>> + {
>> + const uint16_t one = normalized ? UINT16_MAX : 1;
>> + switch (src_type) {
>> + case GL_FLOAT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, float,
>> FLOAT_TO_USHORT(CLAMP(src, 0.0f, 1.0f)))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, float, (src < 0) ? 0 : src)
>> + }
>> + break;
>> + case GL_HALF_FLOAT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, uint16_t,
>> FLOAT_TO_USHORT(CLAMP(_mesa_half_to_float(src), 0.0f, 1.0f)))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, uint16_t, (src & 0x8000) ? 0
>> : _mesa_half_to_float(src))
>> + }
>> + break;
>> + case GL_UNSIGNED_BYTE:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, uint8_t,
>> EXTEND_NORMALIZED_UINT((uint16_t)src, 8, 16))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, uint8_t, src)
>> + }
>> + break;
>> + case GL_BYTE:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, int8_t, (src < 0) ? 0 :
>> EXTEND_NORMALIZED_UINT((uint16_t)src, 7, 16))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, int8_t, (src < 0) ? 0 : src)
>> + }
>> + break;
>> + case GL_UNSIGNED_SHORT:
>> + SWIZZLE_CONVERT_LOOP(uint16_t, uint16_t, src)
>> + break;
>> + case GL_SHORT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, int16_t, (src < 0) ? 0 :
>> EXTEND_NORMALIZED_UINT((uint16_t)src, 15, 16))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, int16_t, (src < 0) ? 0 : src)
>> + }
>> + break;
>> + case GL_UNSIGNED_INT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, uint32_t, src >> 16)
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, uint32_t, src)
>> + }
>> + break;
>> + case GL_INT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, int32_t, (src < 0) ? 0 :
>> (uint32_t)src >> 15)
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, int32_t, (src < 0) ? 0 : src)
>> + }
>> + break;
>> + default:
>> + assert(!"Invalid channel type combination");
>> + }
>> + } break;
>> + case GL_SHORT:
>> + {
>> + const int16_t one = normalized ? INT16_MAX : 1;
>> + switch (src_type) {
>> + case GL_FLOAT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, float,
>> FLOAT_TO_SHORT(CLAMP(src, -1.0f, 1.0f)))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, float, src)
>> + }
>> + break;
>> + case GL_HALF_FLOAT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, uint16_t,
>> FLOAT_TO_SHORT(CLAMP(_mesa_half_to_float(src), -1.0f, 1.0f)))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint16_t, uint16_t,
>> _mesa_half_to_float(src))
>> + }
>> + break;
>> + case GL_UNSIGNED_BYTE:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(int16_t, uint8_t,
>> EXTEND_NORMALIZED_UINT((int16_t)src, 8, 15))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(int16_t, uint8_t, src)
>> + }
>> + break;
>> + case GL_BYTE:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(int16_t, int8_t,
>> EXTEND_NORMALIZED_INT((int16_t)src, 7, 15))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(int16_t, int8_t, src)
>> + }
>> + break;
>> + case GL_UNSIGNED_SHORT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(int16_t, uint16_t, src >> 1)
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(int16_t, uint16_t, src)
>> + }
>> + break;
>> + case GL_SHORT:
>> + SWIZZLE_CONVERT_LOOP(int16_t, int16_t, src)
>> + break;
>> + case GL_UNSIGNED_INT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(int16_t, uint32_t, src >> 17)
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(int16_t, uint32_t, src)
>> + }
>> + break;
>> + case GL_INT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(int16_t, int32_t, src >> 16)
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(int16_t, int32_t, src)
>> + }
>> + break;
>> + default:
>> + assert(!"Invalid channel type combination");
>> + }
>> + } break;
>> + case GL_UNSIGNED_INT:
>> + {
>> + const uint32_t one = normalized ? UINT32_MAX : 1;
>> + switch (src_type) {
>> + case GL_FLOAT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint32_t, float,
>> FLOAT_TO_UINT(CLAMP(src, 0.0f, 1.0f)))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint32_t, float, (src < 0) ? 0 : src)
>> + }
>> + break;
>> + case GL_HALF_FLOAT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint32_t, uint16_t,
>> FLOAT_TO_UINT(CLAMP(_mesa_half_to_float(src), 0.0f, 1.0f)))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint32_t, uint16_t, (src & 0x8000) ? 0
>> : _mesa_half_to_float(src))
>> + }
>> + break;
>> + case GL_UNSIGNED_BYTE:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint32_t, uint8_t,
>> EXTEND_NORMALIZED_UINT((uint32_t)src, 8, 32))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint32_t, uint8_t, src)
>> + }
>> + break;
>> + case GL_BYTE:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint32_t, int8_t, (src < 0) ? 0 :
>> EXTEND_NORMALIZED_UINT((uint32_t)src, 7, 32))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint32_t, int8_t, (src < 0) ? 0 : src)
>> + }
>> + break;
>> + case GL_UNSIGNED_SHORT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint32_t, uint16_t,
>> EXTEND_NORMALIZED_UINT((uint32_t)src, 16, 32))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint32_t, uint16_t, src)
>> + }
>> + break;
>> + case GL_SHORT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint32_t, int16_t, (src < 0) ? 0 :
>> EXTEND_NORMALIZED_UINT((uint32_t)src, 15, 32))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint32_t, int16_t, (src < 0) ? 0 : src)
>> + }
>> + break;
>> + case GL_UNSIGNED_INT:
>> + SWIZZLE_CONVERT_LOOP(uint32_t, uint32_t, src)
>> + break;
>> + case GL_INT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint32_t, int32_t, (src < 0) ? 0 :
>> EXTEND_NORMALIZED_UINT((uint32_t)src, 31, 32))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint32_t, int32_t, (src < 0) ? 0 : src)
>> + }
>> + break;
>> + default:
>> + assert(!"Invalid channel type combination");
>> + }
>> + } break;
>> + case GL_INT:
>> + {
>> + const int32_t one = normalized ? INT32_MAX : 1;
>> + switch (src_type) {
>> + case GL_FLOAT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint32_t, float,
>> FLOAT_TO_INT(CLAMP(src, -1.0f, 1.0f)))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint32_t, float, src)
>> + }
>> + break;
>> + case GL_HALF_FLOAT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(uint32_t, uint16_t,
>> FLOAT_TO_INT(CLAMP(_mesa_half_to_float(src), -1.0f, 1.0f)))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(uint32_t, uint16_t,
>> _mesa_half_to_float(src))
>> + }
>> + break;
>> + case GL_UNSIGNED_BYTE:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(int32_t, uint8_t,
>> EXTEND_NORMALIZED_UINT((int32_t)src, 8, 31))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(int32_t, uint8_t, src)
>> + }
>> + break;
>> + case GL_BYTE:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(int32_t, int8_t,
>> EXTEND_NORMALIZED_INT((int32_t)src, 7, 31))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(int32_t, int8_t, src)
>> + }
>> + break;
>> + case GL_UNSIGNED_SHORT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(int32_t, uint16_t,
>> EXTEND_NORMALIZED_UINT((int32_t)src, 16, 31))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(int32_t, uint16_t, src)
>> + }
>> + break;
>> + case GL_SHORT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(int32_t, int16_t,
>> EXTEND_NORMALIZED_INT((int32_t)src, 15, 31))
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(int32_t, int16_t, src)
>> + }
>> + break;
>> + case GL_UNSIGNED_INT:
>> + if (normalized) {
>> + SWIZZLE_CONVERT_LOOP(int32_t, uint32_t, src >> 1)
>> + } else {
>> + SWIZZLE_CONVERT_LOOP(int32_t, uint32_t, src)
>> + }
>> + break;
>> + case GL_INT:
>> + SWIZZLE_CONVERT_LOOP(int32_t, int32_t, src)
>> + break;
>> + default:
>> + assert(!"Invalid channel type combination");
>> + }
>> + } break;
>> + default:
>> + assert(!"Invalid channel type");
>> + }
>> +}
>> diff --git a/src/mesa/main/format_utils.h b/src/mesa/main/format_utils.h
>> index 6af3aa5..c5dab7b 100644
>> --- a/src/mesa/main/format_utils.h
>> +++ b/src/mesa/main/format_utils.h
>> @@ -33,6 +33,19 @@
>>
>> #include "macros.h"
>>
>> +/* Only guaranteed to work for BITS <= 32 */
>> +#define MAX_UINT(BITS) ((BITS) == 32 ? UINT32_MAX : ((1u << BITS) - 1))
>> +
>> +/* Extends an integer of size Sb to one of size Db in a linear way */
>> +#define EXTEND_NORMALIZED_UINT(X, Sb, Db) \
>> + (((X) * (__typeof__(X))(MAX_UINT(Db) / MAX_UINT(Sb))) + \
>> + ((Db % Sb) ? ((X) >> (Sb - Db % Sb)) : (__typeof__(X))0))
>> +
>> +/* This is almost the same as extending unsigned int except that we have
>> to
>> + * handle the case of -MAX(Sb) */
>> +#define EXTEND_NORMALIZED_INT(X, Sb, Db) (((X) <
>> -(__typeof__(X))MAX_UINT(Sb)) \
>> + ? -(__typeof__(X))MAX_UINT(Db) : EXTEND_NORMALIZED_UINT(X, Sb, Db))
>>
>
> Does __typeof__ work in MSVC, Clang, etc?
>
No, not in MSVC. I meant to change that before I sent it out. At one
point, I thought I had performance problems when everything was cast to
int, but now that I look at it, I'm not seeing a difference anymore.
>
> It might be simpler to just define/use BYTE_TO_SHORT(), etc. macros.
I thought about that. However, there doesn't seem to be much rhyme or
reason to some of the BYTE_TO_SHORT-style macros: which ones exist, how
they're written, etc. And a couple of them (BYTE_TO_UBYTE) are even
wrong. If you'd rather I put together a patch to clean them up and
standardize them, I can do that. The EXTEND_NORMALIZED_INT macro is also
really nice if we're going to autogenerate things because you can use it on
3, 5, and 10-bit datatypes.
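As an illustration of the arbitrary-width point, here is a sketch of the same arithmetic as EXTEND_NORMALIZED_UINT written as a plain function, so that no __typeof__ is needed. The names max_uint and extend_normalized_uint are hypothetical, everything is computed in uint32_t, and whether this matches the macro's performance is untested.

```c
#include <assert.h>
#include <stdint.h>

/* Largest value representable in `bits` bits, for 1 <= bits <= 32. */
static inline uint32_t
max_uint(unsigned bits)
{
   return bits == 32 ? UINT32_MAX : (1u << bits) - 1;
}

/* Extend a src_bits-wide normalized unsigned value to dst_bits bits.
 * Mirrors the macro: multiply by the integer ratio of the two maxima,
 * then add the top bits of x to cover the remainder when dst_bits is
 * not a multiple of src_bits. */
static inline uint32_t
extend_normalized_uint(uint32_t x, unsigned src_bits, unsigned dst_bits)
{
   assert(src_bits >= 1 && src_bits <= dst_bits && dst_bits <= 32);
   assert(x <= max_uint(src_bits));

   uint32_t r = x * (max_uint(dst_bits) / max_uint(src_bits));
   const unsigned rem = dst_bits % src_bits;

   if (rem)
      r += x >> (src_bits - rem);
   return r;
}
```

For example, extend_normalized_uint(31, 5, 8) gives 255 and extend_normalized_uint(1023, 10, 16) gives 65535, so the maximum of the source range lands on the maximum of the destination range for these odd widths as well.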
>
>
>
> +
>> /* RGB to sRGB conversion functions */
>>
>> static inline float
>> @@ -65,4 +78,9 @@ _mesa_srgb_to_linear(float cs)
>>
>> float _mesa_srgb_ubyte_to_linear_float(uint8_t cl);
>>
>> +void
>> +_mesa_swizzle_and_convert(void *dst, GLenum dst_type, int
>> num_dst_channels,
>> + const void *src, GLenum src_type, int
>> num_src_channels,
>> + const uint8_t swizzle[4], bool normalized, int
>> count);
>> +
>> #endif
>>
>>
> -Brian
>
>