[Mesa-dev] [cfe-dev] 3 element vectors in opencl 1.1+
Francisco Jerez
currojerez at riseup.net
Thu Apr 24 08:53:13 PDT 2014
Jan Vesely <jan.vesely at rutgers.edu> writes:
> On Thu, 2014-04-24 at 12:05 +0200, Francisco Jerez wrote:
> <snip>
>> >>>
>> >>> I don't think that using getTypeAllocSize() instead of
>> >>> getTypeStoreSize() to calculate clover::argument::size would be a
>> >>> satisfactory solution. By doing that you'd redefine the API argument
>> >>> size exposed to the host for *all* argument types to be the
>> >>> device-dependent aligned size, which is definitely not what you want.
>> >>
>> >>> AFAIK 3-element vectors are an exception because they are the only types
>> >>> that are defined to have a different API size from their actual usable
>> >>> size, so they probably deserve special handling in invocation.cpp (as
>> >>> you did in your first patch). As the API size is target-independent I
>> >>> don't think that the fix belongs in Clang or LLVM, Clover is likely at
>> >>> fault.
>> >>
>> >> The way I understood the ch 6.1.5 is that both API and OpenCL C 3
>> >> element vectors are required to be 4*sizeof(component). So a
>> >> sizeof(float3) == sizeof(cl_float3) == 16, and should be both host and
>> >> target independent.
>> >
>> > Well, sizeof() is supposed to take into account the alignment, so this
>> > should be the case already.
>
> AFAIK sizeof only includes padding. I explain below.
>
Yes, and 3-component vectors are being padded to 4 components already.
>> >
>> >> That's why clang (or more precisely libclc) looked like a correct
>> >> place.
>> >>
>> >
>> > Right. I guess the other possibility would be to redefine cl_float3 as
>> > cl_float4 in libclc.
>>
>> Sorry, I meant to redefine "gentype3" as "gentype4" for every vector
>> element type "gentype".
>>
>> > You mentioned that it had undesired consequences. Which exactly?
>
> With this change:
> -typedef __attribute__((ext_vector_type(3))) float float3;
> +typedef __attribute__((ext_vector_type(4))) float float3;
>
> errors about duplicate ops (since type3 and type4 are now the same type)
> can be fixed. However, I don't know a clean way to fix the following:
>
> error: too few elements in vector initialization (expected 4 elements,
> have 3)
>
> Even if we can get rid of it, or reduce it to warning.
> having the same type for type3 and type4 causes problem that we won't be
> able to distinguish following situations:
>
> flaot4 A
>
> float3 B = A.xyz; // This is OK, should not produce warning/error;
>
> float4 C = A.xyz; // This should at least produce a warning. even if we
> allow using fewer elements for vector initialization.
>
Right, I agree that this wouldn't be a good idea.
>
>> >
>> >> I understand that target device can have stricter alignment rules, but I
>> >> don't see how it can have different type sizes (my reading of the specs
>> >> is that these are binding for the target as well).
>> >>
>> >
>> > In theory the sizes are binding for most built-in types, but R600
>> > doesn't support certain integer types so clover has code to take into
>> > account the differing size, alignment and endianness between host and
>> > device. I guess that in most cases it would be possible to use the
>> > "official" ABI for passing kernel arguments to the device (and that's a
>> > requirement for your solution using DataLayout::getTypeAllocSize() to
>> > work reliably, otherwise you'll be taking into account device-specific
>> > padding), but you'll have to fix the R600 back-end so it's able to deal
>> > with (i.e. lower) all built-in types specified by OpenCL. I think that
>> > this would be useful on its own, Tom should know better than I how
>> > difficult it would be.
>
> I didn't know about the additional R600 type size magic, in this case I
> agree that the original approach (detect 3elem vectors and change size)
> is probably the best in this situation.
>
>> >
>> >> I can resend the original patch with debug output replaced by a comment.
>> >>
>> > A slightly more general solution than what you did could be to align the
>> > store size of scalar and vector arguments to the next power of two to
>> > calculate the API size (in particular, that would make 3- and 4-element
>> > vectors have the same size). From 6.1.5: "A built-in data type that is
>> > not a power of two bytes in size must be aligned to the next larger
>> > power of two."
>
> This part is what made me think that the _alignment_ only restricts
> starting address (and not size).
It also restricts the size. From section 6.3 on the sizeof operator:
"The sizeof operator yields the size (in bytes) of its operand,
including any padding bytes (refer to section 6.1.5) needed for
alignment [...]. When applied to an operand that has structure or union
type, the result is the total number of bytes in such an object,
including internal and trailing padding."
> Otherwise, the addition of 3 element vector special case would not be
> necessary.
Maybe it wouldn't be strictly necessary, but it's still clarifying as an
illustrative corollary of the general rule.
> i.e I think that the
> following mem layout is legal in OCL 1.0 but not in OCL1.1+
>
> 0x10: float3 here
> 0x1c: int here // this should belong to float3 in ocl 1.1+
Does OpenCL 1.0 support 3-component vectors at all?
>
> while the following is illegal in both:
>
> 0x8: float3 here
>
> So, I don't think that the quoted text requires all builtin types to
> have 2^x size (although all but 3 element vectors do).
>
Well, don't interpret it that way if you don't want to, but all OpenCL
built-in types do (or at least the ones you are allowed to use as kernel
argument types), so this solution seems more elegant to me than
special-casing 3-component vectors.
Thanks.
> regards,
> Jan
>
>
>> >
>> >> regards,
>> >> Jan
>> >>
>> >
>> > Thanks.
>> >
>> >>>
>> >>> Thanks.
>> >>>
>> >>> > regards,
>> >>> > Jan
>> >>> >
>> >>> >
>> >>> > --
>> >>> > Jan Vesely <jan.vesely at rutgers.edu>
>> >>> > _______________________________________________
>> >>> > mesa-dev mailing list
>> >>> > mesa-dev at lists.freedesktop.org
>> >>> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>> >>
>> >> --
>> >> Jan Vesely <jan.vesely at rutgers.edu>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 229 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20140424/01ef0320/attachment.sig>
More information about the mesa-dev
mailing list