[Mesa-dev] [PATCH 2/2] clover: fix getting struct args api size

Francisco Jerez currojerez at riseup.net
Fri Jun 24 01:03:23 UTC 2016


Jan Vesely <jan.vesely at rutgers.edu> writes:

> On Wed, 2016-06-22 at 20:22 -0700, Francisco Jerez wrote:
>> Jan Vesely <jan.vesely at rutgers.edu> writes:
>> 
>> > On Wed, 2016-06-22 at 17:07 -0700, Francisco Jerez wrote:
>> > > Jan Vesely <jan.vesely at rutgers.edu> writes:
>> > > 
>> > > > On Mon, 2016-06-13 at 17:24 -0700, Francisco Jerez wrote:
>> > > > > Serge Martin <edb+mesa at sigluy.net> writes:
>> > > > > 
>> > > > > > This fix getting the size of a struct arg. vec3 types still
>> > > > > > work
>> > > > > > ok.
>> > > > > > Only buit-in args need to have power of two alignment,
>> > > > > > getTypeAllocSize
>> > > > > > reports the correct size.
>> > > > > > ---
>> > > > > >  src/gallium/state_trackers/clover/llvm/invocation.cpp | 3
>> > > > > > ++-
>> > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
>> > > > > > 
>> > > > > > diff --git
>> > > > > > a/src/gallium/state_trackers/clover/llvm/invocation.cpp
>> > > > > > b/src/gallium/state_trackers/clover/llvm/invocation.cpp
>> > > > > > index 03487d6..9af51539 100644
>> > > > > > --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
>> > > > > > +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
>> > > > > > @@ -472,7 +472,8 @@ namespace {
>> > > > > >           // aligned to the next larger power of two".  We
>> > > > > > need
>> > > > > > this
>> > > > > >           // alignment for three element vectors, which
>> > > > > > have
>> > > > > >           // non-power-of-2 store size.
>> > > > > > -         const unsigned arg_api_size =
>> > > > > > util_next_power_of_two(arg_store_size);
>> > > > > > +         const unsigned arg_api_size = arg_type-
>> > > > > > >isStructTy()
>> > > > > > ?
>> > > > > > +               arg_store_size :
>> > > > > > util_next_power_of_two(arg_store_size);
>> > > > > >  
>> > > > > Hm...  Isn't this still going to be broken if you pass a
>> > > > > struct
>> > > > > argument
>> > > > > to a kernel function and the alignment of any of the struct
>> > > > > members
>> > > > > doesn't match the target-specific data layout?  Not sure we
>> > > > > can
>> > > > > fix
>> > > > > this
>> > > > > sensibly without requiring the target's data layout to match
>> > > > > the
>> > > > > CL
>> > > > > API
>> > > > > exactly.  Any suggestions Tom?
>> > > > 
>> > > > according to 6.7.2.1 compilers can arbitrarily insert padding
>> > > > between
>> > > > struct members (except at the beginning).
>> > > 
>> > > What spec version are you looking at?  My CL spec doesn't have
>> > > any
>> > > section labeled 6.7.2.1.
>> > 
>> > c99 specs, I did not find anything specific for CLC (it might be
>> > that I
>> > just need to look harder). CLC 2.0 adds additional constraint that
>> > you
>> > can't use address space qualifiers.
>> > 
>> 
>> I'd expect that whatever the CL spec says regarding the memory layout
>> of
>> CLC types (e.g. section 6.1.5 which specifies the usual alignment
>> rules
>> for CL types and section 6.11.1 and 6.11.3 which specify various
>> variable and type declaration attributes giving finer control over
>> the
>> alignment of variable and struct member declarations) fully overrides
>> the C99 spec.
>
> Right, even if we consider that none of the C99 6.7.2.1 apply (and at
> least CL2.0 6.5.6 does not make it sound so), it only gives us one
> side, we can check that the CLC struct layout follows what we would
> expect. We don't have means to check and enforce that the host side
> struct layout is compatible.
>

Yes, exactly, the CL spec doesn't have anything to say about the
host-side memory layout, that's up to the host platform's ABI to define.

>> 
>> > > 
>> > > > Even if size/alignment of individual members match CL API
>> > > > exactly,
>> > > > there's no guarantee that the structure layout/size will be the
>> > > > same.
>> > > > 
>> > > How can you exchange structured data with a CL kernel then,
>> > > assuming
>> > > that the layout of structure types in memory is fully unspecified
>> > > as
>> > > you
>> > > say?
>> > 
>> > that is my point. My understanding is that it relies on a silent
>> > assumption that both CLC and the host compiler will create the same
>> > structure layout given the same structure elements.
>> > 
>> > big endian host can create:
>> > struct foo {
>> > 	cl_int a;
>> > 	// 16 bit padding;
>> > 	cl_short b;
>> > 	cl_int c;
>> > };
>> 
>> I don't think this is a valid representation of the structure
>> according
>> to CL rules, my understanding is if your host happens to lay out
>> structure fields in this way you have to either marshal things
>> manually
>> or use compiler-specific attributes to get the host compiler to put
>> things at the right location (according to CL API rules).  I believe
>> that Khronos' cl_platform.h specifies alignment attributes in the
>> cl_*
>> host-side typedefs specifically for this purpose.
>
> maybe I'm missing something, but what rule does it break? shorts are by
> default 2 byte aligned, ints are 4 byte aligned, they cannot be
> reordered, so there needs to be a 2 byte hole somewhere. is there a
> rule that requires members to be at the nearest aligned location?

That's how it works on most CPU ABIs I'm aware of, you take the first
available offset compatible with the alignment requirements of the
element datatype.  I'm not sure if the CL spec defines the memory layout
of structure types to that level of detail, but I would assume that's by
omission rather than it being intentionally unspecified.

> At least on the host side the compiler can insert stuff like '60 bytes
> padding to avoid false sharing'.

That should also be a valid transformation in a (device-side) CL program
if the compiler can prove that the memory layout of a struct type is
never visible by the host, so none of the (more or less explicitly
specified) CL ABI applies.  The same goes for a host-side C compiler, it
can lay out the program's data structures any way it likes as long as
the externally visible C ABI of the target platform is preserved.

> If the application programmer needs to use specific attributes to get
> compatible layout we have no way to check that, other than checking
> the provided size.
>
> what I'm trying to says is; does it make sense to have stricter checks
> on device side if we can't enforce them on the host side?
>
We can't but the application can as long as any binary data exchanged
through the CL API follows some well-specified and device-independent
memory layout.  It's the application's responsibility to make sure that
any binary blobs passed to the CL API comply with the CL data layout:
one (somewhat non-portable) possibility is to rely on the host C
compiler to lay out your data in the same way the CL API expects by
using compiler-specific type annotations, but the CL implementation
cannot give you any guarantees that it will be possible to achieve that
on all platforms, for maximum portability the application can always
marshal any data exchanged with CL kernels by hand.

> Jan
>
>> 
>> > while little endian device could create:
>> > struct foo {
>> > 	int a;
>> > 	short b;
>> > 	// 16 bit padding
>> > 	int c;
>> > };
>> > 
>> > If cl_short/short alignment is 2bytes, the above structures and all
>> > the
>> > members have the same size/alignment, yet are not compatible.
>> > 
>> > Am I missing something that would prevent the above?
>> > 
>> > Jan
>> > 
>> > > 
>> > > > Jan
>> > > > 
>> > > > > 
>> > > > > >           llvm::Type *target_type = arg_type->isIntegerTy() 
>> > > > > > ?
>> > > > > >                 TD.getSmallestLegalIntType(mod-
>> > > > > > >getContext(),
>> > > > > > arg_store_size * 8)
>> > > > > > -- 
>> > > > > > 2.5.5
>> > > > -- 
>> > > > 
>> > > > Jan Vesely <jan.vesely at rutgers.edu>
>> > -- 
>> > 
>> > Jan Vesely <jan.vesely at rutgers.edu>
> -- 
>
> Jan Vesely <jan.vesely at rutgers.edu>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160623/bcce7746/attachment.sig>


More information about the mesa-dev mailing list