[Mesa-dev] [PATCH] nir/xfb: Properly align 64-bit values

Wed Feb 13 14:03:40 UTC 2019

I vaguely remember glslang doing something similar when computing 
offsets. This, and other xfb corner cases, were the reason I 
thought/concluded it would made sense to let glslang (or any other 
frontend) to do the offset assignment also for structs, so SPIR-V 
consuming drivers didn't need to handle all that mess, and could just 
expect to everything having his offset/stride assigned. In any case, we 
also found that doing that can be problematic, and Im just rambling... 
The patch LGTM:

Reviewed-by: Alejandro Piñeiro <apinheiro at igalia.com>

On 12/2/19 20:22, Jason Ekstrand wrote:
> Fixes: 19064b8c "nir: Add a pass for gathering transform feedback info"
> Cc: Alejandro Piñeiro <apinheiro at igalia.com>
> ---
>   src/compiler/nir/nir_gather_xfb_info.c | 44 ++++++++++++++++++++++++++
>   1 file changed, 44 insertions(+)
>
> diff --git a/src/compiler/nir/nir_gather_xfb_info.c b/src/compiler/nir/nir_gather_xfb_info.c
> index 96f0ece5e75..fb736dfeb17 100644
> --- a/src/compiler/nir/nir_gather_xfb_info.c
> +++ b/src/compiler/nir/nir_gather_xfb_info.c
> @@ -72,6 +72,50 @@ add_var_xfb_outputs(nir_xfb_info *xfb,
>         assert(var->data.location_frac + comp_slots <= 8);
>         uint8_t comp_mask = ((1 << comp_slots) - 1) << var->data.location_frac;
>   
> +      /* From version 4.60 of the GLSL spec:
> +       *
> +       *    "Variables and block members qualified with xfb_offset can be
> +       *    scalars, vectors, matrices, structures, and (sized) arrays of
> +       *    these. The offset must be a multiple of the size of the first
> +       *    component of the first qualified variable or block member, or a
> +       *    compile-time error results. Further, if applied to an aggregate
> +       *    containing a double, the offset must also be a multiple of 8, and
> +       *    the space taken in the buffer will be a multiple of 8. The given
> +       *    offset applies to the first component of the first member of the
> +       *    qualified entity. Then, within the qualified entity, subsequent
> +       *    components are each assigned, in order, to the next available
> +       *    offset aligned to a multiple of that component's size. Aggregate
> +       *    types are flattened down to the component level to get this
> +       *    sequence of components."
> +       *
> +       * We need to align each element to the component size in order to get
> +       * the correct layout.  We do this at the component level and don't try
> +       * to align entire aggregate types such as structs because of the last
> +       * sentence which says that aggregate types are treated as flattened to
> +       * components.  In other words, if we have
> +       *
> +       *    struct A {
> +       *       int a;
> +       *       double b;
> +       *    };
> +       *
> +       *    struct B {
> +       *       int b
> +       *       A a;
> +       *    };
> +       *
> +       *    layout (...) out B o;
> +       *
> +       * then we treat it as if struct A was embedded struct B and o.a.b has
> +       * an offset of 8.  If we tried to apply the alignment rule to nested
> +       * structs and didn't flatten, o.a would have an offset of 8 because it
> +       * contains a double and o.a.b would then have an offset of 16.
> +       * However, thanks to the above GLSL rule, o.b and o.a.a are tightly
> +       * packed and there is no gap.
> +       */
> +      if (glsl_type_is_64bit(type))
> +         *offset = ALIGN_POT(*offset, 8);
> +
>         assert(attrib_slots <= 2);
>         for (unsigned s = 0; s < attrib_slots; s++) {
>            nir_xfb_output_info *output = &xfb->outputs[xfb->output_count++];