[Mesa-dev] [PATCH] intel/compiler: Properly consider UBO loads that cross 32B boundaries.

Tue Jun 12 20:38:03 UTC 2018

On Mon, Jun 11, 2018 at 02:01:49PM -0700, Kenneth Graunke wrote:
> The UBO push analysis pass incorrectly assumed that all values would fit
> within a 32B chunk, and only recorded a bit for the 32B chunk containing
> the starting offset.
> 
> For example, if a UBO contained the following, tightly packed:
> 
>    vec4 a;  // [0, 16)
>    float b; // [16, 20)
>    vec4 c;  // [20, 36)
> 
> then, c would start at offset 20 / 32 = 0 and end at 36 / 32 = 1,
> which means that we ought to record two 32B chunks in the bitfield.
> 
> Similarly, dvec4s would suffer from the same problem.
> ---
>  src/intel/compiler/brw_nir_analyze_ubo_ranges.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/src/intel/compiler/brw_nir_analyze_ubo_ranges.c b/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
> index d58fe3dd2e3..6d6ccf73ade 100644
> --- a/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
> +++ b/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
> @@ -141,10 +141,16 @@ analyze_ubos_block(struct ubo_analysis_state *state, nir_block *block)
>           if (offset >= 64)
>              continue;
>  
> +         /* The value might span multiple 32-byte chunks. */
> +         const int bytes = nir_intrinsic_dest_components(intrin) *
> +                           (nir_dest_bit_size(intrin->dest) / 8);
> +         const int end = DIV_ROUND_UP(offset_const->u32[0] + bytes, 32);
> +         const int regs = end - offset + 1;
> +

But if I understood it correctly, offset is the first 32B chunk within
the UBO block (it's actually an ubo "chunk offset"). And you calculate
bytes by taking the number of components times the size of each
component of the nir_intrinsic_load_ubo instruction (which apparently
supports multiple components). So yeah, this makes sense to me.

Take this review with a grain of salt (assuming what I wrote above is
correct), but this looks simple enough. So it is

Reviewed-by: Rafael Antognolli <rafael.antognolli at intel.com>

>           /* TODO: should we count uses in loops as higher benefit? */
>  
>           struct ubo_block_info *info = get_block_info(state, block);
> -         info->offsets |= 1ull << offset;
> +         info->offsets |= ((1ull << regs) - 1) << offset;
>           info->uses[offset]++;
>        }
>     }
> -- 
> 2.17.0
> 
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev