[Mesa-dev] [PATCH] intel/compiler: Properly consider UBO loads that cross 32B boundaries.
Rafael Antognolli
rafael.antognolli at intel.com
Tue Jun 12 20:43:04 UTC 2018
On Tue, Jun 12, 2018 at 01:38:03PM -0700, Rafael Antognolli wrote:
> On Mon, Jun 11, 2018 at 02:01:49PM -0700, Kenneth Graunke wrote:
> > The UBO push analysis pass incorrectly assumed that all values would fit
> > within a 32B chunk, and only recorded a bit for the 32B chunk containing
> > the starting offset.
> >
> > For example, if a UBO contained the following, tightly packed:
> >
> > vec4 a; // [0, 16)
> > float b; // [16, 20)
> > vec4 c; // [20, 36)
> >
> > then, c would start at offset 20 / 32 = 0 and end at 36 / 32 = 1,
> > which means that we ought to record two 32B chunks in the bitfield.
> >
> > Similarly, dvec4s would suffer from the same problem.
> > ---
> > src/intel/compiler/brw_nir_analyze_ubo_ranges.c | 8 +++++++-
> > 1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/intel/compiler/brw_nir_analyze_ubo_ranges.c b/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
> > index d58fe3dd2e3..6d6ccf73ade 100644
> > --- a/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
> > +++ b/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
> > @@ -141,10 +141,16 @@ analyze_ubos_block(struct ubo_analysis_state *state, nir_block *block)
> > if (offset >= 64)
> > continue;
> >
> > + /* The value might span multiple 32-byte chunks. */
> > + const int bytes = nir_intrinsic_dest_components(intrin) *
> > + (nir_dest_bit_size(intrin->dest) / 8);
> > + const int end = DIV_ROUND_UP(offset_const->u32[0] + bytes, 32);
> > + const int regs = end - offset + 1;
> > +
>
> But if I understood it correctly, offset is the first 32B chunk within
s/But //
> the UBO block (it's actually an ubo "chunk offset"). And you calculate
> bytes by taking the number of components times the size of each
> component of the nir_intrinsic_load_ubo instruction (which apparently
> supports multiple components). So yeah, this makes sense to me.
>
> Take this review with a grain of salt (assuming what I wrote above is
> correct), but this looks simple enough. So it is
>
> Reviewed-by: Rafael Antognolli <rafael.antognolli at intel.com>
>
> > /* TODO: should we count uses in loops as higher benefit? */
> >
> > struct ubo_block_info *info = get_block_info(state, block);
> > - info->offsets |= 1ull << offset;
> > + info->offsets |= ((1ull << regs) - 1) << offset;
> > info->uses[offset]++;
> > }
> > }
> > --
> > 2.17.0
> >
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
More information about the mesa-dev
mailing list