[Mesa-dev] [PATCH] intel/compiler: Properly consider UBO loads that cross 32B boundaries.

Jason Ekstrand jason at jlekstrand.net
Wed Jun 13 16:24:43 UTC 2018


I just reverted this in master because it regressed about 30K Vulkan CTS
tests.  More investigation needed?

On Wed, Jun 13, 2018 at 2:07 AM, Kenneth Graunke <kenneth at whitecape.org>
wrote:

> On Tuesday, June 12, 2018 1:38:03 PM PDT Rafael Antognolli wrote:
> > On Mon, Jun 11, 2018 at 02:01:49PM -0700, Kenneth Graunke wrote:
> > > The UBO push analysis pass incorrectly assumed that all values would
> fit
> > > within a 32B chunk, and only recorded a bit for the 32B chunk
> containing
> > > the starting offset.
> > >
> > > For example, if a UBO contained the following, tightly packed:
> > >
> > >    vec4 a;  // [0, 16)
> > >    float b; // [16, 20)
> > >    vec4 c;  // [20, 36)
> > >
> > > then, c would start at offset 20 / 32 = 0 and end at 36 / 32 = 1,
> > > which means that we ought to record two 32B chunks in the bitfield.
> > >
> > > Similarly, dvec4s would suffer from the same problem.
> > > ---
> > >  src/intel/compiler/brw_nir_analyze_ubo_ranges.c | 8 +++++++-
> > >  1 file changed, 7 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
> b/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
> > > index d58fe3dd2e3..6d6ccf73ade 100644
> > > --- a/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
> > > +++ b/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
> > > @@ -141,10 +141,16 @@ analyze_ubos_block(struct ubo_analysis_state
> *state, nir_block *block)
> > >           if (offset >= 64)
> > >              continue;
> > >
> > > +         /* The value might span multiple 32-byte chunks. */
> > > +         const int bytes = nir_intrinsic_dest_components(intrin) *
> > > +                           (nir_dest_bit_size(intrin->dest) / 8);
> > > +         const int end = DIV_ROUND_UP(offset_const->u32[0] + bytes,
> 32);
> > > +         const int regs = end - offset + 1;
> > > +
> >
> > But if I understood it correctly, offset is the first 32B chunk within
> > the UBO block (it's actually an ubo "chunk offset"). And you calculate
> > bytes by taking the number of components times the size of each
> > component of the nir_intrinsic_load_ubo instruction (which apparently
> > supports multiple components). So yeah, this makes sense to me.
>
> Yeah, that's exactly right.  load_ubo can load up to 4 components.
>
> > Take this review with a grain of salt (assuming what I wrote above is
> > correct), but this looks simple enough. So it is
> >
> > Reviewed-by: Rafael Antognolli <rafael.antognolli at intel.com>
>
> Thanks!
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20180613/3cf8817d/attachment-0001.html>


More information about the mesa-dev mailing list