[Mesa-dev] [PATCH 1/2] i965/vec4: set swizzle when loading an uniform

Samuel Iglesias Gonsálvez siglesias at igalia.com
Tue Apr 25 08:41:48 UTC 2017


On Mon, 2017-04-24 at 11:22 -0700, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <siglesias at igalia.com> writes:
> 
> > On Fri, 2017-04-21 at 10:23 -0700, Francisco Jerez wrote:
> > > Samuel Iglesias Gonsálvez <siglesias at igalia.com> writes:
> > > 
> > > > On Thu, 2017-04-20 at 10:26 -0700, Francisco Jerez wrote:
> > > > > Samuel Iglesias Gonsálvez <siglesias at igalia.com> writes:
> > > > > 
> > > > > > It was setting XYWZ swizzle to all uniforms, no matter if
> > > > > > they
> > > > > > were
> > > > > > a vector or not.
> > > > > > 
> > > > > > Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.
> > > > > > com>
> > > > > > Cc: currojerez at riseup.net
> > > > > 
> > > > > Don't you need to CC mesa-stable here and in the next patch?
> > > > > 
> > > > 
> > > > I considered it but I has doubts about which tag use "17.1.0-
> > > > rc1"
> > > > or
> > > > just "17.1.0" or whatever. So my plan is to notify Emil once
> > > > they
> > > > are
> > > > merged (and add Cc to stable in the commit log before pushing
> > > > it to
> > > > master).
> > > > 
> > > > If you are more comfortable with Cc mesa-stable, I will do it
> > > > next
> > > > time
> > > > (or if I need to send v2 of this series).
> > > > 
> > > 
> > > I believe Emil will notice them even if you don't put a version
> > > tag.  I
> > > don't really care how you nominate it for stable as long as it
> > > hits
> > > the
> > > 17.1 branch before the end of the release cycle.  ;)
> > > 
> > > > > > ---
> > > > > >  src/intel/compiler/brw_vec4_nir.cpp | 1 +
> > > > > >  1 file changed, 1 insertion(+)
> > > > > > 
> > > > > > diff --git a/src/intel/compiler/brw_vec4_nir.cpp
> > > > > > b/src/intel/compiler/brw_vec4_nir.cpp
> > > > > > index a82d52088a8..5f4488c7e86 100644
> > > > > > --- a/src/intel/compiler/brw_vec4_nir.cpp
> > > > > > +++ b/src/intel/compiler/brw_vec4_nir.cpp
> > > > > > @@ -863,6 +863,7 @@
> > > > > > vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr
> > > > > > *instr)
> > > > > >           unsigned offset = const_offset->u32[0] + shift *
> > > > > > 4;
> > > > > >           src.offset = ROUND_DOWN_TO(offset, 16);
> > > > > >           shift = (offset % 16) / 4;
> > > > > > +         src.swizzle = brw_swizzle_for_size(instr-
> > > > > > > num_components);
> > > > > 
> > > > > What about the indirect case a few lines below?  Isn't the
> > > > > swizzle
> > > > > passed
> > > > > to the mov indirect instruction still bogus?
> > > > > 
> > > > 
> > > > This is different. It is expecting to have a swizzle of XYZW
> > > > because
> > > > MOV_INDIRECT will copy all the contents. See assert in
> > > > move_uniform_array_access_to_pull_constants()
> > > 
> > > I believe the ultimate problem here is that the MOV_INDIRECT gets
> > > a
> > > writemask of XYZW even if you're reading a scalar, so setting the
> > > minimal swizzle will lead to a situation in which the resulting
> > > swizzle
> > > is not the identity so you will run into trouble to turn it into
> > > scratch
> > > access.  That brings me to the following question which I don't
> > > think
> > > I
> > > can answer by looking at this patch alone: If having an XYZW
> > > swizzle
> > > is
> > > a problem for direct moves (I assume this patch is fixing
> > > something?),
> > 
> > The bug is to properly identify DFs and dvecs, instead of
> > considering
> > all of them as dvec4 when aligning them for the push constant
> > buffer,
> > which is done in next patch.
> > 
> 
> Sorry, I'm not following what you mean with the last paragraph.  What
> does this have to do with identifying DFs?
> 

By the register type, I know it is a DF, but I need then to know how
many components are used in order to know if they are dvec3 or dvec4,
so I push them first (so they are aligned to dvec4), or just dvec2 or a
double scalar, so I push them later. This is whay I meant when I said
identifying DF, sorry for the misunderstanding.

> > > how come it is not a problem for indirect moves?
> > > 
> > 
> > It is not a problem because the uniforms under indirect moves are
> > moved
> > to pull constant buffer in
> > move_uniform_array_access_to_pull_constants().
> > 
> 
> Is that really always the case?

For GL, yes.

>   Even on Vulkan?
> 

Good question! Looking at the code, Vulkan driver doesn't support pull
constants other than UBOs so everything has to be pushed regardless.

In the push constant buffer code, it marks every register touched by
MOV INDIRECT as fully used, ignoring its swizzle because it is just
interested on the number of read vec4 slots and consider we fully read
them, which is also mentioned in the quote I pasted in a previous
email:

/* We just mark every register touched by a MOV_INDIRECT as being
 * fully used.  This ensures that it doesn't broken up piecewise by
 * the next part of our packing algorithm.
 */

When the swizzle is used then? The swizzle is taking into account in
the generator to add it as an offset when the indirect offset is not an
immediate, or add it to the register swizzle if the indirect offset is
an immediate. As we considered the uniform arrays as fully accessed
(swizzle and writemasks are XYZW), we use the number of read registers
and they are aligned to vec4 (see below), I think there is no problem
with them and we don't need to update the swizzles.

> > Those DF pull constants are read with 2 messages, so they don't
> > need to
> > be dvec4-aligned like in the case of direct moves on the push
> > constant
> > buffer, and the swizzle is actually ignored when loading it to pull
> > constant buffer.
> > 
> 
> Is the swizzle actually meant to be ignored or is that a bug of the
> move_uniform_array_access_to_pull_constants() pass?  If it is meant
> to
> be ignored why do we set a non-identity swizzle below for shift != 0
> in
> the indirect path a couple of lines below?
> 

'shift' indicates if the uniform is aligned to vec4 (shift == 0) and,
if not, which is the offset in components of 32-bits, so we can add it
to the swizzle. However, uniforms follow std140 rule (as they are part
of the default uniform block), which says the following for arrays.
From OpenGL 4.5 spec:

"4. If the member is an array of scalars or vectors, the base alignment
and array stride are set to match the base alignment of a single array
element, according to rules (1), (2), and (3), and rounded up to the
base alignment of a vec4."

So an uniform array is aligned to vec4, then shift == 0 and the swizzle
is not changed by 'shift' in this case. I.e. shift doesn't change
anything.

I guess 'shift' is used, together with the assert in
move_uniform_array_access_to_pull_constants(), to detect uniform arrays
not aligned to vec4 size. This check can be replaced with assert(shift
== 0) here.

> > However, if you prefer to keep consistency between both cases, I
> > can
> > apply the shuffle globally and remove the NOOP assert in
> > move_uniform_array_access_to_pull_constants(). This change doesn't
> > add
> > any regression on piglit.
> > 
> > What do you think?
> > 
> 
> I'm not really that worried about consistency, just trying to
> understand
> what exactly is going on in this patch in order to convince myself
> that
> this is a complete fix.
> 

The idea with this fix is to help next patch with does the alignment
for dvec4s. Without this fix, I see an XYZW swizzle in all uniforms, so
I don't know who is a genuinely dvec4 and who is not. On top of that,
without this patch, pack_uniform_registers() would be wrong for DF
scalar and dvec2 as it would mark 2 vec4 as used, when it's actually
one, because it sees swizzle and writemask of XYWZ for both.

While writing this, I found an error in this patch. I focused on
setting the swizzle but actually we need to set both the swizzle and
the writemask so we don't iterate more than needed when setting the
used channels in pack_uniform_registers(). However I don't think we
need to change anything for the indirect case, as in MOV INDIRECT the
writemask is explicitly marked as XYZW because we write the whole
thing... then we don't need to change anything (nor swizzle nor
writemask).

What do you think?

I am going to do more tests and try some ideas I have now. I will
contact you afterwards. I think we should discuss this privately (email
or VoIP/IRC) to avoid adding more noise to the mailing list.

Sam

> > > > and the comment in pack_uniform_registers():
> > > > 
> > > > /* We just mark every register touched by a MOV_INDIRECT as
> > > > being
> > > >  * fully used.  This ensures that it doesn't broken up
> > > > piecewise by
> > > >  * the next part of our packing algorithm.
> > > >  */
> > > > 
> > > 
> > > Not sure why this is relevant to the discussion?  The same will
> > > be
> > > the
> > > case if you set the same swizzle as for the direct move.
> > > 
> > 
> > Right, I was wrong. Sorry for the noise,
> > 
> > Sam
> > 
> > > > Sam
> > > > 
> > > > > >           src.swizzle += BRW_SWIZZLE4(shift, shift, shift,
> > > > > > shift);
> > > > > >  
> > > > > >           emit(MOV(dest, src));
> > > > > > -- 
> > > > > > 2.11.0
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170425/29c56493/attachment.sig>


More information about the mesa-dev mailing list