[Mesa-dev] [PATCH 39/78] i965/nir/vec4: Add swizzle utility method for vector ops

Jason Ekstrand jason at jlekstrand.net
Sun Jul 5 19:05:13 PDT 2015

On Fri, Jul 3, 2015 at 10:23 AM, Antía Puentes <apuentes at igalia.com> wrote:
> Hi Jason,
> On mar, 2015-06-30 at 14:18 -0700, Jason Ekstrand wrote:
>> On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev <elima at igalia.com> wrote:
>> > From: Antia Puentes <apuentes at igalia.com>
>> >
>> > For operations that have a predefined operand size > 0, defined in
>> > glsl/nir/nir_opcodes.c, NIR returns a swizzle containing zeros in the
>> > components from outside the source vector. However, the driver
>> > expects those components to have a swizzle value equal to the swizzle
>> > of the component in (number of vector elements - 1). For example,
>> >
>> >    for a vec2 operation with an identity swizzle (.xy):
>> >       - (NIR -> XYXX, driver ->XYYYY)
>> >
>> >    for a vec3 operation with swizzle (.zxy)
>> >       - (NIR-> ZXYX, driver -> ZXYY)
>> Why is this needed.  Was there some bug you ran into regarding this or
>> are you just trying to make the generated code match?  If the
>> component isn't used, then it seems to me like the swizzle shouldn't
>> matter.  NIR may give you bogus swizzles outside of the enabled
>> channels but it may do that regardless.  I'm also not seeing how
>> composing swizzles fixes anything.
> There were several bugs in our NIR->vec4 code path related to NIR
> returning bogus swizzles outside the components of a vector. When doing
> an "any", "equal" or "notEqual" operation on vectors, the result of the
> instruction "cmp" is predicated using "all4h" or "any4h" depending on
> the operation. The results were incorrect for vectors with less than 4
> elements because: (1) I was not setting the writemask of the "cmp"
> operation according to the vector size and (2) NIR returns bogus
> swizzles for components outside the vector. As a consequence, we had
> spurious component-wise comparisons that affected the result of the
> predication.
> As composing the operand swizzle with the identity swizzle-per-size
> means to repeat the value of the swizzle of the last vector-element on
> the components outside the vector, the error mentioned above does not
> happen because to repeat the last (inside-boundaries) component-wise
> comparison for the components outside the vector does not affect the
> result of the predication.
> Other way to fix the bugs above is to set the writemask of the "cmp"
> operation to the appropriate mask according to the vector size, thus
> forcing the "cmp" instruction to compare only the components inside the
> vector boundaries. As the vec4_visitor does not set the writemask of the
> "cmp" instruction because it relies on the operand having the correct
> swizzle outside the vector boundaries, for consistency with the existing
> behaviour, I decided to do the same and compose the swizzle (notice that
> the composing is done in the vec4_visitor when visiting an ir_swizzle,
> while in NIR we only set the swizzle for the elements inside the
> vector). I can remove this function if you prefer setting the writemask
> over composing the swizzles.

I could go either way.  I kind of like the idea of setting the
writemask rather than the swizzle, but I'm fine with either, now that
you've explained it.

If we go with fixing the swizzle, I think I'd like to see a few
changes.  First, I think I'd rather see this as part of the loop at
the top of nir_emit_alu that sets up the sources.  Second, your
implementation always uses src[0] as representative of all fixed-sized
sources; this works now but is not guaranteed.  Third, I think it
needs a better comment. The current comment shows differences between
what NIR gives you in some cases and what the old visitor gives but,
as far as I can tell, doesn't actually demonstrate something that
would be a bug without it.  The explanation you gave above together
with a better example would probably be sufficient.

More information about the mesa-dev mailing list