[Mesa-dev] [PATCH 3/5] i965/vec4: Use swizzle() to swizzle immediates during constant propagation.

Francisco Jerez currojerez at riseup.net
Tue Mar 1 00:27:03 UTC 2016


Matt Turner <mattst88 at gmail.com> writes:

> On Mon, Feb 29, 2016 at 2:20 PM, Francisco Jerez <currojerez at riseup.net> wrote:
>> Matt Turner <mattst88 at gmail.com> writes:
>>
>>> On Mon, Feb 29, 2016 at 7:30 AM, Iago Toral <itoral at igalia.com> wrote:
>>>> On Fri, 2016-02-26 at 22:02 -0800, Francisco Jerez wrote:
>>>>> swizzle() works for vector immediates other than VF.
>>>>> ---
>>>>>  .../drivers/dri/i965/brw_vec4_copy_propagation.cpp    | 19 +------------------
>>>>>  1 file changed, 1 insertion(+), 18 deletions(-)
>>>>>
>>>>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
>>>>> index 6bd9928..5c25164 100644
>>>>> --- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
>>>>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
>>>>> @@ -76,22 +76,6 @@ is_channel_updated(vec4_instruction *inst, src_reg *values[4], int ch)
>>>>>             inst->dst.writemask & (1 << BRW_GET_SWZ(src->swizzle, ch)));
>>>>>  }
>>>>>
>>>>> -static unsigned
>>>>> -swizzle_vf_imm(unsigned vf4, unsigned swizzle)
>>>>> -{
>>>>> -   union {
>>>>> -      unsigned vf4;
>>>>> -      uint8_t vf[4];
>>>>> -   } v = { vf4 }, ret;
>>>>> -
>>>>> -   ret.vf[0] = v.vf[BRW_GET_SWZ(swizzle, 0)];
>>>>> -   ret.vf[1] = v.vf[BRW_GET_SWZ(swizzle, 1)];
>>>>> -   ret.vf[2] = v.vf[BRW_GET_SWZ(swizzle, 2)];
>>>>> -   ret.vf[3] = v.vf[BRW_GET_SWZ(swizzle, 3)];
>>>>> -
>>>>> -   return ret.vf4;
>>>>> -}
>>>>> -
>>>>>  static bool
>>>>>  is_logic_op(enum opcode opcode)
>>>>>  {
>>>>> @@ -144,8 +128,7 @@ try_constant_propagate(const struct brw_device_info *devinfo,
>>>>>        }
>>>>>     }
>>>>>
>>>>> -   if (value.type == BRW_REGISTER_TYPE_VF)
>>>>> -      value.ud = swizzle_vf_imm(value.ud, inst->src[arg].swizzle);
>>>>> +   value = swizzle(value, inst->src[arg].swizzle);
>>>>
>>>> so I guess V/UV was broken before this?
>>>
>>> My question is in the same vein -- what does swizzling V/UV even mean?
>>> Swizzles operate on 4-component things, but V/UV have 8 components.
>>>
>> swizzle() operates on 8-component things 99% of the time.  Its semantics
>> are determined by the equivalence of:
>
> Swizzle operates on vec4s. The fact that SIMD8 instructions operate on
> 2x vec4s doesn't matter as far as I can tell.
>
> This seems analogous to SSE instructions that operate on the top and
> bottom halves of a 128-bit register independently. Yes, the
> instruction operates on 128-bits, but the operation treats the halves
> completely independently.
>
> But all of this is beside the point. I really can't believe we have
> these sorts of disagreements.
>
The defining property of swizzle() that I pointed out in my last e-mail
doesn't impose any restriction on how many components the argument has
(and it's in practice hardly ever a vec4).  This patch simply makes the
implementation of swizzle() match its defining property for all inputs.

Where's the disagreement?  Do you disagree with its defining property?
With the hardware's extension of swizzles to vecN?  Or do you disagree
with this specific implementation of the definition?

>>
>>  OP ..., swizzle(x, swz)
>>
>> to:
>>
>>  MOV tmp, x
>>  OP ..., tmp.swz
>>
>> where the MOV has the same execution type and width as the arbitrary
>> instruction OP.  With this in mind there are no restrictions on what x
>> can be, as long as it can be read from, you just do what the hardware
>> does.
>>
>>> Concretely, for an immediate of 0x1234567UV, what does applying a
>>> swizzle of .yxzw do? Does it swap two dwords (each containing two
>>> values) and produce 0x34125678UV? Or does it operate on both halves
>>> individually and produce 0x21346578UV?
>>>
>> At the hardware level a swizzle specifies a permutation of 4 which is
>> cyclically extended to whatever the vector width is.  So assuming your
>> immediate is 0x01234567UV the correct answer is 0x01324576UV.
>
> Sorry, right. I meant 0x01234567UV.
>
>>
>>> Moreover, using one of these immediate types requires a W destination,
>>
>> I don't think that's right.
>
> The IVB PRM (3.3.4 Immediate) says
>
>
> Restriction: When an immediate vector is used in an instruction, the
> destination must be 128-bit aligned with destination horizontal stride
> equivalent to a word for an immediate integer vector (v) and
> equivalent to a dword for an immediate float vector (vf).
>
> So you could use a byte destination with a stride of 2. But my point
> is that you can't use them with a destination type of D/UD.
>
Yeah, I had seen a similar comment in the BSpec, but I suspect the text
is outdated.  There are restrictions on where you can use integer vector
immediates but I believe they relate mainly to the execution data type
of the instruction, which the destination datatype and alignment doesn't
necessarily have an influence on.  Anyway it looks like there are cases
where one could legitimately use V in Align16 mode, let's discuss more
about the restrictions of vector immediates in the office.

>>
>>> which isn't possible in align16 instructions -- so swizzling V/UV
>>> seems pointless.
>>
>> We use V *and* a W destination in Align16 mode in
>> gen6_gs_visitor.cpp:617.  Admittedly the instruction looks rather weird
>> and I considered replacing it with something else, but there's no harm
>
> Oh, weird. I definitely was not aware of that.
>
>> in swizzle() handling its whole domain consistently, and I didn't feel
>> comfortable about leaving this one case unimplemented and letting the
>> program crash whenever a V/UV immediate comes up in some optimization
>> pass that happens to use swizzle(), which may happen as long as
>> something in the vec4 back-end continues using V/UV.
>
> That's fair given that there is an actual use in the vec4 backend.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160229/8ef43382/attachment.sig>


More information about the mesa-dev mailing list