[Mesa-dev] [PATCH] intel/eu: Don't apply chansel when repctrl is set

Sagar Ghuge sagar.ghuge at intel.com
Mon Oct 22 20:33:13 UTC 2018



On 10/22/18 10:14 AM, Matt Turner wrote:
> On Tue, Oct 16, 2018 at 4:57 PM Sagar Ghuge <sagar.ghuge at intel.com> wrote:
>>
> 
> Let's describe a little of why we're doing this and how we found it.
> If I recall correctly, we set a NOP (XYZW) swizzle on 3-src
> instructions, except we set an XXXX swizzle on LRP. Is that correct?
> 

Yes. I will resend patch with full description. 

>> Signed-off-by: Sagar Ghuge <sagar.ghuge at intel.com>
>> ---
>>  src/intel/compiler/brw_eu_emit.c | 36 ++++++++++++++++++++++++--------
>>  1 file changed, 27 insertions(+), 9 deletions(-)
>>
>> diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
>> index 0cbc682ebc..a6b45fcb1a 100644
>> --- a/src/intel/compiler/brw_eu_emit.c
>> +++ b/src/intel/compiler/brw_eu_emit.c
>> @@ -765,31 +765,49 @@ brw_alu3(struct brw_codegen *p, unsigned opcode, struct brw_reg dest,
>>        brw_inst_set_3src_a16_dst_writemask(devinfo, inst, dest.writemask);
>>
>>        assert(src0.file == BRW_GENERAL_REGISTER_FILE);
>> -      brw_inst_set_3src_a16_src0_swizzle(devinfo, inst, src0.swizzle);
>>        brw_inst_set_3src_a16_src0_subreg_nr(devinfo, inst, get_3src_subreg_nr(src0));
>>        brw_inst_set_3src_src0_reg_nr(devinfo, inst, src0.nr);
>>        brw_inst_set_3src_src0_abs(devinfo, inst, src0.abs);
>>        brw_inst_set_3src_src0_negate(devinfo, inst, src0.negate);
>> -      brw_inst_set_3src_a16_src0_rep_ctrl(devinfo, inst,
>> -                                          src0.vstride == BRW_VERTICAL_STRIDE_0);
>> +
>> +      /* RepCtrl field in Source or Destination Operand table in BDW Bspec
>> +       * says:
>> +       *    "ChanSel does not apply when Replicate Control is set."
>> +       */
>> +      if (src0.vstride == BRW_VERTICAL_STRIDE_0)
>> +         brw_inst_set_3src_a16_src0_rep_ctrl(devinfo, inst, true);
>> +      else
>> +         brw_inst_set_3src_a16_src0_swizzle(devinfo, inst, src0.swizzle);
> 
> I see that the brw_vec1_reg() function uses an XXXX swizzle. That
> indeed determines what swizzle scalar sources (i.e., those with
> rep_ctrl set) will have.
> 
> I would rather we always set the swizzle (so as to not complicate the
> code here) and instead figure out why the brw_reg's swizzle field is
> sometimes XYZW fix that.
> 
> I modified brw_disasm.c locally to always print the swizzle on 3-src
> operands and using the piglit tests
> 
> generated_tests/spec/glsl-1.10/execution/built-in-functions/fs-mix-vec2-vec2-vec2.shader_test
> (for LRP)
> tests/spec/arb_gpu_shader5/execution/built-in-functions/fs-fma.shader_test
> (for MAD)
> 
> I cannot confirm that the swizzles are different between MAD and LRP.
> What am I missing?
> 
> lrp(8)          g4<1>F          g2.4<0,1,0>.xF  g2.2<0,1,0>.xF
> g2.0<0,1,0>.xF { align16 1Q };
> mad(8)          g4<1>F          g3.0<0,1,0>.xF  g2.4<0,1,0>.xF
> g2.0<0,1,0>.xF { align16 1Q };
> 
My bad, swizzles are not different (I will investigate more with different test), 
but while assembling we set both the rep_ctrl 
and the swizzle bits without checking region is scalar or not, 

https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/intel/compiler/brw_eu_emit.c#L768

and while disassembling we check that if the region is scalar g1<0,1,0> then we don't 
disassemble the swizzle bits. For src0/src1/src2 in 3src operand,

https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/intel/compiler/brw_disasm.c#L1103

That's where we have different behavior for assembling and disassembling the instruction. 

BDW+ Bspec says RepCtrl : "This field is only present in three-source instructions, 
for each of the three source operands. It controls replication of the starting channel
to all channels in the execution size. ChanSel does not apply when Replicate Control is set."


More information about the mesa-dev mailing list