[Mesa-dev] [PATCH v3 13/24] i965/vec4: split DF instructions and later double its execsize in IVB/BYT

Samuel Iglesias Gonsálvez siglesias at igalia.com
Mon Mar 6 12:03:28 UTC 2017



On 04/03/17 00:32, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <siglesias at igalia.com> writes:
> 
>> We need to split DF instructions in two on IVB/BYT as it needs an
>> execsize 8 to process 4 DF values (one GRF in total).
>>
>> v2:
>> - Rename helper and make it static inline function (Matt).
>> - Fix indention and add braces (Matt).
>>
>> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
>> ---
>>  src/mesa/drivers/dri/i965/brw_ir_vec4.h          | 14 ++++++++++++++
>>  src/mesa/drivers/dri/i965/brw_vec4.cpp           |  7 ++++++-
>>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 15 +++++++++++++--
>>  3 files changed, 33 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>> index 57fc6be8f89..9d29c3fb944 100644
>> --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>> +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>> @@ -405,6 +405,20 @@ regs_read(const vec4_instruction *inst, unsigned i)
>>                         reg_size);
>>  }
>>  
>> +static inline unsigned
>> +get_exec_type_size(const vec4_instruction *inst)
>> +{
>> +   unsigned exec_type_size = 0;
>> +
>> +   for (int i = 0; i < 3; i++) {
>> +      if (inst->src[i].type != BAD_FILE) {
>> +         exec_type_size = MAX2(exec_type_size, type_sz(inst->src[i].type));
>> +      }
>> +   }
>> +
>> +   return exec_type_size;
>> +}
>> +
>>  } /* namespace brw */
>>  
>>  #endif
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> index 5e60eb657a7..7080c93e550 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> @@ -2092,6 +2092,10 @@ get_lowered_simd_width(const struct gen_device_info *devinfo,
>>        if (inst->opcode == BRW_OPCODE_SEL && type_sz(inst->dst.type) == 8)
>>           lowered_width = MIN2(lowered_width, 4);
>>  
> 
> Maybe add a short comment here explaining why you need to do this?
> 

OK

>> +      if (devinfo->gen == 7 && !devinfo->is_haswell &&
>> +          (get_exec_type_size(inst) == 8 || type_sz(inst->dst.type) == 8))
>> +         lowered_width = MIN2(lowered_width, 4);
>> +
>>        /* HSW PRM, 3D Media GPGPU Engine, Region Alignment Rules for Direct
>>         * Register Addressing:
>>         *
>> @@ -2199,7 +2203,8 @@ vec4_visitor::lower_simd_width()
>>                 inst->insert_before(block, copy);
>>              }
>>           } else {
>> -            dst = horiz_offset(inst->dst, channel_offset);
>> +            if (inst->dst.file != ARF)
>> +               dst = horiz_offset(inst->dst, channel_offset);
> 
> This doesn't look right, you need to give the same treatment to ARF
> registers as to other registers.  If what you're trying to avoid here is
> shifting the null register incorrectly, I suggest you fix horiz_offset()
> to return the argument unchanged if it's the null register, because the
> null register logically behaves like a scalar register (this is also
> consistent with the way the FS back-end handles the same situation).
> 

OK, thanks.

>>           }
>>           linst->dst = dst;
>>  
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> index 847a01bd43c..7bb1ab1879c 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> @@ -1522,14 +1522,25 @@ generate_code(struct brw_codegen *p,
>>        brw_set_default_saturate(p, inst->saturate);
>>        brw_set_default_mask_control(p, inst->force_writemask_all);
>>        brw_set_default_acc_write_control(p, inst->writes_accumulator);
>> -      brw_set_default_exec_size(p, cvt(inst->exec_size) - 1);
>>  
>> -      assert(inst->group % inst->exec_size == 0);
>> +      bool is_ivb_df = devinfo->gen == 7 &&
>> +         !devinfo->is_haswell &&
>> +         (get_exec_type_size(inst) == 8 ||
>> +          inst->dst.type == BRW_REGISTER_TYPE_DF);
>> +
>> +      assert(inst->group % inst->exec_size == 0 ||
>> +             is_ivb_df);
>> +
>>        assert(inst->group % 8 == 0 ||
>>               inst->dst.type == BRW_REGISTER_TYPE_DF ||
>>               inst->src[0].type == BRW_REGISTER_TYPE_DF ||
>>               inst->src[1].type == BRW_REGISTER_TYPE_DF ||
>>               inst->src[2].type == BRW_REGISTER_TYPE_DF);
>> +
>> +      if (is_ivb_df && inst->exec_size < 8)
>> +         inst->exec_size *= 2;
>> +      brw_set_default_exec_size(p, cvt(inst->exec_size) - 1);
>> +
> 
> Same comment here as for its FS counterpart...  Please let's not modify
> the IR from the generator.
> 

Right. I am going to fix it.

Sam

>>        if (!inst->force_writemask_all)
>>           brw_set_default_group(p, inst->group);
>>  
>> -- 
>> 2.11.0
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 862 bytes
Desc: OpenPGP digital signature
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170306/39a3e258/attachment-0001.sig>


More information about the mesa-dev mailing list