[Mesa-dev] [PATCH 15/19] i965/fs: Support register coalescing on LOAD_PAYLOAD operands.

Matt Turner mattst88 at gmail.com
Tue Jun 10 16:27:59 PDT 2014


On Sun, Jun 1, 2014 at 12:01 AM, Kenneth Graunke <kenneth at whitecape.org> wrote:
> On Friday, April 18, 2014 11:56:51 AM Matt Turner wrote:
>> ---
>>  .../drivers/dri/i965/brw_fs_register_coalesce.cpp  | 59
> ++++++++++++++++++----
>>  1 file changed, 49 insertions(+), 10 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
> b/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
>> index cb4ab57..a363c4c 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
>> @@ -46,7 +46,14 @@
>>  static bool
>>  is_nop_mov(const fs_inst *inst)
>>  {
>> -   if (inst->opcode == BRW_OPCODE_MOV) {
>> +   if (inst->opcode == SHADER_OPCODE_LOAD_PAYLOAD) {
>> +      for (int i = 0; i < inst->sources; i++) {
>> +         if (!inst->dst.equals(inst->src[i])) {
>
> I don't think this is what you want.  You're not altering dst.reg_offset,
> here, so dst.reg_offset will presumably always be 0.
>
> For dst.equals(inst->src[i]) to be true for every source, then
>
>    dst.reg == src[i].reg for all sources.
>    dst.reg_offset == src[i].reg_offset for all sources.
>
> which means src[i].reg_offset == 0 for all sources.

Right, will fix.

> In other words, I think this detects LOAD_PAYLOAD operations that splats the
> first component of a size > 1 VGRF dst to all components.
>
> That's definitely not a nop mov.
>
> I think if you add dst.reg_offset++ on each iteration of the loop, you'll be
> okay here.
>
>> +            return false;
>> +         }
>> +      }
>> +      return true;
>> +   } else if (inst->opcode == BRW_OPCODE_MOV) {
>>        return inst->dst.equals(inst->src[0]);
>>     }
>>
>> @@ -54,9 +61,26 @@ is_nop_mov(const fs_inst *inst)
>>  }
>>
>>  static bool
>> +is_copy_payload(const fs_inst *inst)
>> +{
>> +   const int reg = inst->src[0].reg;
>> +   if (inst->src[0].reg_offset != 0)
>> +      return false;
>
> This looks okay, for both src[0] == reg_undef and where it's an actual value.
>
>> +
>> +   for (int i = 1; i < inst->sources; i++) {
>> +      if (inst->src[i].reg != reg ||
>> +          inst->src[i].reg_offset != i) {
>> +         return false;
>> +      }
>> +   }
>> +   return true;
>> +}
>> +
>> +static bool
>>  is_coalesce_candidate(const fs_inst *inst, const int *virtual_grf_sizes)
>>  {
>> -   if (inst->opcode != BRW_OPCODE_MOV ||
>> +   if ((inst->opcode != BRW_OPCODE_MOV &&
>> +        inst->opcode != SHADER_OPCODE_LOAD_PAYLOAD) ||
>>         inst->is_partial_write() ||
>>         inst->saturate ||
>>         inst->src[0].file != GRF ||
>> @@ -72,6 +96,12 @@ is_coalesce_candidate(const fs_inst *inst, const int
> *virtual_grf_sizes)
>>         virtual_grf_sizes[inst->dst.reg])
>>        return false;
>>
>> +   if (inst->opcode == SHADER_OPCODE_LOAD_PAYLOAD) {
>> +      if (!is_copy_payload(inst)) {
>> +         return false;
>> +      }
>> +   }
>> +
>>     return true;
>>  }
>>
>> @@ -144,10 +174,18 @@ fs_visitor::register_coalesce()
>>        if (reg_to != inst->dst.reg)
>>           continue;
>>
>> -      const int offset = inst->src[0].reg_offset;
>> -      reg_to_offset[offset] = inst->dst.reg_offset;
>> -      mov[offset] = inst;
>> -      channels_remaining--;
>> +      if (inst->opcode == SHADER_OPCODE_LOAD_PAYLOAD) {
>> +         for (int i = 0; i < src_size; i++) {
>
> Isn't src_size always 1 for LOAD_PAYLOAD?  Earlier:
>
>          src_size = virtual_grf_sizes[inst->src[0].reg];

No, since inst->src[0].reg is just the number of the virtual register
(doesn't include the offset) and virtual_grf_sizes[] contains virtual
register sizes.

> Presumably you want inst->sources here...

I think you touched on something. is_copy_payload would say "true" to
a load_payload instruction that copied the first three components of a
size 4 vgrf. But here we actually do want the number of sources. Since
returning true for this doesn't seem useful, I'll add a check to
is_copy_payload to return false if the number of sources doesn't match
the size of the vgrf.


More information about the mesa-dev mailing list