[Mesa-dev] [PATCH] gm107/ir: combine 32-bit constant loads into 64-bit ones

Rhys Perry pendingchaos02 at gmail.com
Thu Jul 5 11:09:40 UTC 2018


A mov can't express an indirect constant access such as c[r0+5], so those
always have to be loads. The patch should only combine these indirect
loads; direct constant loads stay 32-bit so they can still be movs.
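
To make the intended rule concrete, here is a minimal sketch of the check as
I understand it. LoadInfo, File, kGm107 and mayCombine are hypothetical names
for illustration only, not the nv50_ir types, and the chipset value is an
assumption standing in for NVISA_GM107_CHIPSET:

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration; not the real nv50_ir types.
enum class File { Const, Global, Local };

struct LoadInfo {
    File file;      // memory file the load reads from
    bool indirect;  // true for c[r0+5]-style addressing
};

// Assumed value standing in for NVISA_GM107_CHIPSET.
constexpr uint32_t kGm107 = 0x110;

// Returns true when two adjacent 32-bit loads may be merged into a single
// 64-bit load under the rule described above.
bool mayCombine(uint32_t chipset, const LoadInfo &ld)
{
    // Direct constant loads on GM107+ are left as 32-bit so they can be
    // emitted as movs, which need no barrier.
    if (chipset >= kGm107 && ld.file == File::Const && !ld.indirect)
        return false;
    // Indirect constant loads cannot be movs, so merging them is allowed.
    return true;
}
```

With this sketch, a direct c[] load on GM107 is rejected, while c[r0+5] on
the same chipset is still a candidate for merging.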

On Thu, Jul 5, 2018 at 12:04 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
> Loads require barriers, while movs don't. I believe that's why the blob
> prefers 32-bit loads, and we do too. Why change it?
>
> On Thu, Jul 5, 2018, 06:21 Rhys Perry <pendingchaos02 at gmail.com> wrote:
>>
>> Seems to increase GPR count by a few in some shaders, but also decreases
>> instruction count by a bit.
>>
>> This should only combine them when a mov can not be used (when the load
>> is indirect).
>>
>> total instructions in shared programs : 5804448 -> 5754102 (-0.87%)
>> total gprs used in shared programs    : 670065 -> 672540 (0.37%)
>> total shared used in shared programs  : 548832 -> 548832 (0.00%)
>> total local used in shared programs   : 21068 -> 21068 (0.00%)
>>
>>                 local     shared        gpr       inst      bytes
>>     helped           0           0         194        4124        4124
>>       hurt           0           0        1579          97          97
>>
>> Signed-off-by: Rhys Perry <pendingchaos02 at gmail.com>
>> ---
>>  src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp    | 5 +++++
>>  src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp | 9 ++-------
>>  2 files changed, 7 insertions(+), 7 deletions(-)
>>
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>> index 39177bd044..6785082b5a 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>> @@ -2538,6 +2538,11 @@ MemoryOpt::combineLd(Record *rec, Instruction *ld)
>>     // for compute indirect loads are not guaranteed to be aligned
>>     if (prog->getType() == Program::TYPE_COMPUTE && rec->rel[0])
>>        return false;
>> +   // don't combine non-indirect constant loads since OP_LOAD is an
>> +   // inefficient way of doing them
>> +   if (prog->getTarget()->getChipset() >= NVISA_GM107_CHIPSET &&
>> +       ld->getSrc(0)->reg.file == FILE_MEMORY_CONST && !ld->src(0).isIndirect(0))
>> +      return false;
>>
>>     assert(sizeRc + sizeLd <= 16 && offRc != offLd);
>>
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
>> index 7e059235f4..514e1b3723 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
>> @@ -429,13 +429,8 @@ TargetNVC0::isAccessSupported(DataFile file, DataType ty) const
>>  {
>>     if (ty == TYPE_NONE)
>>        return false;
>> -   if (file == FILE_MEMORY_CONST) {
>> -      if (getChipset() >= NVISA_GM107_CHIPSET)
>> -         return typeSizeof(ty) <= 4;
>> -      else
>> -      if (getChipset() >= NVISA_GK104_CHIPSET) // wrong encoding ?
>> -         return typeSizeof(ty) <= 8;
>> -   }
>> +   if (file == FILE_MEMORY_CONST && getChipset() >= NVISA_GK104_CHIPSET) // wrong encoding ?
>> +      return typeSizeof(ty) <= 8;
>>     if (ty == TYPE_B96)
>>        return false;
>>     return true;
>> --
>> 2.14.4
>>
>

