<div dir="auto">Loads require barriers, while movs don't. I believe that's why the blob prefers 32-bit loads, and we do too. Why change it?</div>
<div class="gmail_quote"><div dir="ltr">On Thu, Jul 5, 2018, 06:21 Rhys Perry <<a href="mailto:pendingchaos02@gmail.com">pendingchaos02@gmail.com</a>> wrote: </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><pre>
Seems to increase GPR count by a few in some shaders, but also decreases
instruction count by a bit.

This should only combine them when a mov can not be used (when the load
is indirect).

total instructions in shared programs : 5804448 -> 5754102 (-0.87%)
total gprs used in shared programs    : 670065 -> 672540 (0.37%)
total shared used in shared programs  : 548832 -> 548832 (0.00%)
total local used in shared programs   : 21068 -> 21068 (0.00%)

                local     shared        gpr       inst      bytes
    helped           0          0        194       4124       4124
      hurt           0          0       1579         97         97

Signed-off-by: Rhys Perry <<a href="mailto:pendingchaos02@gmail.com" target="_blank" rel="noreferrer">pendingchaos02@gmail.com</a>>
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp    | 5 +++++
 src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp | 9 ++-------
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 39177bd044..6785082b5a 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -2538,6 +2538,11 @@ MemoryOpt::combineLd(Record *rec, Instruction *ld)
    // for compute indirect loads are not guaranteed to be aligned
    if (prog->getType() == Program::TYPE_COMPUTE && rec->rel[0])
       return false;
+   // don't combine non-indirect constant loads since OP_LOAD is a
+   // inefficient way of doing them
+   if (prog->getTarget()->getChipset() >= NVISA_GM107_CHIPSET &&
+       ld->getSrc(0)->reg.file == FILE_MEMORY_CONST && !ld->src(0).isIndirect(0))
+      return false;
    assert(sizeRc + sizeLd <= 16 && offRc != offLd);
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
index 7e059235f4..514e1b3723 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
@@ -429,13 +429,8 @@ TargetNVC0::isAccessSupported(DataFile file, DataType ty) const
 {
    if (ty == TYPE_NONE)
       return false;
-   if (file == FILE_MEMORY_CONST) {
-      if (getChipset() >= NVISA_GM107_CHIPSET)
-         return typeSizeof(ty) <= 4;
-      else
-      if (getChipset() >= NVISA_GK104_CHIPSET) // wrong encoding ?
-         return typeSizeof(ty) <= 8;
-   }
+   if (file == FILE_MEMORY_CONST && getChipset() >= NVISA_GK104_CHIPSET) // wrong encoding ?
+      return typeSizeof(ty) <= 8;
    if (ty == TYPE_B96)
       return false;
    return true;
--
2.14.4
</pre></blockquote></div>
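<div dir="auto">For reference, the two checks in the quoted patch can be restated as a small standalone sketch. This is not the driver code: <code>Load</code>, <code>skipCombine</code> and <code>constAccessSupported</code> are hypothetical names made up for illustration, the chipset constants are placeholders whose only meaningful property is their ordering, and type sizes are modelled as plain byte counts (12 standing in for TYPE_B96).</div>

```cpp
#include <cassert>

// Hypothetical stand-ins for the nv50_ir types; illustration only.
enum class DataFile { MemoryConst, MemoryGlobal };

// Placeholder chipset IDs (assumption): only the ordering GK104 < GM107
// matters for this sketch.
constexpr unsigned kGK104 = 0xe0;
constexpr unsigned kGM107 = 0x110;

struct Load {
   DataFile file;  // memory file the load reads from
   bool indirect;  // true if the address uses an indirect (register) offset
};

// Restates the early-out the patch adds to combineLd(): on GM107 and newer,
// direct constant loads are left uncombined.
constexpr bool skipCombine(unsigned chipset, const Load &ld)
{
   return chipset >= kGM107 && ld.file == DataFile::MemoryConst && !ld.indirect;
}

// Restates the patched isAccessSupported() for constant memory: on GK104 and
// newer the limit is 8 bytes; older chips fall through to the generic rule,
// which only rejects the 96-bit type.
constexpr bool constAccessSupported(unsigned chipset, unsigned sizeBytes)
{
   return chipset >= kGK104 ? sizeBytes <= 8 : sizeBytes != 12;
}

// Small self-check of the cases discussed in the thread.
inline void checkExamples()
{
   // Direct constant load on GM107+: not combined.
   assert(skipCombine(kGM107, Load{DataFile::MemoryConst, false}));
   // Indirect constant load on GM107+: still eligible for combining.
   assert(!skipCombine(kGM107, Load{DataFile::MemoryConst, true}));
   // After the patch, a 64-bit constant access is accepted on GM107+.
   assert(constAccessSupported(kGM107, 8));
}
```

<div dir="auto">The sketch only captures which loads become eligible for combining. It says nothing about the barrier cost of loads versus movs raised in the reply above.</div>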