<p dir="ltr">There's a less hacky and more hacky way forward. The more hacky solution is to set file index to -1 or something and then not do the lowering when you see that.</p>
<p dir="ltr">The less hacky solution is the one you proposed as #1 - introduce a new file for "buffer" memory and lower it to the global file by adding a base offset.</p>
<p dir="ltr">Right now the meaning of global is overloaded - before lowering it implicitly includes the buffer vase address, and after lowering, it explicitly includes it. Splitting it out I to another file type seems like the cleaner way forward, not sure what issue you were seeing with that approach. (I didn't understand your argument about potential future issues.) What I really don't want is to somehow differentiate glsl-sourced and opencl-sourced compute programs in the backend.</p>
<div class="gmail_quote">On Mar 14, 2016 6:22 AM, "Hans de Goede" <<a href="mailto:hdegoede@redhat.com">hdegoede@redhat.com</a>> wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">This little "hack" fixes the use of OpenCL global memory buffers with<br>
nouveau, but clearly the #if 0 is not a solution as it breaks buffers<br>
with GLSL.<br>
<br>
The reason I'm posting this as an RFC patch is to discuss how to solve<br>
this properly, 2 solutions come to mind:<br>
<br>
1) Use separate nv50_ir::FILE_MEMORY_xxx values for buffers versus<br>
TGSI_FILE_MEMORY with TGSI_MEMORY_TYPE_GLOBAL, looking at translateFile()<br>
we currently have:<br>
<br>
case TGSI_FILE_BUFFER: return nv50_ir::FILE_MEMORY_GLOBAL;<br>
case TGSI_FILE_MEMORY: return nv50_ir::FILE_MEMORY_GLOBAL;<br>
<br>
So doing a s/nv50_ir::FILE_MEMORY_GLOBAL/nv50_ir::FILE_MEMORY_BUFFER/<br>
everywhere and then adding a new FILE_MEMORY_GLOBAL seems like an<br>
obvious fix.<br>
<br>
But I'm afraid that we will have similar issues with OpenCL using<br>
flat addresses where as GLSL will have some implied base-address /<br>
offset in other places too, which brings me to solution 2:<br>
<br>
2) Add a flag to Program to indicate that it is an OpenCL compute kernel;<br>
or possible use a different Program::TYPE_* for OpenCL ?<br>
<br>
I've a feeling that this is what we want since the addressing models<br>
are just different and we likely will need to implement different behavior<br>
in various places based on this.<br>
<br>
This will also allow us to use INPUT and CONST in tgsi code build from<br>
OpenCL programs and use that flag to do the right thing, rather then<br>
introducing new MEMORY[x], INPUT resp. MEMORY[x], CONST declarations<br>
for this.<br>
<br>
I'm esp. worried that once GLSL gets global support it will want<br>
different behavior for TGSI_FILE_MEMORY with TGSI_MEMORY_TYPE_GLOBAL<br>
then OpenCL, just like things are now with buffers, rendering solution<br>
1. a non solution<br>
<br>
So I'm seeking input on how to move forward with this ... ?<br>
<br>
Signed-off-by: Hans de Goede <<a href="mailto:hdegoede@redhat.com">hdegoede@redhat.com</a>><br>
---<br>
src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 4 ++++<br>
src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 2 ++<br>
2 files changed, 6 insertions(+)<br>
<br>
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp<br>
index de0c72b..15012ac 100644<br>
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp<br>
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp<br>
@@ -1525,6 +1525,10 @@ Converter::makeSym(uint tgsiFile, int fileIdx, int idx, int c, uint32_t address)<br>
<br>
if (tgsiFile == TGSI_FILE_MEMORY) {<br>
switch (code->memoryFiles[fileIdx].mem_type) {<br>
+ case TGSI_MEMORY_TYPE_GLOBAL:<br>
+ /* No-op this is the default for TGSI_FILE_MEMORY */<br>
+ sym->setFile(FILE_MEMORY_GLOBAL);<br>
+ break;<br>
case TGSI_MEMORY_TYPE_SHARED:<br>
sym->setFile(FILE_MEMORY_SHARED);<br>
break;<br>
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp<br>
index 6cb4dd4..bcc96de 100644<br>
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp<br>
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp<br>
@@ -2106,6 +2106,7 @@ NVC0LoweringPass::visit(Instruction *i)<br>
} else if (i->src(0).getFile() == FILE_SHADER_OUTPUT) {<br>
assert(prog->getType() == Program::TYPE_TESSELLATION_CONTROL);<br>
i->op = OP_VFETCH;<br>
+#if 0<br>
} else if (i->src(0).getFile() == FILE_MEMORY_GLOBAL) {<br>
Value *ind = i->getIndirect(0, 1);<br>
Value *ptr = loadResInfo64(ind, i->getSrc(0)->reg.fileIndex * 16);<br>
@@ -2126,6 +2127,7 @@ NVC0LoweringPass::visit(Instruction *i)<br>
if (i->defExists(0)) {<br>
bld.mkMov(i->getDef(0), bld.mkImm(0));<br>
}<br>
+#endif<br>
}<br>
break;<br>
case OP_ATOM:<br>
--<br>
2.7.2<br>
<br>
</blockquote></div>