[Nouveau] [RFC mesa] nouveau: Add support for OpenCL global memory buffers
Samuel Pitoiset
samuel.pitoiset at gmail.com
Mon Mar 14 15:41:22 UTC 2016
On 03/14/2016 04:28 PM, Hans de Goede wrote:
> Hi,
>
> On 14-03-16 16:05, Ilia Mirkin wrote:
>> There's a less hacky and more hacky way forward. The more hacky
>> solution is
>> to set file index to -1 or something and then not do the lowering when
>> you
>> see that.
>>
>> The less hacky solution is the one you proposed as #1 - introduce a new
>> file for "buffer" memory and lower it to the global file by adding a base
>> offset.
>>
>> Right now the meaning of global is overloaded - before lowering it
>> implicitly includes the buffer vase address, and after lowering, it
>> explicitly includes it. Splitting it out I to another file type seems
>> like
>> the cleaner way forward, not sure what issue you were seeing with that
>> approach.
>
> Ok.
I agree with you guys, the solution #1 is fine by me.
Btw, do you need someone with commit access to push your previous series
(the tgsi thing)? I can do this for you.
>
> > (I didn't understand your argument about potential future
>> issues.)
>
> There was not much to understand, it is just something I worried about,
> but was not sure if there actually was something to worry about :)
>
> If you feel that solution #1 (which was also my first hunch) is
> the right one then I will go and implement that.
>
>> What I really don't want is to somehow differentiate glsl-sourced
>> and opencl-sourced compute programs in the backend.
>
> Ok, understood.
>
> Regards,
>
> Hans
>
>
>> On Mar 14, 2016 6:22 AM, "Hans de Goede" <hdegoede at redhat.com> wrote:
>>
>>> This little "hack" fixes the use of OpenCL global memory buffers with
>>> nouveau, but clearly the #if 0 is not a solution as it breaks buffers
>>> with GLSL.
>>>
>>> The reason I'm posting this as an RFC patch is to discuss how to solve
>>> this properly, 2 solutions come to mind:
>>>
>>> 1) Use separate nv50_ir::FILE_MEMORY_xxx values for buffers versus
>>> TGSI_FILE_MEMORY with TGSI_MEMORY_TYPE_GLOBAL, looking at
>>> translateFile()
>>> we currently have:
>>>
>>> case TGSI_FILE_BUFFER: return nv50_ir::FILE_MEMORY_GLOBAL;
>>> case TGSI_FILE_MEMORY: return nv50_ir::FILE_MEMORY_GLOBAL;
>>>
>>> So doing a
>>> s/nv50_ir::FILE_MEMORY_GLOBAL/nv50_ir::FILE_MEMORY_BUFFER/
>>> everywhere and then adding a new FILE_MEMORY_GLOBAL seems like an
>>> obvious fix.
>>>
>>> But I'm afraid that we will have similar issues with OpenCL using
>>> flat addresses where as GLSL will have some implied base-address /
>>> offset in other places too, which brings me to solution 2:
>>>
>>> 2) Add a flag to Program to indicate that it is an OpenCL compute
>>> kernel;
>>> or possible use a different Program::TYPE_* for OpenCL ?
>>>
>>> I've a feeling that this is what we want since the addressing models
>>> are just different and we likely will need to implement different
>>> behavior
>>> in various places based on this.
>>>
>>> This will also allow us to use INPUT and CONST in tgsi code build
>>> from
>>> OpenCL programs and use that flag to do the right thing, rather then
>>> introducing new MEMORY[x], INPUT resp. MEMORY[x], CONST declarations
>>> for this.
>>>
>>> I'm esp. worried that once GLSL gets global support it will want
>>> different behavior for TGSI_FILE_MEMORY with TGSI_MEMORY_TYPE_GLOBAL
>>> then OpenCL, just like things are now with buffers, rendering
>>> solution
>>> 1. a non solution
>>>
>>> So I'm seeking input on how to move forward with this ... ?
>>>
>>> Signed-off-by: Hans de Goede <hdegoede at redhat.com>
>>> ---
>>> src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 4 ++++
>>> src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 2 ++
>>> 2 files changed, 6 insertions(+)
>>>
>>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>>> index de0c72b..15012ac 100644
>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>>> @@ -1525,6 +1525,10 @@ Converter::makeSym(uint tgsiFile, int fileIdx,
>>> int
>>> idx, int c, uint32_t address)
>>>
>>> if (tgsiFile == TGSI_FILE_MEMORY) {
>>> switch (code->memoryFiles[fileIdx].mem_type) {
>>> + case TGSI_MEMORY_TYPE_GLOBAL:
>>> + /* No-op this is the default for TGSI_FILE_MEMORY */
>>> + sym->setFile(FILE_MEMORY_GLOBAL);
>>> + break;
>>> case TGSI_MEMORY_TYPE_SHARED:
>>> sym->setFile(FILE_MEMORY_SHARED);
>>> break;
>>> diff --git
>>> a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
>>> index 6cb4dd4..bcc96de 100644
>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
>>> @@ -2106,6 +2106,7 @@ NVC0LoweringPass::visit(Instruction *i)
>>> } else if (i->src(0).getFile() == FILE_SHADER_OUTPUT) {
>>> assert(prog->getType() ==
>>> Program::TYPE_TESSELLATION_CONTROL);
>>> i->op = OP_VFETCH;
>>> +#if 0
>>> } else if (i->src(0).getFile() == FILE_MEMORY_GLOBAL) {
>>> Value *ind = i->getIndirect(0, 1);
>>> Value *ptr = loadResInfo64(ind, i->getSrc(0)->reg.fileIndex *
>>> 16);
>>> @@ -2126,6 +2127,7 @@ NVC0LoweringPass::visit(Instruction *i)
>>> if (i->defExists(0)) {
>>> bld.mkMov(i->getDef(0), bld.mkImm(0));
>>> }
>>> +#endif
>>> }
>>> break;
>>> case OP_ATOM:
>>> --
>>> 2.7.2
>>>
>>>
>>
--
-Samuel
More information about the Nouveau
mailing list