[Nouveau] [Mesa-dev] llvm TGSI backend (WIP) questions

Wed Nov 18 07:11:56 PST 2015

On Wed, Nov 18, 2015 at 03:53:37PM +0100, Hans de Goede wrote:
> Hi,
> 
> On 13-11-15 19:51, Tom Stellard wrote:
> > On Fri, Nov 13, 2015 at 02:46:52PM +0100, Hans de Goede wrote:
> >> Hi All,
> >>
> >> So as discussed I've started working on a TGSI backend for
> >> llvm to use as a way to get compute going on nouveau (and other gpu-s).
> >>
> >> I'm still learning all the ins and outs of llvm so I do not have
> >> much to show yet.
> >>
> >> I've rebased Francisco's (curro's) latest version on top of llvm
> >> trunk, and added a commit on top to actually get it to build with the
> >> latest trunk. So currently I'm at the point where I've just
> >> taken Francisco's code, and made it compile, no more and no less.
> >>
> >> I have a git repo with this work available here:
> >>
> >> http://cgit.freedesktop.org/~jwrdegoede/llvm/
> >>
> >> So the next step would be to test this and see if it actually
> >> does anything, questions:
> >>
> >> 1) Does anyone have a simple test case / command where I can
> >> invoke just llvm and get TGSI asm output to check ?
> >>
> >
> > The easiest way to do this is with the llc tool which ships with llvm.
> > It compiles LLVM IR to target code, which in this case is tgsi.
> > I would recommend taking one of the simple examples from
> > test/CodeGen/AMDGPU (you may need to get these from llvm trunk, not sure
> > what llvm version you are using).
> >
> > To use llc:
> >
> > llc -march=tgsi input.ll -o -
> >
> >
> > This will output TGSI.
> 
> So after some bugfixing to fix a bunch of segfaults I get:
> 
> $ bin/llc -march=tgsi ../test/CodeGen/AMDGPU/add.ll -o -
> 
> # BB#0:
>          UADDs TEMP0x, TEMP0x, 0
>          LOADgis TEMP1z, [TEMP1y]
>          UADDs TEMP1y, TEMP1y, 4
>          LOADgis TEMP1y, [TEMP1y]
>          UADDs TEMP1y, TEMP1z, TEMP1y
>          STOREgis [TEMP1x], TEMP1y
>          UADDs TEMP0x, TEMP0x, 0
>          RET
>          ENDSUB
> 
> and add.ll has:
> 
> ;FUNC-LABEL: {{^}}test1:
> ;EG: ADD_INT {{[* ]*}}T{{[0-9]+\.[XYZW], T[0-9]+\.[XYZW], T[0-9]+\.[XYZW]}}
> 
> ;SI: v_add_i32_e32 [[REG:v[0-9]+]], vcc, {{v[0-9]+, v[0-9]+}}
> ;SI-NOT: [[REG]]
> ;SI: buffer_store_dword [[REG]],
> define void @test1(i32 addrspace(1)* %out, i32 addrspace(1)* %in) {
>    %b_ptr = getelementptr i32, i32 addrspace(1)* %in, i32 1
>    %a = load i32, i32 addrspace(1)* %in
>    %b = load i32, i32 addrspace(1)* %b_ptr
>    %result = add i32 %a, %b
>    store i32 %result, i32 addrspace(1)* %out
>    ret void
> }
> 
> So the generated code for test1 resembles the input somewhat but is in no way correct,
> e.g. I do not understand why it is assuming that both TEMP0x and TEMP1z contain the
> address of the array with the 2 input integers. Nor do I understand why it is using
> TEMP1z and TEMP1y as sources for the UADD, where it has been doing the LOAD-s to
> TEMP0x and TEMP1y.
> 

The placement of inputs into registers is controlled by the calling
convention, which is implemented in TGSIISelLowering.cpp and a file
probably called something like TGSICallingConv.td.

Maybe I'm reading the assembly wrong, but it looks like values are being
loaded into TEMP1z and TEMP1y not TEMP0x and TEMP1y.
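
One way to double-check this reading is to scan the listing mechanically.
The sketch below is only an illustration: it assumes the first operand of
each non-STORE instruction is the destination, and that STORE only reads
its operands. The helper names (parse, load_destinations, live_ins) are
made up for this example and are not part of the backend or of llvm.

```python
import re

# Assembly listing copied from the llc output quoted in this thread.
LISTING = """\
UADDs TEMP0x, TEMP0x, 0
LOADgis TEMP1z, [TEMP1y]
UADDs TEMP1y, TEMP1y, 4
LOADgis TEMP1y, [TEMP1y]
UADDs TEMP1y, TEMP1z, TEMP1y
STOREgis [TEMP1x], TEMP1y
UADDs TEMP0x, TEMP0x, 0
RET
ENDSUB
"""

# Matches register names such as TEMP1y, also inside brackets like [TEMP1y].
REG = re.compile(r"TEMP\d+[xyzw]")


def parse(listing):
    """Split each line into (opcode, [operands])."""
    insts = []
    for line in listing.splitlines():
        line = line.strip()
        if not line:
            continue
        opcode, _, rest = line.partition(" ")
        operands = [op.strip() for op in rest.split(",")] if rest else []
        insts.append((opcode, operands))
    return insts


def load_destinations(insts):
    """Registers written by LOAD instructions (assumes dst-first operands)."""
    return [ops[0] for opcode, ops in insts if opcode.startswith("LOAD")]


def live_ins(insts):
    """Registers that are read before any instruction in the listing writes them."""
    written, result = set(), []
    for opcode, operands in insts:
        is_store = opcode.startswith("STORE")
        # Assumption: STORE reads all of its operands and writes no register.
        sources = operands if is_store else operands[1:]
        for op in sources:
            for reg in REG.findall(op):
                if reg not in written and reg not in result:
                    result.append(reg)
        if operands and not is_store:
            written.update(REG.findall(operands[0]))
    return result


if __name__ == "__main__":
    insts = parse(LISTING)
    print("load destinations:", load_destinations(insts))
    print("live-in registers:", live_ins(insts))
```

Under those assumptions the loads write TEMP1z and TEMP1y, and the
registers read before being written are TEMP0x, TEMP1y and TEMP1x. Those
live-in registers are the ones whose contents the calling convention would
have to define. Which of them is meant to hold %out or %in is a guess from
the listing, not something I have checked against the backend source.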

-Tom

> And then we have function test2 in add.ll:
> 
> ;FUNC-LABEL: {{^}}test2:
> ;EG: ADD_INT {{[* ]*}}T{{[0-9]+\.[XYZW], T[0-9]+\.[XYZW], T[0-9]+\.[XYZW]}}
> ;EG: ADD_INT {{[* ]*}}T{{[0-9]+\.[XYZW], T[0-9]+\.[XYZW], T[0-9]+\.[XYZW]}}
> 
> ;SI: v_add_i32_e32 v{{[0-9]+, vcc, v[0-9]+, v[0-9]+}}
> ;SI: v_add_i32_e32 v{{[0-9]+, vcc, v[0-9]+, v[0-9]+}}
> 
> define void @test2(<2 x i32> addrspace(1)* %out, <2 x i32> addrspace(1)* %in) {
>    %b_ptr = getelementptr <2 x i32>, <2 x i32> addrspace(1)* %in, i32 1
>    %a = load <2 x i32>, <2 x i32> addrspace(1)* %in
>    %b = load <2 x i32>, <2 x i32> addrspace(1)* %b_ptr
>    %result = add <2 x i32> %a, %b
>    store <2 x i32> %result, <2 x i32> addrspace(1)* %out
>    ret void
> }
> 
> Which makes the tgsi backend completely unhappy:
> 
> LLVM ERROR: Cannot select: t43: i32,ch = load<LD4[FixedStack0](align=16)> t45:1, FrameIndex:i32<0>, undef:i32
> t41: i32 = FrameIndex<0>
> t8: i32 = undef
> In function: test2
> 
> Any hints on where to start looking to fix these issues would be much
> appreciated.
> 
> Regards,
> 
> Hans