[Nouveau] Some llvm questions (for tgsi backend)
Tom Stellard
tom at stellard.net
Mon Jan 11 09:10:10 PST 2016
On Mon, Jan 11, 2016 at 12:07:14PM +0100, Hans de Goede wrote:
> Hi,
>
> After a few distractions I'm back to work on the llvm tgsi backend. I've
> added clang integration and I can now compile a simple opencl program
> to something which sort of looks like tgsi.
>
> You can find my latest work on this here:
> http://cgit.freedesktop.org/~jwrdegoede/llvm
> http://cgit.freedesktop.org/~jwrdegoede/clang
> (the latter may still need to sync)
>
> I've a little test program of which I have 3 versions now,
> 1 raw gallium calls + a tgsi kernel
> 2 opencl calls to clover + a tgsi kernel
> 3 opencl calls to clover + an opencl kernel
>
> 1 and 2 have been tested on a kepler card, 3 has been
> tested with pocl. My goal for this week is to get
> the tgsi backend to produce code which I can copy
> and paste into 2 and then have it working on a kepler card.
>
> The test program looks like this:
>
> __kernel void test_kern(__global uint *vals, __global uint *buf)
> {
> uint id = get_global_id(0);
>
> buf[32 * id] -= vals[id];
> }
>
> The llvm ir looks like this:
>
> bin/clang -x cl -c -emit-llvm -target tgsi-- -include /usr/share/pocl/include/_kernel.h -o ~/foo.ir -x cl -S ~/foo.cl
>
> ; ModuleID = '/home/hans/foo.cl'
> target datalayout = "E-p:32:32-i64:64:64-f32:32:32-n32"
> target triple = "tgsi--"
>
> ; Function Attrs: nounwind
> define void @test_kern(i32 addrspace(1)* nocapture readonly %vals, i32 addrspace(1)* nocapture %buf) #0 {
> entry:
> %call = tail call i32 @_Z13get_global_idj(i32 0) #2
> %arrayidx = getelementptr inbounds i32, i32 addrspace(1)* %vals, i32 %call
> %0 = load i32, i32 addrspace(1)* %arrayidx, align 4, !tbaa !7
> %mul = shl i32 %call, 5
> %arrayidx1 = getelementptr inbounds i32, i32 addrspace(1)* %buf, i32 %mul
> %1 = load i32, i32 addrspace(1)* %arrayidx1, align 4, !tbaa !7
> %sub = sub i32 %1, %0
> store i32 %sub, i32 addrspace(1)* %arrayidx1, align 4, !tbaa !7
> ret void
> }
>
> declare i32 @_Z13get_global_idj(i32) #1
>
> attributes #0 = { nounwind "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
> attributes #1 = { "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
> attributes #2 = { nounwind }
>
> !opencl.kernels = !{!0}
> !llvm.ident = !{!6}
>
> !0 = !{void (i32 addrspace(1)*, i32 addrspace(1)*)* @test_kern, !1, !2, !3, !4, !5}
> !1 = !{!"kernel_arg_addr_space", i32 1, i32 1}
> !2 = !{!"kernel_arg_access_qual", !"none", !"none"}
> !3 = !{!"kernel_arg_type", !"uint*", !"uint*"}
> !4 = !{!"kernel_arg_base_type", !"uint*", !"uint*"}
> !5 = !{!"kernel_arg_type_qual", !"", !""}
> !6 = !{!"clang version 3.8.0 (http://llvm.org/git/clang.git 9376f992e00569bd08a4ecf3a1d06d8b93c97681) (http://llvm.org/git/llvm.git 7a311143550c6fc01aa5000049825ecc09787440)"}
> !7 = !{!8, !8, i64 0}
> !8 = !{!"int", !9, i64 0}
> !9 = !{!"omnipotent char", !10, i64 0}
> !10 = !{!"Simple C/C++ TBAA"}
>
> And the "tgsi" looks like this:
>
> .text
> .file "/home/hans/foo.cl"
> .globl test_kern
> test_kern:
> BGNSUB
> MOVis TEMP1x, 0
> CAL _Z13get_global_idj
> SHLs TEMP1y, TEMP1x, 7
> LOADiis TEMP1z, [4]
> UADDs TEMP1y, TEMP1z, TEMP1y
> SHLs TEMP1x, TEMP1x, 2
> LOADiis TEMP1z, [0]
> UADDs TEMP1x, TEMP1z, TEMP1x
> LOADgis TEMP1x, [TEMP1x]
> INEGs TEMP1x, TEMP1x
> LOADgis TEMP1z, [TEMP1y]
> UADDs TEMP1x, TEMP1x, TEMP1z
> STOREgis [TEMP1y], TEMP1x
> RET
> ENDSUB
>
> Working tgsi for this would look like this:
>
> COMP
> DCL SV[0], THREAD_ID[0]
> DCL TEMP[0], LOCAL
> DCL TEMP[1], LOCAL
> IMM UINT32 { 0, 0, 0, 0 }
> IMM UINT32 { 4, 0, 0, 0 }
> IMM UINT32 { 128, 0, 0, 0 }
>
> BGNSUB
> LOAD TEMP[0].xy, RINPUT, IMM[0]
> UMUL TEMP[1].x, SV[0], IMM[1]
> UADD TEMP[0].x, TEMP[0], TEMP[1]
> UMUL TEMP[1].x, SV[0], IMM[2]
> UADD TEMP[0].y, TEMP[0], TEMP[1].xxxx
> LOAD TEMP[1].x, RGLOBAL, TEMP[0]
> LOAD TEMP[0].x, RGLOBAL, TEMP[0].yyyy
> UADD TEMP[1].x, TEMP[0], -TEMP[1]
> STORE RGLOBAL.x, TEMP[0].yyyy, TEMP[1]
> RET
> ENDSUB;
>
> So my questions (I'm still quite green when it comes to llvm):
>
> 1) As you can see a proper tgsi program needs a header
> to declare which registers (etc) it is using, in which
> class-method should I implement this ?
>
These kinds of things should be emitted by the TGSI implementation of the
AsmPrinter class: http://llvm.org/docs/doxygen/html/classllvm_1_1AsmPrinter.html
You probably want to use EmitStartOfAsmFile() for the headers.
> 2) Immediates need to be declared with a specific
> value and then addressed as IMM[x], how would I go about
> this ?
>
I would recommend adding a pass that replaced immediates in the code with
IMM file regitser accesses.
> 3) The get_global_id call needs to be translated into
> simply using the SV[0] "register", how would I go about
> this ?
>
get_global_id should be implemented in libclc with tgsi specific intrinsics
that read from the system value registers.
> 4) The global and input load / stores are not handled
> correctly, I see that the LOAD instructions get postfixed
> with a i reps. g for input / global how would I go about
> modifying the code emitter (AsmPrinter?) to change "LOADi"
> into "LOAD <dest> RINPUT <offset>"?
>
I don't understand exactly why you want to change this. Are you trying to
make the assembly string for input loads and global loads look the same?
> 5) Talking about the lowecase suffixes to the instructions,
> these should not be part of the output, how do I filter these?
>
> 6) And finally, the current llvm-tgsi output uses e.g.
> TEMP1y where as for the destination it should use TEMP[1].y
> and for the sources it should use TEMP[1].xxxx (so include
> proper swizzling info).
>
You just need to change how the registers are printed in the
TGSIRegisterInfo.td file which should fix this.
> Lots of questions, sorry about that. Feel free to point me
> to some relvant parts of the docs, I've tried to find answers
> myself but I've gotten a bit lost in the docs.
>
Ask more if you have them.
-Tom
> Regards,
>
> Hans
More information about the Nouveau
mailing list