[Nouveau] Some llvm questions (for tgsi backend)
Ilia Mirkin
imirkin at alum.mit.edu
Mon Jan 11 09:04:40 PST 2016
On Mon, Jan 11, 2016 at 6:07 AM, Hans de Goede <hdegoede at redhat.com> wrote:
> Hi,
>
> After a few distractions I'm back to work on the llvm tgsi backend. I've
> added clang integration and I can now compile a simple opencl program
> to something which sort of looks like tgsi.
>
> You can find my latest work on this here:
> http://cgit.freedesktop.org/~jwrdegoede/llvm
> http://cgit.freedesktop.org/~jwrdegoede/clang
> (the latter may still need to sync)
>
> I've a little test program of which I have 3 versions now,
> 1 raw gallium calls + a tgsi kernel
> 2 opencl calls to clover + a tgsi kernel
> 3 opencl calls to clover + an opencl kernel
>
> 1 and 2 have been tested on a kepler card, 3 has been
> tested with pocl. My goal for this week is to get
> the tgsi backend to produce code which I can copy
> and paste into 2 and then have it working on a kepler card.
>
> The test program looks like this:
>
> __kernel void test_kern(__global uint *vals, __global uint *buf)
> {
> uint id = get_global_id(0);
>
> buf[32 * id] -= vals[id];
> }
>
> The llvm ir looks like this:
>
> bin/clang -x cl -c -emit-llvm -target tgsi-- -include
> /usr/share/pocl/include/_kernel.h -o ~/foo.ir -x cl -S ~/foo.cl
>
> ; ModuleID = '/home/hans/foo.cl'
> target datalayout = "E-p:32:32-i64:64:64-f32:32:32-n32"
> target triple = "tgsi--"
>
> ; Function Attrs: nounwind
> define void @test_kern(i32 addrspace(1)* nocapture readonly %vals, i32
> addrspace(1)* nocapture %buf) #0 {
> entry:
> %call = tail call i32 @_Z13get_global_idj(i32 0) #2
> %arrayidx = getelementptr inbounds i32, i32 addrspace(1)* %vals, i32 %call
> %0 = load i32, i32 addrspace(1)* %arrayidx, align 4, !tbaa !7
> %mul = shl i32 %call, 5
> %arrayidx1 = getelementptr inbounds i32, i32 addrspace(1)* %buf, i32 %mul
> %1 = load i32, i32 addrspace(1)* %arrayidx1, align 4, !tbaa !7
> %sub = sub i32 %1, %0
> store i32 %sub, i32 addrspace(1)* %arrayidx1, align 4, !tbaa !7
> ret void
> }
>
> declare i32 @_Z13get_global_idj(i32) #1
>
> attributes #0 = { nounwind "disable-tail-calls"="false"
> "less-precise-fpmad"="false" "no-frame-pointer-elim"="true"
> "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false"
> "no-nans-fp-math"="false" "stack-protector-buffer-size"="8"
> "unsafe-fp-math"="false" "use-soft-float"="false" }
> attributes #1 = { "disable-tail-calls"="false" "less-precise-fpmad"="false"
> "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf"
> "no-infs-fp-math"="false" "no-nans-fp-math"="false"
> "stack-protector-buffer-size"="8" "unsafe-fp-math"="false"
> "use-soft-float"="false" }
> attributes #2 = { nounwind }
>
> !opencl.kernels = !{!0}
> !llvm.ident = !{!6}
>
> !0 = !{void (i32 addrspace(1)*, i32 addrspace(1)*)* @test_kern, !1, !2, !3,
> !4, !5}
> !1 = !{!"kernel_arg_addr_space", i32 1, i32 1}
> !2 = !{!"kernel_arg_access_qual", !"none", !"none"}
> !3 = !{!"kernel_arg_type", !"uint*", !"uint*"}
> !4 = !{!"kernel_arg_base_type", !"uint*", !"uint*"}
> !5 = !{!"kernel_arg_type_qual", !"", !""}
> !6 = !{!"clang version 3.8.0 (http://llvm.org/git/clang.git
> 9376f992e00569bd08a4ecf3a1d06d8b93c97681) (http://llvm.org/git/llvm.git
> 7a311143550c6fc01aa5000049825ecc09787440)"}
> !7 = !{!8, !8, i64 0}
> !8 = !{!"int", !9, i64 0}
> !9 = !{!"omnipotent char", !10, i64 0}
> !10 = !{!"Simple C/C++ TBAA"}
>
> And the "tgsi" looks like this:
>
> .text
> .file "/home/hans/foo.cl"
> .globl test_kern
> test_kern:
> BGNSUB
> MOVis TEMP1x, 0
> CAL _Z13get_global_idj
> SHLs TEMP1y, TEMP1x, 7
> LOADiis TEMP1z, [4]
> UADDs TEMP1y, TEMP1z, TEMP1y
> SHLs TEMP1x, TEMP1x, 2
> LOADiis TEMP1z, [0]
> UADDs TEMP1x, TEMP1z, TEMP1x
> LOADgis TEMP1x, [TEMP1x]
> INEGs TEMP1x, TEMP1x
> LOADgis TEMP1z, [TEMP1y]
> UADDs TEMP1x, TEMP1x, TEMP1z
> STOREgis [TEMP1y], TEMP1x
> RET
> ENDSUB
>
> Working tgsi for this would look like this:
>
> COMP
> DCL SV[0], THREAD_ID[0]
> DCL TEMP[0], LOCAL
> DCL TEMP[1], LOCAL
> IMM UINT32 { 0, 0, 0, 0 }
> IMM UINT32 { 4, 0, 0, 0 }
> IMM UINT32 { 128, 0, 0, 0 }
>
> BGNSUB
> LOAD TEMP[0].xy, RINPUT, IMM[0]
> UMUL TEMP[1].x, SV[0], IMM[1]
> UADD TEMP[0].x, TEMP[0], TEMP[1]
> UMUL TEMP[1].x, SV[0], IMM[2]
> UADD TEMP[0].y, TEMP[0], TEMP[1].xxxx
> LOAD TEMP[1].x, RGLOBAL, TEMP[0]
> LOAD TEMP[0].x, RGLOBAL, TEMP[0].yyyy
> UADD TEMP[1].x, TEMP[0], -TEMP[1]
> STORE RGLOBAL.x, TEMP[0].yyyy, TEMP[1]
> RET
> ENDSUB;
>
> So my questions (I'm still quite green when it comes to llvm):
>
> 1) As you can see a proper tgsi program needs a header
> to declare which registers (etc) it is using, in which
> class-method should I implement this ?
>
> 2) Immediates need to be declared with a specific
> value and then addressed as IMM[x], how would I go about
> this ?
>
> 3) The get_global_id call needs to be translated into
> simply using the SV[0] "register", how would I go about
> this ?
>
> 4) The global and input load / stores are not handled
> correctly, I see that the LOAD instructions get postfixed
> with a i reps. g for input / global how would I go about
> modifying the code emitter (AsmPrinter?) to change "LOADi"
> into "LOAD <dest> RINPUT <offset>"?
>
> 5) Talking about the lowecase suffixes to the instructions,
> these should not be part of the output, how do I filter these?
>
> 6) And finally, the current llvm-tgsi output uses e.g.
> TEMP1y where as for the destination it should use TEMP[1].y
> and for the sources it should use TEMP[1].xxxx (so include
> proper swizzling info).
>
> Lots of questions, sorry about that. Feel free to point me
> to some relvant parts of the docs, I've tried to find answers
> myself but I've gotten a bit lost in the docs.
You may consider emitting binary TGSI. It's a semi-fluid format, but
doesn't change too often (and usually does so in backwards-compatible
ways).
BTW, note that I recently got rid of TGSI_FILE_RESOURCE in favor of
BUFFER and IMAGE register files (which in turn correlate to
->set_shader_buffers and ->set_shader_images). We need this for the
various GL extensions (ssbo, atomic, images). Not sure how that
integrates with what OpenCL needs.
-ilia
More information about the Nouveau
mailing list