[Mesa-dev] Mesa (master): st/mesa: skip lower_output_reads when possible

Nicolai Hähnle nhaehnle at gmail.com
Thu Dec 1 09:25:38 UTC 2016


Thanks for the heads-up, I'm going to look into it. The shader must also
be miscompiled in optimized builds, but apparently the test isn't
strict enough to catch that...

Nicolai

On 01.12.2016 08:01, Michel Dänzer wrote:
> On 30/11/16 05:10 PM, Nicolai Hähnle wrote:
>> Module: Mesa
>> Branch: master
>> Commit: f60374aa689539c8dbdb851488be515e5e7df7cb
>> URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=f60374aa689539c8dbdb851488be515e5e7df7cb
>>
>> Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
>> Date:   Fri Nov 18 20:51:56 2016 +0100
>>
>> st/mesa: skip lower_output_reads when possible
>
> This broke the piglit test
> spec@glsl-1.50@execution@variable-indexing@gs-output-array-vec4-index-wr
> for me on Kaveri and Tonga.
> R600_DEBUG=mono avoids it. Shader dump, assertion failure and backtrace:
>
>
> GEOM
> PROPERTY GS_INPUT_PRIMITIVE TRIANGLES
> PROPERTY GS_OUTPUT_PRIMITIVE TRIANGLE_STRIP
> PROPERTY GS_MAX_OUTPUT_VERTICES 3
> PROPERTY GS_INVOCATIONS 1
> DCL IN[][0], POSITION
> DCL OUT[0], POSITION
> DCL OUT[1..16], ARRAY(1), GENERIC[0]
> DCL OUT[17..32], ARRAY(2), GENERIC[16]
> DCL CONST[0]
> DCL TEMP[0], LOCAL
> DCL ADDR[0]
> IMM[0] FLT32 {    1.0000,     1.1000,     1.2000,     1.3000}
> IMM[1] FLT32 {    2.0000,     2.1000,     2.2000,     2.3000}
> IMM[2] FLT32 {    3.0000,     3.1000,     3.2000,     3.3000}
> IMM[3] FLT32 {    4.0000,     4.1000,     4.2000,     4.3000}
> IMM[4] FLT32 {    5.0000,     5.1000,     5.2000,     5.3000}
> IMM[5] FLT32 {    6.0000,     6.1000,     6.2000,     6.3000}
> IMM[6] FLT32 {    7.0000,     7.1000,     7.2000,     7.3000}
> IMM[7] FLT32 {    8.0000,     8.1000,     8.2000,     8.3000}
> IMM[8] FLT32 {    9.0000,     9.1000,     9.2000,     9.3000}
> IMM[9] FLT32 {   10.0000,    10.1000,    10.2000,    10.3000}
> IMM[10] FLT32 {   11.0000,    11.1000,    11.2000,    11.3000}
> IMM[11] FLT32 {   12.0000,    12.1000,    12.2000,    12.3000}
> IMM[12] FLT32 {   13.0000,    13.1000,    13.2000,    13.3000}
> IMM[13] FLT32 {   14.0000,    14.1000,    14.2000,    14.3000}
> IMM[14] FLT32 {   15.0000,    15.1000,    15.2000,    15.3000}
> IMM[15] FLT32 {   16.0000,    16.1000,    16.2000,    16.3000}
> IMM[16] FLT32 {   17.0000,    17.1000,    17.2000,    17.3000}
> IMM[17] FLT32 {   18.0000,    18.1000,    18.2000,    18.3000}
> IMM[18] FLT32 {   19.0000,    19.1000,    19.2000,    19.3000}
> IMM[19] FLT32 {   20.0000,    20.1000,    20.2000,    20.3000}
> IMM[20] FLT32 {   21.0000,    21.1000,    21.2000,    21.3000}
> IMM[21] FLT32 {   22.0000,    22.1000,    22.2000,    22.3000}
> IMM[22] FLT32 {   23.0000,    23.1000,    23.2000,    23.3000}
> IMM[23] FLT32 {   24.0000,    24.1000,    24.2000,    24.3000}
> IMM[24] FLT32 {   25.0000,    25.1000,    25.2000,    25.3000}
> IMM[25] FLT32 {   26.0000,    26.1000,    26.2000,    26.3000}
> IMM[26] FLT32 {   27.0000,    27.1000,    27.2000,    27.3000}
> IMM[27] FLT32 {   28.0000,    28.1000,    28.2000,    28.3000}
> IMM[28] FLT32 {   29.0000,    29.1000,    29.2000,    29.3000}
> IMM[29] FLT32 {   30.0000,    30.1000,    30.2000,    30.3000}
> IMM[30] FLT32 {   31.0000,    31.1000,    31.2000,    31.3000}
> IMM[31] FLT32 {   32.0000,    32.1000,    32.2000,    32.3000}
> IMM[32] INT32 {16, -16, 0, 0}
> IMM[33] FLT32 {    0.0000,     0.1000,     0.2000,     0.3000}
>   0: MOV OUT[1], IMM[0]
>   1: MOV OUT[2], IMM[1]
>   2: MOV OUT[3], IMM[2]
>   3: MOV OUT[4], IMM[3]
>   4: MOV OUT[5], IMM[4]
>   5: MOV OUT[6], IMM[5]
>   6: MOV OUT[7], IMM[6]
>   7: MOV OUT[8], IMM[7]
>   8: MOV OUT[9], IMM[8]
>   9: MOV OUT[10], IMM[9]
>  10: MOV OUT[11], IMM[10]
>  11: MOV OUT[12], IMM[11]
>  12: MOV OUT[13], IMM[12]
>  13: MOV OUT[14], IMM[13]
>  14: MOV OUT[15], IMM[14]
>  15: MOV OUT[16], IMM[15]
>  16: MOV OUT[17], IMM[16]
>  17: MOV OUT[18], IMM[17]
>  18: MOV OUT[19], IMM[18]
>  19: MOV OUT[20], IMM[19]
>  20: MOV OUT[21], IMM[20]
>  21: MOV OUT[22], IMM[21]
>  22: MOV OUT[23], IMM[22]
>  23: MOV OUT[24], IMM[23]
>  24: MOV OUT[25], IMM[24]
>  25: MOV OUT[26], IMM[25]
>  26: MOV OUT[27], IMM[26]
>  27: MOV OUT[28], IMM[27]
>  28: MOV OUT[29], IMM[28]
>  29: MOV OUT[30], IMM[29]
>  30: MOV OUT[31], IMM[30]
>  31: MOV OUT[32], IMM[31]
>  32: ISGE TEMP[0].x, CONST[0].xxxx, IMM[32].xxxx
>  33: UIF TEMP[0].xxxx :0
>  34:   UADD TEMP[0].x, CONST[0].xxxx, IMM[32].yyyy
>  35:   UARL ADDR[0].x, TEMP[0].xxxx
>  36:   MOV OUT[ADDR[0].x+17](2), IMM[33]
>  37: ELSE :0
>  38:   UARL ADDR[0].x, CONST[0].xxxx
>  39:   MOV OUT[ADDR[0].x+1](1), IMM[33]
>  40: ENDIF
>  41: MOV OUT[0], IN[0][0]
>  42: EMIT IMM[32].zzzz
>  43: MOV OUT[1], IMM[0]
>  44: MOV OUT[2], IMM[1]
>  45: MOV OUT[3], IMM[2]
>  46: MOV OUT[4], IMM[3]
>  47: MOV OUT[5], IMM[4]
>  48: MOV OUT[6], IMM[5]
>  49: MOV OUT[7], IMM[6]
>  50: MOV OUT[8], IMM[7]
>  51: MOV OUT[9], IMM[8]
>  52: MOV OUT[10], IMM[9]
>  53: MOV OUT[11], IMM[10]
>  54: MOV OUT[12], IMM[11]
>  55: MOV OUT[13], IMM[12]
>  56: MOV OUT[14], IMM[13]
>  57: MOV OUT[15], IMM[14]
>  58: MOV OUT[16], IMM[15]
>  59: MOV OUT[17], IMM[16]
>  60: MOV OUT[18], IMM[17]
>  61: MOV OUT[19], IMM[18]
>  62: MOV OUT[20], IMM[19]
>  63: MOV OUT[21], IMM[20]
>  64: MOV OUT[22], IMM[21]
>  65: MOV OUT[23], IMM[22]
>  66: MOV OUT[24], IMM[23]
>  67: MOV OUT[25], IMM[24]
>  68: MOV OUT[26], IMM[25]
>  69: MOV OUT[27], IMM[26]
>  70: MOV OUT[28], IMM[27]
>  71: MOV OUT[29], IMM[28]
>  72: MOV OUT[30], IMM[29]
>  73: MOV OUT[31], IMM[30]
>  74: MOV OUT[32], IMM[31]
>  75: ISGE TEMP[0].x, CONST[0].xxxx, IMM[32].xxxx
>  76: UIF TEMP[0].xxxx :0
>  77:   UADD TEMP[0].x, CONST[0].xxxx, IMM[32].yyyy
>  78:   UARL ADDR[0].x, TEMP[0].xxxx
>  79:   MOV OUT[ADDR[0].x+17](2), IMM[33]
>  80: ELSE :0
>  81:   UARL ADDR[0].x, CONST[0].xxxx
>  82:   MOV OUT[ADDR[0].x+1](1), IMM[33]
>  83: ENDIF
>  84: MOV OUT[0], IN[1][0]
>  85: EMIT IMM[32].zzzz
>  86: MOV OUT[1], IMM[0]
>  87: MOV OUT[2], IMM[1]
>  88: MOV OUT[3], IMM[2]
>  89: MOV OUT[4], IMM[3]
>  90: MOV OUT[5], IMM[4]
>  91: MOV OUT[6], IMM[5]
>  92: MOV OUT[7], IMM[6]
>  93: MOV OUT[8], IMM[7]
>  94: MOV OUT[9], IMM[8]
>  95: MOV OUT[10], IMM[9]
>  96: MOV OUT[11], IMM[10]
>  97: MOV OUT[12], IMM[11]
>  98: MOV OUT[13], IMM[12]
>  99: MOV OUT[14], IMM[13]
> 100: MOV OUT[15], IMM[14]
> 101: MOV OUT[16], IMM[15]
> 102: MOV OUT[17], IMM[16]
> 103: MOV OUT[18], IMM[17]
> 104: MOV OUT[19], IMM[18]
> 105: MOV OUT[20], IMM[19]
> 106: MOV OUT[21], IMM[20]
> 107: MOV OUT[22], IMM[21]
> 108: MOV OUT[23], IMM[22]
> 109: MOV OUT[24], IMM[23]
> 110: MOV OUT[25], IMM[24]
> 111: MOV OUT[26], IMM[25]
> 112: MOV OUT[27], IMM[26]
> 113: MOV OUT[28], IMM[27]
> 114: MOV OUT[29], IMM[28]
> 115: MOV OUT[30], IMM[29]
> 116: MOV OUT[31], IMM[30]
> 117: MOV OUT[32], IMM[31]
> 118: ISGE TEMP[0].x, CONST[0].xxxx, IMM[32].xxxx
> 119: UIF TEMP[0].xxxx :0
> 120:   UADD TEMP[0].x, CONST[0].xxxx, IMM[32].yyyy
> 121:   UARL ADDR[0].x, TEMP[0].xxxx
> 122:   MOV OUT[ADDR[0].x+17](2), IMM[33]
> 123: ELSE :0
> 124:   UARL ADDR[0].x, CONST[0].xxxx
> 125:   MOV OUT[ADDR[0].x+1](1), IMM[33]
> 126: ENDIF
> 127: MOV OUT[0], IN[2][0]
> 128: EMIT IMM[32].zzzz
> 129: END
> radeonsi: Compiling shader 2
> TGSI shader LLVM IR:
>
> ; ModuleID = 'tgsi'
> source_filename = "tgsi"
> target triple = "amdgcn--"
>
> define amdgpu_gs void @main([17 x <16 x i8>] addrspace(2)* byval dereferenceable(18446744073709551615), [16 x <16 x i8>] addrspace(2)* byval dereferenceable(18446744073709551615), [24 x <8 x i32>] addrspace(2)* byval dereferenceable(18446744073709551615), [16 x <8 x i32>] addrspace(2)* byval dereferenceable(18446744073709551615), [16 x <4 x i32>] addrspace(2)* byval dereferenceable(18446744073709551615), i32 inreg, i32 inreg, i32, i32, i32, i32, i32, i32, i32, i32) {
> main_body:
>   %15 = getelementptr [17 x <16 x i8>], [17 x <16 x i8>] addrspace(2)* %0, i64 0, i64 3, !amdgpu.uniform !0
>   %16 = load <16 x i8>, <16 x i8> addrspace(2)* %15, align 16, !invariant.load !0
>   %17 = getelementptr [17 x <16 x i8>], [17 x <16 x i8>] addrspace(2)* %0, i64 0, i64 4, !amdgpu.uniform !0
>   %18 = load <16 x i8>, <16 x i8> addrspace(2)* %17, align 16, !invariant.load !0
>   %19 = getelementptr [16 x <16 x i8>], [16 x <16 x i8>] addrspace(2)* %1, i64 0, i64 0, !amdgpu.uniform !0
>   %20 = load <16 x i8>, <16 x i8> addrspace(2)* %19, align 16, !invariant.load !0
>   %21 = call float @llvm.SI.load.const(<16 x i8> %20, i32 0)
>   %22 = bitcast float %21 to i32
>   %23 = icmp sgt i32 %22, 15
>   %24 = call float @llvm.SI.load.const(<16 x i8> %20, i32 0)
>   %25 = bitcast float %24 to i32
>   br i1 %23, label %if33, label %else37
>
> if33:                                             ; preds = %main_body
>   %26 = add i32 %25, 1
>   %27 = insertelement <33 x float> <float undef, float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+00, float 5.000000e+00, float 6.000000e+00, float 7.000000e+00, float 8.000000e+00, float 9.000000e+00, float 1.000000e+01, float 1.100000e+01, float 1.200000e+01, float 1.300000e+01, float 1.400000e+01, float 1.500000e+01, float 1.600000e+01, float 1.700000e+01, float 1.800000e+01, float 1.900000e+01, float 2.000000e+01, float 2.100000e+01, float 2.200000e+01, float 2.300000e+01, float 2.400000e+01, float 2.500000e+01, float 2.600000e+01, float 2.700000e+01, float 2.800000e+01, float 2.900000e+01, float 3.000000e+01, float 3.100000e+01, float 3.200000e+01>, float 0.000000e+00, i32 %26
>   %28 = extractelement <33 x float> %27, i32 1
>   %29 = extractelement <33 x float> %27, i32 2
>   %30 = extractelement <33 x float> %27, i32 3
>   %31 = extractelement <33 x float> %27, i32 4
>   %32 = extractelement <33 x float> %27, i32 5
>   %33 = extractelement <33 x float> %27, i32 6
>   %34 = extractelement <33 x float> %27, i32 7
>   %35 = extractelement <33 x float> %27, i32 8
>   %36 = extractelement <33 x float> %27, i32 9
>   %37 = extractelement <33 x float> %27, i32 10
>   %38 = extractelement <33 x float> %27, i32 11
>   %39 = extractelement <33 x float> %27, i32 12
>   %40 = extractelement <33 x float> %27, i32 13
>   %41 = extractelement <33 x float> %27, i32 14
>   %42 = extractelement <33 x float> %27, i32 15
>   %43 = extractelement <33 x float> %27, i32 16
>   %44 = extractelement <33 x float> %27, i32 17
>   %45 = extractelement <33 x float> %27, i32 18
>   %46 = extractelement <33 x float> %27, i32 19
>   %47 = extractelement <33 x float> %27, i32 20
>   %48 = extractelement <33 x float> %27, i32 21
>   %49 = extractelement <33 x float> %27, i32 22
>   %50 = extractelement <33 x float> %27, i32 23
>   %51 = extractelement <33 x float> %27, i32 24
>   %52 = extractelement <33 x float> %27, i32 25
>   %53 = extractelement <33 x float> %27, i32 26
>   %54 = extractelement <33 x float> %27, i32 27
>   %55 = extractelement <33 x float> %27, i32 28
>   %56 = extractelement <33 x float> %27, i32 29
>   %57 = extractelement <33 x float> %27, i32 30
>   %58 = extractelement <33 x float> %27, i32 31
>   %59 = extractelement <33 x float> %27, i32 32
>   %60 = add i32 %25, 1
>   %61 = insertelement <33 x float> <float undef, float 0x3FF19999A0000000, float 0x4000CCCCC0000000, float 0x4008CCCCC0000000, float 0x4010666660000000, float 0x4014666660000000, float 0x4018666660000000, float 0x401C666660000000, float 0x4020333340000000, float 0x4022333340000000, float 0x4024333340000000, float 0x4026333340000000, float 0x4028333340000000, float 0x402A333340000000, float 0x402C333340000000, float 0x402E333340000000, float 0x40301999A0000000, float 0x40311999A0000000, float 0x40321999A0000000, float 0x40331999A0000000, float 0x40341999A0000000, float 0x40351999A0000000, float 0x40361999A0000000, float 0x40371999A0000000, float 0x40381999A0000000, float 0x40391999A0000000, float 0x403A1999A0000000, float 0x403B1999A0000000, float 0x403C1999A0000000, float 0x403D1999A0000000, float 0x403E1999A0000000, float 0x403F1999A0000000, float 0x40400CCCC0000000>, float 0x3FB99999A0000000, i32 %60
>   %62 = extractelement <33 x float> %61, i32 1
>   %63 = extractelement <33 x float> %61, i32 2
>   %64 = extractelement <33 x float> %61, i32 3
>   %65 = extractelement <33 x float> %61, i32 4
>   %66 = extractelement <33 x float> %61, i32 5
>   %67 = extractelement <33 x float> %61, i32 6
>   %68 = extractelement <33 x float> %61, i32 7
>   %69 = extractelement <33 x float> %61, i32 8
>   %70 = extractelement <33 x float> %61, i32 9
>   %71 = extractelement <33 x float> %61, i32 10
>   %72 = extractelement <33 x float> %61, i32 11
>   %73 = extractelement <33 x float> %61, i32 12
>   %74 = extractelement <33 x float> %61, i32 13
>   %75 = extractelement <33 x float> %61, i32 14
>   %76 = extractelement <33 x float> %61, i32 15
>   %77 = extractelement <33 x float> %61, i32 16
>   %78 = extractelement <33 x float> %61, i32 17
>   %79 = extractelement <33 x float> %61, i32 18
>   %80 = extractelement <33 x float> %61, i32 19
>   %81 = extractelement <33 x float> %61, i32 20
>   %82 = extractelement <33 x float> %61, i32 21
>   %83 = extractelement <33 x float> %61, i32 22
>   %84 = extractelement <33 x float> %61, i32 23
>   %85 = extractelement <33 x float> %61, i32 24
>   %86 = extractelement <33 x float> %61, i32 25
>   %87 = extractelement <33 x float> %61, i32 26
>   %88 = extractelement <33 x float> %61, i32 27
>   %89 = extractelement <33 x float> %61, i32 28
>   %90 = extractelement <33 x float> %61, i32 29
>   %91 = extractelement <33 x float> %61, i32 30
>   %92 = extractelement <33 x float> %61, i32 31
>   %93 = extractelement <33 x float> %61, i32 32
>   %94 = add i32 %25, 1
>   %95 = insertelement <33 x float> <float undef, float 0x3FF3333340000000, float 0x40019999A0000000, float 0x40099999A0000000, float 0x4010CCCCC0000000, float 0x4014CCCCC0000000, float 0x4018CCCCC0000000, float 0x401CCCCCC0000000, float 0x4020666660000000, float 0x4022666660000000, float 0x4024666660000000, float 0x4026666660000000, float 0x4028666660000000, float 0x402A666660000000, float 0x402C666660000000, float 0x402E666660000000, float 0x4030333340000000, float 0x4031333340000000, float 0x4032333340000000, float 0x4033333340000000, float 0x4034333340000000, float 0x4035333340000000, float 0x4036333340000000, float 0x4037333340000000, float 0x4038333340000000, float 0x4039333340000000, float 0x403A333340000000, float 0x403B333340000000, float 0x403C333340000000, float 0x403D333340000000, float 0x403E333340000000, float 0x403F333340000000, float 0x40401999A0000000>, float 0x3FC99999A0000000, i32 %94
>   %96 = extractelement <33 x float> %95, i32 1
>   %97 = extractelement <33 x float> %95, i32 2
>   %98 = extractelement <33 x float> %95, i32 3
>   %99 = extractelement <33 x float> %95, i32 4
>   %100 = extractelement <33 x float> %95, i32 5
>   %101 = extractelement <33 x float> %95, i32 6
>   %102 = extractelement <33 x float> %95, i32 7
>   %103 = extractelement <33 x float> %95, i32 8
>   %104 = extractelement <33 x float> %95, i32 9
>   %105 = extractelement <33 x float> %95, i32 10
>   %106 = extractelement <33 x float> %95, i32 11
>   %107 = extractelement <33 x float> %95, i32 12
>   %108 = extractelement <33 x float> %95, i32 13
>   %109 = extractelement <33 x float> %95, i32 14
>   %110 = extractelement <33 x float> %95, i32 15
>   %111 = extractelement <33 x float> %95, i32 16
>   %112 = extractelement <33 x float> %95, i32 17
>   %113 = extractelement <33 x float> %95, i32 18
>   %114 = extractelement <33 x float> %95, i32 19
>   %115 = extractelement <33 x float> %95, i32 20
>   %116 = extractelement <33 x float> %95, i32 21
>   %117 = extractelement <33 x float> %95, i32 22
>   %118 = extractelement <33 x float> %95, i32 23
>   %119 = extractelement <33 x float> %95, i32 24
>   %120 = extractelement <33 x float> %95, i32 25
>   %121 = extractelement <33 x float> %95, i32 26
>   %122 = extractelement <33 x float> %95, i32 27
>   %123 = extractelement <33 x float> %95, i32 28
>   %124 = extractelement <33 x float> %95, i32 29
>   %125 = extractelement <33 x float> %95, i32 30
>   %126 = extractelement <33 x float> %95, i32 31
>   %127 = extractelement <33 x float> %95, i32 32
>   br label %endif40
>
> else37:                                           ; preds = %main_body
>   %128 = add i32 %25, 1
>   %129 = insertelement <33 x float> <float undef, float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+00, float 5.000000e+00, float 6.000000e+00, float 7.000000e+00, float 8.000000e+00, float 9.000000e+00, float 1.000000e+01, float 1.100000e+01, float 1.200000e+01, float 1.300000e+01, float 1.400000e+01, float 1.500000e+01, float 1.600000e+01, float 1.700000e+01, float 1.800000e+01, float 1.900000e+01, float 2.000000e+01, float 2.100000e+01, float 2.200000e+01, float 2.300000e+01, float 2.400000e+01, float 2.500000e+01, float 2.600000e+01, float 2.700000e+01, float 2.800000e+01, float 2.900000e+01, float 3.000000e+01, float 3.100000e+01, float 3.200000e+01>, float 0.000000e+00, i32 %128
>   %130 = extractelement <33 x float> %129, i32 1
>   %131 = extractelement <33 x float> %129, i32 2
>   %132 = extractelement <33 x float> %129, i32 3
>   %133 = extractelement <33 x float> %129, i32 4
>   %134 = extractelement <33 x float> %129, i32 5
>   %135 = extractelement <33 x float> %129, i32 6
>   %136 = extractelement <33 x float> %129, i32 7
>   %137 = extractelement <33 x float> %129, i32 8
>   %138 = extractelement <33 x float> %129, i32 9
>   %139 = extractelement <33 x float> %129, i32 10
>   %140 = extractelement <33 x float> %129, i32 11
>   %141 = extractelement <33 x float> %129, i32 12
>   %142 = extractelement <33 x float> %129, i32 13
>   %143 = extractelement <33 x float> %129, i32 14
>   %144 = extractelement <33 x float> %129, i32 15
>   %145 = extractelement <33 x float> %129, i32 16
>   %146 = extractelement <33 x float> %129, i32 17
>   %147 = extractelement <33 x float> %129, i32 18
>   %148 = extractelement <33 x float> %129, i32 19
>   %149 = extractelement <33 x float> %129, i32 20
>   %150 = extractelement <33 x float> %129, i32 21
>   %151 = extractelement <33 x float> %129, i32 22
>   %152 = extractelement <33 x float> %129, i32 23
>   %153 = extractelement <33 x float> %129, i32 24
>   %154 = extractelement <33 x float> %129, i32 25
>   %155 = extractelement <33 x float> %129, i32 26
>   %156 = extractelement <33 x float> %129, i32 27
>   %157 = extractelement <33 x float> %129, i32 28
>   %158 = extractelement <33 x float> %129, i32 29
>   %159 = extractelement <33 x float> %129, i32 30
>   %160 = extractelement <33 x float> %129, i32 31
>   %161 = extractelement <33 x float> %129, i32 32
>   %162 = add i32 %25, 1
>   %163 = insertelement <33 x float> <float undef, float 0x3FF19999A0000000, float 0x4000CCCCC0000000, float 0x4008CCCCC0000000, float 0x4010666660000000, float 0x4014666660000000, float 0x4018666660000000, float 0x401C666660000000, float 0x4020333340000000, float 0x4022333340000000, float 0x4024333340000000, float 0x4026333340000000, float 0x4028333340000000, float 0x402A333340000000, float 0x402C333340000000, float 0x402E333340000000, float 0x40301999A0000000, float 0x40311999A0000000, float 0x40321999A0000000, float 0x40331999A0000000, float 0x40341999A0000000, float 0x40351999A0000000, float 0x40361999A0000000, float 0x40371999A0000000, float 0x40381999A0000000, float 0x40391999A0000000, float 0x403A1999A0000000, float 0x403B1999A0000000, float 0x403C1999A0000000, float 0x403D1999A0000000, float 0x403E1999A0000000, float 0x403F1999A0000000, float 0x40400CCCC0000000>, float 0x3FB99999A0000000, i32 %162
>   %164 = extractelement <33 x float> %163, i32 1
>   %165 = extractelement <33 x float> %163, i32 2
>   %166 = extractelement <33 x float> %163, i32 3
>   %167 = extractelement <33 x float> %163, i32 4
>   %168 = extractelement <33 x float> %163, i32 5
>   %169 = extractelement <33 x float> %163, i32 6
>   %170 = extractelement <33 x float> %163, i32 7
>   %171 = extractelement <33 x float> %163, i32 8
>   %172 = extractelement <33 x float> %163, i32 9
>   %173 = extractelement <33 x float> %163, i32 10
>   %174 = extractelement <33 x float> %163, i32 11
>   %175 = extractelement <33 x float> %163, i32 12
>   %176 = extractelement <33 x float> %163, i32 13
>   %177 = extractelement <33 x float> %163, i32 14
>   %178 = extractelement <33 x float> %163, i32 15
>   %179 = extractelement <33 x float> %163, i32 16
>   %180 = extractelement <33 x float> %163, i32 17
>   %181 = extractelement <33 x float> %163, i32 18
>   %182 = extractelement <33 x float> %163, i32 19
>   %183 = extractelement <33 x float> %163, i32 20
>   %184 = extractelement <33 x float> %163, i32 21
>   %185 = extractelement <33 x float> %163, i32 22
>   %186 = extractelement <33 x float> %163, i32 23
>   %187 = extractelement <33 x float> %163, i32 24
>   %188 = extractelement <33 x float> %163, i32 25
>   %189 = extractelement <33 x float> %163, i32 26
>   %190 = extractelement <33 x float> %163, i32 27
>   %191 = extractelement <33 x float> %163, i32 28
>   %192 = extractelement <33 x float> %163, i32 29
>   %193 = extractelement <33 x float> %163, i32 30
>   %194 = extractelement <33 x float> %163, i32 31
>   %195 = extractelement <33 x float> %163, i32 32
>   %196 = add i32 %25, 1
>   %197 = insertelement <33 x float> <float undef, float 0x3FF3333340000000, float 0x40019999A0000000, float 0x40099999A0000000, float 0x4010CCCCC0000000, float 0x4014CCCCC0000000, float 0x4018CCCCC0000000, float 0x401CCCCCC0000000, float 0x4020666660000000, float 0x4022666660000000, float 0x4024666660000000, float 0x4026666660000000, float 0x4028666660000000, float 0x402A666660000000, float 0x402C666660000000, float 0x402E666660000000, float 0x4030333340000000, float 0x4031333340000000, float 0x4032333340000000, float 0x4033333340000000, float 0x4034333340000000, float 0x4035333340000000, float 0x4036333340000000, float 0x4037333340000000, float 0x4038333340000000, float 0x4039333340000000, float 0x403A333340000000, float 0x403B333340000000, float 0x403C333340000000, float 0x403D333340000000, float 0x403E333340000000, float 0x403F333340000000, float 0x40401999A0000000>, float 0x3FC99999A0000000, i32 %196
>   %198 = extractelement <33 x float> %197, i32 1
>   %199 = extractelement <33 x float> %197, i32 2
>   %200 = extractelement <33 x float> %197, i32 3
>   %201 = extractelement <33 x float> %197, i32 4
>   %202 = extractelement <33 x float> %197, i32 5
>   %203 = extractelement <33 x float> %197, i32 6
>   %204 = extractelement <33 x float> %197, i32 7
>   %205 = extractelement <33 x float> %197, i32 8
>   %206 = extractelement <33 x float> %197, i32 9
>   %207 = extractelement <33 x float> %197, i32 10
>   %208 = extractelement <33 x float> %197, i32 11
>   %209 = extractelement <33 x float> %197, i32 12
>   %210 = extractelement <33 x float> %197, i32 13
>   %211 = extractelement <33 x float> %197, i32 14
>   %212 = extractelement <33 x float> %197, i32 15
>   %213 = extractelement <33 x float> %197, i32 16
>   %214 = extractelement <33 x float> %197, i32 17
>   %215 = extractelement <33 x float> %197, i32 18
>   %216 = extractelement <33 x float> %197, i32 19
>   %217 = extractelement <33 x float> %197, i32 20
>   %218 = extractelement <33 x float> %197, i32 21
>   %219 = extractelement <33 x float> %197, i32 22
>   %220 = extractelement <33 x float> %197, i32 23
>   %221 = extractelement <33 x float> %197, i32 24
>   %222 = extractelement <33 x float> %197, i32 25
>   %223 = extractelement <33 x float> %197, i32 26
>   %224 = extractelement <33 x float> %197, i32 27
>   %225 = extractelement <33 x float> %197, i32 28
>   %226 = extractelement <33 x float> %197, i32 29
>   %227 = extractelement <33 x float> %197, i32 30
>   %228 = extractelement <33 x float> %197, i32 31
>   %229 = extractelement <33 x float> %197, i32 32
>   br label %endif40
>
> endif40:                                          ; preds = %else37, %if33
>   %OUT1.y.0 = phi float [ %164, %else37 ], [ %62, %if33 ]
>   %OUT1.z.0 = phi float [ %198, %else37 ], [ %96, %if33 ]
>   %OUT2.x.0 = phi float [ %131, %else37 ], [ %29, %if33 ]
>   %OUT2.y.0 = phi float [ %165, %else37 ], [ %63, %if33 ]
>   %OUT2.z.0 = phi float [ %199, %else37 ], [ %97, %if33 ]
>   %OUT3.x.0 = phi float [ %132, %else37 ], [ %30, %if33 ]
>   %OUT3.y.0 = phi float [ %166, %else37 ], [ %64, %if33 ]
>   %OUT3.z.0 = phi float [ %200, %else37 ], [ %98, %if33 ]
>   %OUT4.x.0 = phi float [ %133, %else37 ], [ %31, %if33 ]
>   %OUT4.y.0 = phi float [ %167, %else37 ], [ %65, %if33 ]
>   %OUT4.z.0 = phi float [ %201, %else37 ], [ %99, %if33 ]
>   %OUT5.x.0 = phi float [ %134, %else37 ], [ %32, %if33 ]
>   %OUT5.y.0 = phi float [ %168, %else37 ], [ %66, %if33 ]
>   %OUT5.z.0 = phi float [ %202, %else37 ], [ %100, %if33 ]
>   %OUT6.x.0 = phi float [ %135, %else37 ], [ %33, %if33 ]
>   %OUT6.y.0 = phi float [ %169, %else37 ], [ %67, %if33 ]
>   %OUT6.z.0 = phi float [ %203, %else37 ], [ %101, %if33 ]
>   %OUT7.x.0 = phi float [ %136, %else37 ], [ %34, %if33 ]
>   %OUT7.y.0 = phi float [ %170, %else37 ], [ %68, %if33 ]
>   %OUT7.z.0 = phi float [ %204, %else37 ], [ %102, %if33 ]
>   %OUT8.x.0 = phi float [ %137, %else37 ], [ %35, %if33 ]
>   %OUT8.y.0 = phi float [ %171, %else37 ], [ %69, %if33 ]
>   %OUT8.z.0 = phi float [ %205, %else37 ], [ %103, %if33 ]
>   %OUT9.x.0 = phi float [ %138, %else37 ], [ %36, %if33 ]
>   %OUT9.y.0 = phi float [ %172, %else37 ], [ %70, %if33 ]
>   %OUT9.z.0 = phi float [ %206, %else37 ], [ %104, %if33 ]
>   %OUT10.x.0 = phi float [ %139, %else37 ], [ %37, %if33 ]
>   %OUT10.y.0 = phi float [ %173, %else37 ], [ %71, %if33 ]
>   %OUT10.z.0 = phi float [ %207, %else37 ], [ %105, %if33 ]
>   %OUT11.x.0 = phi float [ %140, %else37 ], [ %38, %if33 ]
>   %OUT11.y.0 = phi float [ %174, %else37 ], [ %72, %if33 ]
>   %OUT11.z.0 = phi float [ %208, %else37 ], [ %106, %if33 ]
>   %OUT12.x.0 = phi float [ %141, %else37 ], [ %39, %if33 ]
>   %OUT12.y.0 = phi float [ %175, %else37 ], [ %73, %if33 ]
>   %OUT12.z.0 = phi float [ %209, %else37 ], [ %107, %if33 ]
>   %OUT13.x.0 = phi float [ %142, %else37 ], [ %40, %if33 ]
>   %OUT13.y.0 = phi float [ %176, %else37 ], [ %74, %if33 ]
>   %OUT13.z.0 = phi float [ %210, %else37 ], [ %108, %if33 ]
>   %OUT14.x.0 = phi float [ %143, %else37 ], [ %41, %if33 ]
>   %OUT14.y.0 = phi float [ %177, %else37 ], [ %75, %if33 ]
>   %OUT14.z.0 = phi float [ %211, %else37 ], [ %109, %if33 ]
>   %OUT15.x.0 = phi float [ %144, %else37 ], [ %42, %if33 ]
>   %OUT15.y.0 = phi float [ %178, %else37 ], [ %76, %if33 ]
>   %OUT15.z.0 = phi float [ %212, %else37 ], [ %110, %if33 ]
>   %OUT16.x.0 = phi float [ %145, %else37 ], [ %43, %if33 ]
>   %OUT16.y.0 = phi float [ %179, %else37 ], [ %77, %if33 ]
>   %OUT16.z.0 = phi float [ %213, %else37 ], [ %111, %if33 ]
>   %OUT17.x.0 = phi float [ %146, %else37 ], [ %44, %if33 ]
>   %OUT17.y.0 = phi float [ %180, %else37 ], [ %78, %if33 ]
>   %OUT17.z.0 = phi float [ %214, %else37 ], [ %112, %if33 ]
>   %OUT18.x.0 = phi float [ %147, %else37 ], [ %45, %if33 ]
>   %OUT18.y.0 = phi float [ %181, %else37 ], [ %79, %if33 ]
>   %OUT18.z.0 = phi float [ %215, %else37 ], [ %113, %if33 ]
>   %OUT19.x.0 = phi float [ %148, %else37 ], [ %46, %if33 ]
>   %OUT19.y.0 = phi float [ %182, %else37 ], [ %80, %if33 ]
>   %OUT19.z.0 = phi float [ %216, %else37 ], [ %114, %if33 ]
>   %OUT20.x.0 = phi float [ %149, %else37 ], [ %47, %if33 ]
>   %OUT20.y.0 = phi float [ %183, %else37 ], [ %81, %if33 ]
>   %OUT20.z.0 = phi float [ %217, %else37 ], [ %115, %if33 ]
>   %OUT21.x.0 = phi float [ %150, %else37 ], [ %48, %if33 ]
>   %OUT21.y.0 = phi float [ %184, %else37 ], [ %82, %if33 ]
>   %OUT21.z.0 = phi float [ %218, %else37 ], [ %116, %if33 ]
>   %OUT22.x.0 = phi float [ %151, %else37 ], [ %49, %if33 ]
>   %OUT22.y.0 = phi float [ %185, %else37 ], [ %83, %if33 ]
>   %OUT22.z.0 = phi float [ %219, %else37 ], [ %117, %if33 ]
>   %OUT23.x.0 = phi float [ %152, %else37 ], [ %50, %if33 ]
>   %OUT23.y.0 = phi float [ %186, %else37 ], [ %84, %if33 ]
>   %OUT23.z.0 = phi float [ %220, %else37 ], [ %118, %if33 ]
>   %OUT24.x.0 = phi float [ %153, %else37 ], [ %51, %if33 ]
>   %OUT24.y.0 = phi float [ %187, %else37 ], [ %85, %if33 ]
>   %OUT24.z.0 = phi float [ %221, %else37 ], [ %119, %if33 ]
>   %OUT25.x.0 = phi float [ %154, %else37 ], [ %52, %if33 ]
>   %OUT25.y.0 = phi float [ %188, %else37 ], [ %86, %if33 ]
>   %OUT25.z.0 = phi float [ %222, %else37 ], [ %120, %if33 ]
>   %OUT26.x.0 = phi float [ %155, %else37 ], [ %53, %if33 ]
>   %OUT26.y.0 = phi float [ %189, %else37 ], [ %87, %if33 ]
>   %OUT26.z.0 = phi float [ %223, %else37 ], [ %121, %if33 ]
>   %OUT27.x.0 = phi float [ %156, %else37 ], [ %54, %if33 ]
>   %OUT27.y.0 = phi float [ %190, %else37 ], [ %88, %if33 ]
>   %OUT27.z.0 = phi float [ %224, %else37 ], [ %122, %if33 ]
>   %OUT28.x.0 = phi float [ %157, %else37 ], [ %55, %if33 ]
>   %OUT28.y.0 = phi float [ %191, %else37 ], [ %89, %if33 ]
>   %OUT28.z.0 = phi float [ %225, %else37 ], [ %123, %if33 ]
>   %OUT29.x.0 = phi float [ %158, %else37 ], [ %56, %if33 ]
>   %OUT29.y.0 = phi float [ %192, %else37 ], [ %90, %if33 ]
>   %OUT29.z.0 = phi float [ %226, %else37 ], [ %124, %if33 ]
>   %OUT30.x.0 = phi float [ %159, %else37 ], [ %57, %if33 ]
>   %OUT30.y.0 = phi float [ %193, %else37 ], [ %91, %if33 ]
>   %OUT30.z.0 = phi float [ %227, %else37 ], [ %125, %if33 ]
>   %OUT31.x.0 = phi float [ %160, %else37 ], [ %58, %if33 ]
>   %OUT31.y.0 = phi float [ %194, %else37 ], [ %92, %if33 ]
>   %OUT31.z.0 = phi float [ %228, %else37 ], [ %126, %if33 ]
>   %OUT32.x.0 = phi float [ %161, %else37 ], [ %59, %if33 ]
>   %OUT32.y.0 = phi float [ %195, %else37 ], [ %93, %if33 ]
>   %OUT32.z.0 = phi float [ %229, %else37 ], [ %127, %if33 ]
>   %OUT1.x.0 = phi float [ %130, %else37 ], [ %28, %if33 ]
>   %.sink = add i32 %25, 1
>   %230 = insertelement <33 x float> <float undef, float 0x3FF4CCCCC0000000, float 0x4002666660000000, float 0x400A666660000000, float 0x4011333340000000, float 0x4015333340000000, float 0x4019333340000000, float 0x401D333340000000, float 0x40209999A0000000, float 0x40229999A0000000, float 0x40249999A0000000, float 0x40269999A0000000, float 0x40289999A0000000, float 0x402A9999A0000000, float 0x402C9999A0000000, float 0x402E9999A0000000, float 0x40304CCCC0000000, float 0x40314CCCC0000000, float 0x40324CCCC0000000, float 0x40334CCCC0000000, float 0x40344CCCC0000000, float 0x40354CCCC0000000, float 0x40364CCCC0000000, float 0x40374CCCC0000000, float 0x40384CCCC0000000, float 0x40394CCCC0000000, float 0x403A4CCCC0000000, float 0x403B4CCCC0000000, float 0x403C4CCCC0000000, float 0x403D4CCCC0000000, float 0x403E4CCCC0000000, float 0x403F4CCCC0000000, float 0x4040266660000000>, float 0x3FD3333340000000, i32 %.sink
>   %231 = shl i32 %7, 2
>   %232 = call i32 @llvm.SI.buffer.load.dword.i32.i32(<16 x i8> %16, i32 %231, i32 0, i32 0, i32 1, i32 0, i32 1, i32 0, i32 0)
>   %233 = bitcast i32 %232 to float
>   %234 = shl i32 %7, 2
>   %235 = call i32 @llvm.SI.buffer.load.dword.i32.i32(<16 x i8> %16, i32 %234, i32 256, i32 0, i32 1, i32 0, i32 1, i32 0, i32 0)
>   %236 = bitcast i32 %235 to float
>   %237 = shl i32 %7, 2
>   %238 = call i32 @llvm.SI.buffer.load.dword.i32.i32(<16 x i8> %16, i32 %237, i32 512, i32 0, i32 1, i32 0, i32 1, i32 0, i32 0)
>   %239 = bitcast i32 %238 to float
>   %240 = shl i32 %7, 2
>   %241 = call i32 @llvm.SI.buffer.load.dword.i32.i32(<16 x i8> %16, i32 %240, i32 768, i32 0, i32 1, i32 0, i32 1, i32 0, i32 0)
>   %242 = bitcast i32 %241 to float
>   call void @llvm.AMDGPU.kill(float 1.000000e+00)
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %232, i32 1, i32 0, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %235, i32 1, i32 12, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %238, i32 1, i32 24, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %241, i32 1, i32 36, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %243 = bitcast float %OUT1.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %243, i32 1, i32 48, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %244 = bitcast float %OUT1.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %244, i32 1, i32 60, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %245 = bitcast float %OUT1.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %245, i32 1, i32 72, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc = bitcast <33 x float> %230 to <33 x i32>
>   %246 = extractelement <33 x i32> %bc, i32 1
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %246, i32 1, i32 84, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %247 = bitcast float %OUT2.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %247, i32 1, i32 96, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %248 = bitcast float %OUT2.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %248, i32 1, i32 108, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %249 = bitcast float %OUT2.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %249, i32 1, i32 120, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc887 = bitcast <33 x float> %230 to <33 x i32>
>   %250 = extractelement <33 x i32> %bc887, i32 2
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %250, i32 1, i32 132, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %251 = bitcast float %OUT3.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %251, i32 1, i32 144, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %252 = bitcast float %OUT3.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %252, i32 1, i32 156, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %253 = bitcast float %OUT3.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %253, i32 1, i32 168, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc888 = bitcast <33 x float> %230 to <33 x i32>
>   %254 = extractelement <33 x i32> %bc888, i32 3
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %254, i32 1, i32 180, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %255 = bitcast float %OUT4.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %255, i32 1, i32 192, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %256 = bitcast float %OUT4.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %256, i32 1, i32 204, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %257 = bitcast float %OUT4.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %257, i32 1, i32 216, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc889 = bitcast <33 x float> %230 to <33 x i32>
>   %258 = extractelement <33 x i32> %bc889, i32 4
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %258, i32 1, i32 228, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %259 = bitcast float %OUT5.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %259, i32 1, i32 240, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %260 = bitcast float %OUT5.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %260, i32 1, i32 252, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %261 = bitcast float %OUT5.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %261, i32 1, i32 264, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc890 = bitcast <33 x float> %230 to <33 x i32>
>   %262 = extractelement <33 x i32> %bc890, i32 5
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %262, i32 1, i32 276, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %263 = bitcast float %OUT6.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %263, i32 1, i32 288, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %264 = bitcast float %OUT6.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %264, i32 1, i32 300, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %265 = bitcast float %OUT6.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %265, i32 1, i32 312, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc891 = bitcast <33 x float> %230 to <33 x i32>
>   %266 = extractelement <33 x i32> %bc891, i32 6
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %266, i32 1, i32 324, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %267 = bitcast float %OUT7.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %267, i32 1, i32 336, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %268 = bitcast float %OUT7.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %268, i32 1, i32 348, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %269 = bitcast float %OUT7.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %269, i32 1, i32 360, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc892 = bitcast <33 x float> %230 to <33 x i32>
>   %270 = extractelement <33 x i32> %bc892, i32 7
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %270, i32 1, i32 372, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %271 = bitcast float %OUT8.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %271, i32 1, i32 384, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %272 = bitcast float %OUT8.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %272, i32 1, i32 396, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %273 = bitcast float %OUT8.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %273, i32 1, i32 408, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc893 = bitcast <33 x float> %230 to <33 x i32>
>   %274 = extractelement <33 x i32> %bc893, i32 8
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %274, i32 1, i32 420, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %275 = bitcast float %OUT9.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %275, i32 1, i32 432, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %276 = bitcast float %OUT9.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %276, i32 1, i32 444, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %277 = bitcast float %OUT9.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %277, i32 1, i32 456, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc894 = bitcast <33 x float> %230 to <33 x i32>
>   %278 = extractelement <33 x i32> %bc894, i32 9
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %278, i32 1, i32 468, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %279 = bitcast float %OUT10.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %279, i32 1, i32 480, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %280 = bitcast float %OUT10.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %280, i32 1, i32 492, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %281 = bitcast float %OUT10.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %281, i32 1, i32 504, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc895 = bitcast <33 x float> %230 to <33 x i32>
>   %282 = extractelement <33 x i32> %bc895, i32 10
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %282, i32 1, i32 516, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %283 = bitcast float %OUT11.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %283, i32 1, i32 528, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %284 = bitcast float %OUT11.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %284, i32 1, i32 540, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %285 = bitcast float %OUT11.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %285, i32 1, i32 552, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc896 = bitcast <33 x float> %230 to <33 x i32>
>   %286 = extractelement <33 x i32> %bc896, i32 11
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %286, i32 1, i32 564, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %287 = bitcast float %OUT12.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %287, i32 1, i32 576, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %288 = bitcast float %OUT12.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %288, i32 1, i32 588, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %289 = bitcast float %OUT12.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %289, i32 1, i32 600, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc897 = bitcast <33 x float> %230 to <33 x i32>
>   %290 = extractelement <33 x i32> %bc897, i32 12
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %290, i32 1, i32 612, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %291 = bitcast float %OUT13.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %291, i32 1, i32 624, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %292 = bitcast float %OUT13.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %292, i32 1, i32 636, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %293 = bitcast float %OUT13.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %293, i32 1, i32 648, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc898 = bitcast <33 x float> %230 to <33 x i32>
>   %294 = extractelement <33 x i32> %bc898, i32 13
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %294, i32 1, i32 660, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %295 = bitcast float %OUT14.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %295, i32 1, i32 672, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %296 = bitcast float %OUT14.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %296, i32 1, i32 684, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %297 = bitcast float %OUT14.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %297, i32 1, i32 696, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc899 = bitcast <33 x float> %230 to <33 x i32>
>   %298 = extractelement <33 x i32> %bc899, i32 14
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %298, i32 1, i32 708, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %299 = bitcast float %OUT15.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %299, i32 1, i32 720, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %300 = bitcast float %OUT15.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %300, i32 1, i32 732, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %301 = bitcast float %OUT15.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %301, i32 1, i32 744, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc900 = bitcast <33 x float> %230 to <33 x i32>
>   %302 = extractelement <33 x i32> %bc900, i32 15
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %302, i32 1, i32 756, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %303 = bitcast float %OUT16.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %303, i32 1, i32 768, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %304 = bitcast float %OUT16.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %304, i32 1, i32 780, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %305 = bitcast float %OUT16.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %305, i32 1, i32 792, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc901 = bitcast <33 x float> %230 to <33 x i32>
>   %306 = extractelement <33 x i32> %bc901, i32 16
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %306, i32 1, i32 804, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %307 = bitcast float %OUT17.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %307, i32 1, i32 816, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %308 = bitcast float %OUT17.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %308, i32 1, i32 828, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %309 = bitcast float %OUT17.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %309, i32 1, i32 840, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc902 = bitcast <33 x float> %230 to <33 x i32>
>   %310 = extractelement <33 x i32> %bc902, i32 17
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %310, i32 1, i32 852, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %311 = bitcast float %OUT18.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %311, i32 1, i32 864, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %312 = bitcast float %OUT18.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %312, i32 1, i32 876, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %313 = bitcast float %OUT18.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %313, i32 1, i32 888, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc903 = bitcast <33 x float> %230 to <33 x i32>
>   %314 = extractelement <33 x i32> %bc903, i32 18
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %314, i32 1, i32 900, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %315 = bitcast float %OUT19.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %315, i32 1, i32 912, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %316 = bitcast float %OUT19.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %316, i32 1, i32 924, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %317 = bitcast float %OUT19.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %317, i32 1, i32 936, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc904 = bitcast <33 x float> %230 to <33 x i32>
>   %318 = extractelement <33 x i32> %bc904, i32 19
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %318, i32 1, i32 948, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %319 = bitcast float %OUT20.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %319, i32 1, i32 960, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %320 = bitcast float %OUT20.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %320, i32 1, i32 972, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %321 = bitcast float %OUT20.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %321, i32 1, i32 984, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc905 = bitcast <33 x float> %230 to <33 x i32>
>   %322 = extractelement <33 x i32> %bc905, i32 20
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %322, i32 1, i32 996, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %323 = bitcast float %OUT21.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %323, i32 1, i32 1008, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %324 = bitcast float %OUT21.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %324, i32 1, i32 1020, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %325 = bitcast float %OUT21.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %325, i32 1, i32 1032, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc906 = bitcast <33 x float> %230 to <33 x i32>
>   %326 = extractelement <33 x i32> %bc906, i32 21
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %326, i32 1, i32 1044, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %327 = bitcast float %OUT22.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %327, i32 1, i32 1056, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %328 = bitcast float %OUT22.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %328, i32 1, i32 1068, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %329 = bitcast float %OUT22.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %329, i32 1, i32 1080, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc907 = bitcast <33 x float> %230 to <33 x i32>
>   %330 = extractelement <33 x i32> %bc907, i32 22
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %330, i32 1, i32 1092, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %331 = bitcast float %OUT23.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %331, i32 1, i32 1104, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %332 = bitcast float %OUT23.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %332, i32 1, i32 1116, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %333 = bitcast float %OUT23.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %333, i32 1, i32 1128, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc908 = bitcast <33 x float> %230 to <33 x i32>
>   %334 = extractelement <33 x i32> %bc908, i32 23
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %334, i32 1, i32 1140, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %335 = bitcast float %OUT24.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %335, i32 1, i32 1152, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %336 = bitcast float %OUT24.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %336, i32 1, i32 1164, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %337 = bitcast float %OUT24.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %337, i32 1, i32 1176, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc909 = bitcast <33 x float> %230 to <33 x i32>
>   %338 = extractelement <33 x i32> %bc909, i32 24
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %338, i32 1, i32 1188, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %339 = bitcast float %OUT25.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %339, i32 1, i32 1200, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %340 = bitcast float %OUT25.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %340, i32 1, i32 1212, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %341 = bitcast float %OUT25.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %341, i32 1, i32 1224, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc910 = bitcast <33 x float> %230 to <33 x i32>
>   %342 = extractelement <33 x i32> %bc910, i32 25
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %342, i32 1, i32 1236, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %343 = bitcast float %OUT26.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %343, i32 1, i32 1248, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %344 = bitcast float %OUT26.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %344, i32 1, i32 1260, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %345 = bitcast float %OUT26.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %345, i32 1, i32 1272, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc911 = bitcast <33 x float> %230 to <33 x i32>
>   %346 = extractelement <33 x i32> %bc911, i32 26
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %346, i32 1, i32 1284, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %347 = bitcast float %OUT27.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %347, i32 1, i32 1296, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %348 = bitcast float %OUT27.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %348, i32 1, i32 1308, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %349 = bitcast float %OUT27.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %349, i32 1, i32 1320, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc912 = bitcast <33 x float> %230 to <33 x i32>
>   %350 = extractelement <33 x i32> %bc912, i32 27
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %350, i32 1, i32 1332, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %351 = bitcast float %OUT28.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %351, i32 1, i32 1344, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %352 = bitcast float %OUT28.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %352, i32 1, i32 1356, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %353 = bitcast float %OUT28.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %353, i32 1, i32 1368, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc913 = bitcast <33 x float> %230 to <33 x i32>
>   %354 = extractelement <33 x i32> %bc913, i32 28
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %354, i32 1, i32 1380, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %355 = bitcast float %OUT29.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %355, i32 1, i32 1392, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %356 = bitcast float %OUT29.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %356, i32 1, i32 1404, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %357 = bitcast float %OUT29.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %357, i32 1, i32 1416, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc914 = bitcast <33 x float> %230 to <33 x i32>
>   %358 = extractelement <33 x i32> %bc914, i32 29
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %358, i32 1, i32 1428, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %359 = bitcast float %OUT30.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %359, i32 1, i32 1440, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %360 = bitcast float %OUT30.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %360, i32 1, i32 1452, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %361 = bitcast float %OUT30.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %361, i32 1, i32 1464, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc915 = bitcast <33 x float> %230 to <33 x i32>
>   %362 = extractelement <33 x i32> %bc915, i32 30
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %362, i32 1, i32 1476, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %363 = bitcast float %OUT31.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %363, i32 1, i32 1488, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %364 = bitcast float %OUT31.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %364, i32 1, i32 1500, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %365 = bitcast float %OUT31.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %365, i32 1, i32 1512, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc916 = bitcast <33 x float> %230 to <33 x i32>
>   %366 = extractelement <33 x i32> %bc916, i32 31
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %366, i32 1, i32 1524, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %367 = bitcast float %OUT32.x.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %367, i32 1, i32 1536, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %368 = bitcast float %OUT32.y.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %368, i32 1, i32 1548, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %369 = bitcast float %OUT32.z.0 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %369, i32 1, i32 1560, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc917 = bitcast <33 x float> %230 to <33 x i32>
>   %370 = extractelement <33 x i32> %bc917, i32 32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %370, i32 1, i32 1572, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   call void @llvm.SI.sendmsg(i32 34, i32 %6)
>   %371 = getelementptr [16 x <16 x i8>], [16 x <16 x i8>] addrspace(2)* %1, i64 0, i64 0, !amdgpu.uniform !0
>   %372 = load <16 x i8>, <16 x i8> addrspace(2)* %371, align 16, !invariant.load !0
>   %373 = call float @llvm.SI.load.const(<16 x i8> %372, i32 0)
>   %374 = bitcast float %373 to i32
>   %375 = icmp sgt i32 %374, 15
>   %376 = call float @llvm.SI.load.const(<16 x i8> %372, i32 0)
>   %377 = bitcast float %376 to i32
>   br i1 %375, label %if76, label %else80
>
> if76:                                             ; preds = %endif40
>   %378 = add i32 %377, 1
>   %array_vector264 = insertelement <33 x float> undef, float %233, i32 0
>   %array_vector296 = shufflevector <33 x float> %array_vector264, <33 x float> <float undef, float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+00, float 5.000000e+00, float 6.000000e+00, float 7.000000e+00, float 8.000000e+00, float 9.000000e+00, float 1.000000e+01, float 1.100000e+01, float 1.200000e+01, float 1.300000e+01, float 1.400000e+01, float 1.500000e+01, float 1.600000e+01, float 1.700000e+01, float 1.800000e+01, float 1.900000e+01, float 2.000000e+01, float 2.100000e+01, float 2.200000e+01, float 2.300000e+01, float 2.400000e+01, float 2.500000e+01, float 2.600000e+01, float 2.700000e+01, float 2.800000e+01, float 2.900000e+01, float 3.000000e+01, float 3.100000e+01, float 3.200000e+01>, <33 x i32> <i32 0, i32 34, i32 35, i32 36, i32 37, i32 38, i32 39, i32 40, i32 41, i32 42, i32 43, i32 44, i32 45, i32 46, i32 47, i32 48, i32 49, i32 50, i32 51, i32 52, i32 53, i32 54, i32 55, i32 56, i32 57, i32 58, i32 59, i32 60, i32 61, i32 62, i32 63, i32 64, i32 65>
>   %379 = insertelement <33 x float> %array_vector296, float 0.000000e+00, i32 %378
>   %380 = extractelement <33 x float> %379, i32 1
>   %381 = extractelement <33 x float> %379, i32 2
>   %382 = extractelement <33 x float> %379, i32 3
>   %383 = extractelement <33 x float> %379, i32 4
>   %384 = extractelement <33 x float> %379, i32 5
>   %385 = extractelement <33 x float> %379, i32 6
>   %386 = extractelement <33 x float> %379, i32 7
>   %387 = extractelement <33 x float> %379, i32 8
>   %388 = extractelement <33 x float> %379, i32 9
>   %389 = extractelement <33 x float> %379, i32 10
>   %390 = extractelement <33 x float> %379, i32 11
>   %391 = extractelement <33 x float> %379, i32 12
>   %392 = extractelement <33 x float> %379, i32 13
>   %393 = extractelement <33 x float> %379, i32 14
>   %394 = extractelement <33 x float> %379, i32 15
>   %395 = extractelement <33 x float> %379, i32 16
>   %396 = extractelement <33 x float> %379, i32 17
>   %397 = extractelement <33 x float> %379, i32 18
>   %398 = extractelement <33 x float> %379, i32 19
>   %399 = extractelement <33 x float> %379, i32 20
>   %400 = extractelement <33 x float> %379, i32 21
>   %401 = extractelement <33 x float> %379, i32 22
>   %402 = extractelement <33 x float> %379, i32 23
>   %403 = extractelement <33 x float> %379, i32 24
>   %404 = extractelement <33 x float> %379, i32 25
>   %405 = extractelement <33 x float> %379, i32 26
>   %406 = extractelement <33 x float> %379, i32 27
>   %407 = extractelement <33 x float> %379, i32 28
>   %408 = extractelement <33 x float> %379, i32 29
>   %409 = extractelement <33 x float> %379, i32 30
>   %410 = extractelement <33 x float> %379, i32 31
>   %411 = extractelement <33 x float> %379, i32 32
>   %412 = add i32 %377, 1
>   %array_vector297 = insertelement <33 x float> undef, float %236, i32 0
>   %array_vector329 = shufflevector <33 x float> %array_vector297, <33 x float> <float undef, float 0x3FF19999A0000000, float 0x4000CCCCC0000000, float 0x4008CCCCC0000000, float 0x4010666660000000, float 0x4014666660000000, float 0x4018666660000000, float 0x401C666660000000, float 0x4020333340000000, float 0x4022333340000000, float 0x4024333340000000, float 0x4026333340000000, float 0x4028333340000000, float 0x402A333340000000, float 0x402C333340000000, float 0x402E333340000000, float 0x40301999A0000000, float 0x40311999A0000000, float 0x40321999A0000000, float 0x40331999A0000000, float 0x40341999A0000000, float 0x40351999A0000000, float 0x40361999A0000000, float 0x40371999A0000000, float 0x40381999A0000000, float 0x40391999A0000000, float 0x403A1999A0000000, float 0x403B1999A0000000, float 0x403C1999A0000000, float 0x403D1999A0000000, float 0x403E1999A0000000, float 0x403F1999A0000000, float 0x40400CCCC0000000>, <33 x i32> <i32 0, i32 34, i32 35, i32 36, i32 37, i32 38, i32 39, i32 40, i32 41, i32 42, i32 43, i32 44, i32 45, i32 46, i32 47, i32 48, i32 49, i32 50, i32 51, i32 52, i32 53, i32 54, i32 55, i32 56, i32 57, i32 58, i32 59, i32 60, i32 61, i32 62, i32 63, i32 64, i32 65>
>   %413 = insertelement <33 x float> %array_vector329, float 0x3FB99999A0000000, i32 %412
>   %414 = extractelement <33 x float> %413, i32 1
>   %415 = extractelement <33 x float> %413, i32 2
>   %416 = extractelement <33 x float> %413, i32 3
>   %417 = extractelement <33 x float> %413, i32 4
>   %418 = extractelement <33 x float> %413, i32 5
>   %419 = extractelement <33 x float> %413, i32 6
>   %420 = extractelement <33 x float> %413, i32 7
>   %421 = extractelement <33 x float> %413, i32 8
>   %422 = extractelement <33 x float> %413, i32 9
>   %423 = extractelement <33 x float> %413, i32 10
>   %424 = extractelement <33 x float> %413, i32 11
>   %425 = extractelement <33 x float> %413, i32 12
>   %426 = extractelement <33 x float> %413, i32 13
>   %427 = extractelement <33 x float> %413, i32 14
>   %428 = extractelement <33 x float> %413, i32 15
>   %429 = extractelement <33 x float> %413, i32 16
>   %430 = extractelement <33 x float> %413, i32 17
>   %431 = extractelement <33 x float> %413, i32 18
>   %432 = extractelement <33 x float> %413, i32 19
>   %433 = extractelement <33 x float> %413, i32 20
>   %434 = extractelement <33 x float> %413, i32 21
>   %435 = extractelement <33 x float> %413, i32 22
>   %436 = extractelement <33 x float> %413, i32 23
>   %437 = extractelement <33 x float> %413, i32 24
>   %438 = extractelement <33 x float> %413, i32 25
>   %439 = extractelement <33 x float> %413, i32 26
>   %440 = extractelement <33 x float> %413, i32 27
>   %441 = extractelement <33 x float> %413, i32 28
>   %442 = extractelement <33 x float> %413, i32 29
>   %443 = extractelement <33 x float> %413, i32 30
>   %444 = extractelement <33 x float> %413, i32 31
>   %445 = extractelement <33 x float> %413, i32 32
>   %446 = add i32 %377, 1
>   %array_vector330 = insertelement <33 x float> undef, float %239, i32 0
>   %array_vector362 = shufflevector <33 x float> %array_vector330, <33 x float> <float undef, float 0x3FF3333340000000, float 0x40019999A0000000, float 0x40099999A0000000, float 0x4010CCCCC0000000, float 0x4014CCCCC0000000, float 0x4018CCCCC0000000, float 0x401CCCCCC0000000, float 0x4020666660000000, float 0x4022666660000000, float 0x4024666660000000, float 0x4026666660000000, float 0x4028666660000000, float 0x402A666660000000, float 0x402C666660000000, float 0x402E666660000000, float 0x4030333340000000, float 0x4031333340000000, float 0x4032333340000000, float 0x4033333340000000, float 0x4034333340000000, float 0x4035333340000000, float 0x4036333340000000, float 0x4037333340000000, float 0x4038333340000000, float 0x4039333340000000, float 0x403A333340000000, float 0x403B333340000000, float 0x403C333340000000, float 0x403D333340000000, float 0x403E333340000000, float 0x403F333340000000, float 0x40401999A0000000>, <33 x i32> <i32 0, i32 34, i32 35, i32 36, i32 37, i32 38, i32 39, i32 40, i32 41, i32 42, i32 43, i32 44, i32 45, i32 46, i32 47, i32 48, i32 49, i32 50, i32 51, i32 52, i32 53, i32 54, i32 55, i32 56, i32 57, i32 58, i32 59, i32 60, i32 61, i32 62, i32 63, i32 64, i32 65>
>   %447 = insertelement <33 x float> %array_vector362, float 0x3FC99999A0000000, i32 %446
>   %448 = extractelement <33 x float> %447, i32 1
>   %449 = extractelement <33 x float> %447, i32 2
>   %450 = extractelement <33 x float> %447, i32 3
>   %451 = extractelement <33 x float> %447, i32 4
>   %452 = extractelement <33 x float> %447, i32 5
>   %453 = extractelement <33 x float> %447, i32 6
>   %454 = extractelement <33 x float> %447, i32 7
>   %455 = extractelement <33 x float> %447, i32 8
>   %456 = extractelement <33 x float> %447, i32 9
>   %457 = extractelement <33 x float> %447, i32 10
>   %458 = extractelement <33 x float> %447, i32 11
>   %459 = extractelement <33 x float> %447, i32 12
>   %460 = extractelement <33 x float> %447, i32 13
>   %461 = extractelement <33 x float> %447, i32 14
>   %462 = extractelement <33 x float> %447, i32 15
>   %463 = extractelement <33 x float> %447, i32 16
>   %464 = extractelement <33 x float> %447, i32 17
>   %465 = extractelement <33 x float> %447, i32 18
>   %466 = extractelement <33 x float> %447, i32 19
>   %467 = extractelement <33 x float> %447, i32 20
>   %468 = extractelement <33 x float> %447, i32 21
>   %469 = extractelement <33 x float> %447, i32 22
>   %470 = extractelement <33 x float> %447, i32 23
>   %471 = extractelement <33 x float> %447, i32 24
>   %472 = extractelement <33 x float> %447, i32 25
>   %473 = extractelement <33 x float> %447, i32 26
>   %474 = extractelement <33 x float> %447, i32 27
>   %475 = extractelement <33 x float> %447, i32 28
>   %476 = extractelement <33 x float> %447, i32 29
>   %477 = extractelement <33 x float> %447, i32 30
>   %478 = extractelement <33 x float> %447, i32 31
>   %479 = extractelement <33 x float> %447, i32 32
>   br label %endif83
>
> else80:                                           ; preds = %endif40
>   %480 = add i32 %377, 1
>   %array_vector396 = insertelement <33 x float> undef, float %233, i32 0
>   %array_vector428 = shufflevector <33 x float> %array_vector396, <33 x float> <float undef, float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+00, float 5.000000e+00, float 6.000000e+00, float 7.000000e+00, float 8.000000e+00, float 9.000000e+00, float 1.000000e+01, float 1.100000e+01, float 1.200000e+01, float 1.300000e+01, float 1.400000e+01, float 1.500000e+01, float 1.600000e+01, float 1.700000e+01, float 1.800000e+01, float 1.900000e+01, float 2.000000e+01, float 2.100000e+01, float 2.200000e+01, float 2.300000e+01, float 2.400000e+01, float 2.500000e+01, float 2.600000e+01, float 2.700000e+01, float 2.800000e+01, float 2.900000e+01, float 3.000000e+01, float 3.100000e+01, float 3.200000e+01>, <33 x i32> <i32 0, i32 34, i32 35, i32 36, i32 37, i32 38, i32 39, i32 40, i32 41, i32 42, i32 43, i32 44, i32 45, i32 46, i32 47, i32 48, i32 49, i32 50, i32 51, i32 52, i32 53, i32 54, i32 55, i32 56, i32 57, i32 58, i32 59, i32 60, i32 61, i32 62, i32 63, i32 64, i32 65>
>   %481 = insertelement <33 x float> %array_vector428, float 0.000000e+00, i32 %480
>   %482 = extractelement <33 x float> %481, i32 1
>   %483 = extractelement <33 x float> %481, i32 2
>   %484 = extractelement <33 x float> %481, i32 3
>   %485 = extractelement <33 x float> %481, i32 4
>   %486 = extractelement <33 x float> %481, i32 5
>   %487 = extractelement <33 x float> %481, i32 6
>   %488 = extractelement <33 x float> %481, i32 7
>   %489 = extractelement <33 x float> %481, i32 8
>   %490 = extractelement <33 x float> %481, i32 9
>   %491 = extractelement <33 x float> %481, i32 10
>   %492 = extractelement <33 x float> %481, i32 11
>   %493 = extractelement <33 x float> %481, i32 12
>   %494 = extractelement <33 x float> %481, i32 13
>   %495 = extractelement <33 x float> %481, i32 14
>   %496 = extractelement <33 x float> %481, i32 15
>   %497 = extractelement <33 x float> %481, i32 16
>   %498 = extractelement <33 x float> %481, i32 17
>   %499 = extractelement <33 x float> %481, i32 18
>   %500 = extractelement <33 x float> %481, i32 19
>   %501 = extractelement <33 x float> %481, i32 20
>   %502 = extractelement <33 x float> %481, i32 21
>   %503 = extractelement <33 x float> %481, i32 22
>   %504 = extractelement <33 x float> %481, i32 23
>   %505 = extractelement <33 x float> %481, i32 24
>   %506 = extractelement <33 x float> %481, i32 25
>   %507 = extractelement <33 x float> %481, i32 26
>   %508 = extractelement <33 x float> %481, i32 27
>   %509 = extractelement <33 x float> %481, i32 28
>   %510 = extractelement <33 x float> %481, i32 29
>   %511 = extractelement <33 x float> %481, i32 30
>   %512 = extractelement <33 x float> %481, i32 31
>   %513 = extractelement <33 x float> %481, i32 32
>   %514 = add i32 %377, 1
>   %array_vector429 = insertelement <33 x float> undef, float %236, i32 0
>   %array_vector461 = shufflevector <33 x float> %array_vector429, <33 x float> <float undef, float 0x3FF19999A0000000, float 0x4000CCCCC0000000, float 0x4008CCCCC0000000, float 0x4010666660000000, float 0x4014666660000000, float 0x4018666660000000, float 0x401C666660000000, float 0x4020333340000000, float 0x4022333340000000, float 0x4024333340000000, float 0x4026333340000000, float 0x4028333340000000, float 0x402A333340000000, float 0x402C333340000000, float 0x402E333340000000, float 0x40301999A0000000, float 0x40311999A0000000, float 0x40321999A0000000, float 0x40331999A0000000, float 0x40341999A0000000, float 0x40351999A0000000, float 0x40361999A0000000, float 0x40371999A0000000, float 0x40381999A0000000, float 0x40391999A0000000, float 0x403A1999A0000000, float 0x403B1999A0000000, float 0x403C1999A0000000, float 0x403D1999A0000000, float 0x403E1999A0000000, float 0x403F1999A0000000, float 0x40400CCCC0000000>, <33 x i32> <i32 0, i32 34, i32 35, i32 36, i32 37, i32 38, i32 39, i32 40, i32 41, i32 42, i32 43, i32 44, i32 45, i32 46, i32 47, i32 48, i32 49, i32 50, i32 51, i32 52, i32 53, i32 54, i32 55, i32 56, i32 57, i32 58, i32 59, i32 60, i32 61, i32 62, i32 63, i32 64, i32 65>
>   %515 = insertelement <33 x float> %array_vector461, float 0x3FB99999A0000000, i32 %514
>   %516 = extractelement <33 x float> %515, i32 1
>   %517 = extractelement <33 x float> %515, i32 2
>   %518 = extractelement <33 x float> %515, i32 3
>   %519 = extractelement <33 x float> %515, i32 4
>   %520 = extractelement <33 x float> %515, i32 5
>   %521 = extractelement <33 x float> %515, i32 6
>   %522 = extractelement <33 x float> %515, i32 7
>   %523 = extractelement <33 x float> %515, i32 8
>   %524 = extractelement <33 x float> %515, i32 9
>   %525 = extractelement <33 x float> %515, i32 10
>   %526 = extractelement <33 x float> %515, i32 11
>   %527 = extractelement <33 x float> %515, i32 12
>   %528 = extractelement <33 x float> %515, i32 13
>   %529 = extractelement <33 x float> %515, i32 14
>   %530 = extractelement <33 x float> %515, i32 15
>   %531 = extractelement <33 x float> %515, i32 16
>   %532 = extractelement <33 x float> %515, i32 17
>   %533 = extractelement <33 x float> %515, i32 18
>   %534 = extractelement <33 x float> %515, i32 19
>   %535 = extractelement <33 x float> %515, i32 20
>   %536 = extractelement <33 x float> %515, i32 21
>   %537 = extractelement <33 x float> %515, i32 22
>   %538 = extractelement <33 x float> %515, i32 23
>   %539 = extractelement <33 x float> %515, i32 24
>   %540 = extractelement <33 x float> %515, i32 25
>   %541 = extractelement <33 x float> %515, i32 26
>   %542 = extractelement <33 x float> %515, i32 27
>   %543 = extractelement <33 x float> %515, i32 28
>   %544 = extractelement <33 x float> %515, i32 29
>   %545 = extractelement <33 x float> %515, i32 30
>   %546 = extractelement <33 x float> %515, i32 31
>   %547 = extractelement <33 x float> %515, i32 32
>   %548 = add i32 %377, 1
>   %array_vector462 = insertelement <33 x float> undef, float %239, i32 0
>   %array_vector494 = shufflevector <33 x float> %array_vector462, <33 x float> <float undef, float 0x3FF3333340000000, float 0x40019999A0000000, float 0x40099999A0000000, float 0x4010CCCCC0000000, float 0x4014CCCCC0000000, float 0x4018CCCCC0000000, float 0x401CCCCCC0000000, float 0x4020666660000000, float 0x4022666660000000, float 0x4024666660000000, float 0x4026666660000000, float 0x4028666660000000, float 0x402A666660000000, float 0x402C666660000000, float 0x402E666660000000, float 0x4030333340000000, float 0x4031333340000000, float 0x4032333340000000, float 0x4033333340000000, float 0x4034333340000000, float 0x4035333340000000, float 0x4036333340000000, float 0x4037333340000000, float 0x4038333340000000, float 0x4039333340000000, float 0x403A333340000000, float 0x403B333340000000, float 0x403C333340000000, float 0x403D333340000000, float 0x403E333340000000, float 0x403F333340000000, float 0x40401999A0000000>, <33 x i32> <i32 0, i32 34, i32 35, i32 36, i32 37, i32 38, i32 39, i32 40, i32 41, i32 42, i32 43, i32 44, i32 45, i32 46, i32 47, i32 48, i32 49, i32 50, i32 51, i32 52, i32 53, i32 54, i32 55, i32 56, i32 57, i32 58, i32 59, i32 60, i32 61, i32 62, i32 63, i32 64, i32 65>
>   %549 = insertelement <33 x float> %array_vector494, float 0x3FC99999A0000000, i32 %548
>   %550 = extractelement <33 x float> %549, i32 1
>   %551 = extractelement <33 x float> %549, i32 2
>   %552 = extractelement <33 x float> %549, i32 3
>   %553 = extractelement <33 x float> %549, i32 4
>   %554 = extractelement <33 x float> %549, i32 5
>   %555 = extractelement <33 x float> %549, i32 6
>   %556 = extractelement <33 x float> %549, i32 7
>   %557 = extractelement <33 x float> %549, i32 8
>   %558 = extractelement <33 x float> %549, i32 9
>   %559 = extractelement <33 x float> %549, i32 10
>   %560 = extractelement <33 x float> %549, i32 11
>   %561 = extractelement <33 x float> %549, i32 12
>   %562 = extractelement <33 x float> %549, i32 13
>   %563 = extractelement <33 x float> %549, i32 14
>   %564 = extractelement <33 x float> %549, i32 15
>   %565 = extractelement <33 x float> %549, i32 16
>   %566 = extractelement <33 x float> %549, i32 17
>   %567 = extractelement <33 x float> %549, i32 18
>   %568 = extractelement <33 x float> %549, i32 19
>   %569 = extractelement <33 x float> %549, i32 20
>   %570 = extractelement <33 x float> %549, i32 21
>   %571 = extractelement <33 x float> %549, i32 22
>   %572 = extractelement <33 x float> %549, i32 23
>   %573 = extractelement <33 x float> %549, i32 24
>   %574 = extractelement <33 x float> %549, i32 25
>   %575 = extractelement <33 x float> %549, i32 26
>   %576 = extractelement <33 x float> %549, i32 27
>   %577 = extractelement <33 x float> %549, i32 28
>   %578 = extractelement <33 x float> %549, i32 29
>   %579 = extractelement <33 x float> %549, i32 30
>   %580 = extractelement <33 x float> %549, i32 31
>   %581 = extractelement <33 x float> %549, i32 32
>   br label %endif83
>
> endif83:                                          ; preds = %else80, %if76
>   %OUT1.y.1 = phi float [ %516, %else80 ], [ %414, %if76 ]
>   %OUT1.z.1 = phi float [ %550, %else80 ], [ %448, %if76 ]
>   %OUT2.x.1 = phi float [ %483, %else80 ], [ %381, %if76 ]
>   %OUT2.y.1 = phi float [ %517, %else80 ], [ %415, %if76 ]
>   %OUT2.z.1 = phi float [ %551, %else80 ], [ %449, %if76 ]
>   %OUT3.x.1 = phi float [ %484, %else80 ], [ %382, %if76 ]
>   %OUT3.y.1 = phi float [ %518, %else80 ], [ %416, %if76 ]
>   %OUT3.z.1 = phi float [ %552, %else80 ], [ %450, %if76 ]
>   %OUT4.x.1 = phi float [ %485, %else80 ], [ %383, %if76 ]
>   %OUT4.y.1 = phi float [ %519, %else80 ], [ %417, %if76 ]
>   %OUT4.z.1 = phi float [ %553, %else80 ], [ %451, %if76 ]
>   %OUT5.x.1 = phi float [ %486, %else80 ], [ %384, %if76 ]
>   %OUT5.y.1 = phi float [ %520, %else80 ], [ %418, %if76 ]
>   %OUT5.z.1 = phi float [ %554, %else80 ], [ %452, %if76 ]
>   %OUT6.x.1 = phi float [ %487, %else80 ], [ %385, %if76 ]
>   %OUT6.y.1 = phi float [ %521, %else80 ], [ %419, %if76 ]
>   %OUT6.z.1 = phi float [ %555, %else80 ], [ %453, %if76 ]
>   %OUT7.x.1 = phi float [ %488, %else80 ], [ %386, %if76 ]
>   %OUT7.y.1 = phi float [ %522, %else80 ], [ %420, %if76 ]
>   %OUT7.z.1 = phi float [ %556, %else80 ], [ %454, %if76 ]
>   %OUT8.x.1 = phi float [ %489, %else80 ], [ %387, %if76 ]
>   %OUT8.y.1 = phi float [ %523, %else80 ], [ %421, %if76 ]
>   %OUT8.z.1 = phi float [ %557, %else80 ], [ %455, %if76 ]
>   %OUT9.x.1 = phi float [ %490, %else80 ], [ %388, %if76 ]
>   %OUT9.y.1 = phi float [ %524, %else80 ], [ %422, %if76 ]
>   %OUT9.z.1 = phi float [ %558, %else80 ], [ %456, %if76 ]
>   %OUT10.x.1 = phi float [ %491, %else80 ], [ %389, %if76 ]
>   %OUT10.y.1 = phi float [ %525, %else80 ], [ %423, %if76 ]
>   %OUT10.z.1 = phi float [ %559, %else80 ], [ %457, %if76 ]
>   %OUT11.x.1 = phi float [ %492, %else80 ], [ %390, %if76 ]
>   %OUT11.y.1 = phi float [ %526, %else80 ], [ %424, %if76 ]
>   %OUT11.z.1 = phi float [ %560, %else80 ], [ %458, %if76 ]
>   %OUT12.x.1 = phi float [ %493, %else80 ], [ %391, %if76 ]
>   %OUT12.y.1 = phi float [ %527, %else80 ], [ %425, %if76 ]
>   %OUT12.z.1 = phi float [ %561, %else80 ], [ %459, %if76 ]
>   %OUT13.x.1 = phi float [ %494, %else80 ], [ %392, %if76 ]
>   %OUT13.y.1 = phi float [ %528, %else80 ], [ %426, %if76 ]
>   %OUT13.z.1 = phi float [ %562, %else80 ], [ %460, %if76 ]
>   %OUT14.x.1 = phi float [ %495, %else80 ], [ %393, %if76 ]
>   %OUT14.y.1 = phi float [ %529, %else80 ], [ %427, %if76 ]
>   %OUT14.z.1 = phi float [ %563, %else80 ], [ %461, %if76 ]
>   %OUT15.x.1 = phi float [ %496, %else80 ], [ %394, %if76 ]
>   %OUT15.y.1 = phi float [ %530, %else80 ], [ %428, %if76 ]
>   %OUT15.z.1 = phi float [ %564, %else80 ], [ %462, %if76 ]
>   %OUT16.x.1 = phi float [ %497, %else80 ], [ %395, %if76 ]
>   %OUT16.y.1 = phi float [ %531, %else80 ], [ %429, %if76 ]
>   %OUT16.z.1 = phi float [ %565, %else80 ], [ %463, %if76 ]
>   %OUT17.x.1 = phi float [ %498, %else80 ], [ %396, %if76 ]
>   %OUT17.y.1 = phi float [ %532, %else80 ], [ %430, %if76 ]
>   %OUT17.z.1 = phi float [ %566, %else80 ], [ %464, %if76 ]
>   %OUT18.x.1 = phi float [ %499, %else80 ], [ %397, %if76 ]
>   %OUT18.y.1 = phi float [ %533, %else80 ], [ %431, %if76 ]
>   %OUT18.z.1 = phi float [ %567, %else80 ], [ %465, %if76 ]
>   %OUT19.x.1 = phi float [ %500, %else80 ], [ %398, %if76 ]
>   %OUT19.y.1 = phi float [ %534, %else80 ], [ %432, %if76 ]
>   %OUT19.z.1 = phi float [ %568, %else80 ], [ %466, %if76 ]
>   %OUT20.x.1 = phi float [ %501, %else80 ], [ %399, %if76 ]
>   %OUT20.y.1 = phi float [ %535, %else80 ], [ %433, %if76 ]
>   %OUT20.z.1 = phi float [ %569, %else80 ], [ %467, %if76 ]
>   %OUT21.x.1 = phi float [ %502, %else80 ], [ %400, %if76 ]
>   %OUT21.y.1 = phi float [ %536, %else80 ], [ %434, %if76 ]
>   %OUT21.z.1 = phi float [ %570, %else80 ], [ %468, %if76 ]
>   %OUT22.x.1 = phi float [ %503, %else80 ], [ %401, %if76 ]
>   %OUT22.y.1 = phi float [ %537, %else80 ], [ %435, %if76 ]
>   %OUT22.z.1 = phi float [ %571, %else80 ], [ %469, %if76 ]
>   %OUT23.x.1 = phi float [ %504, %else80 ], [ %402, %if76 ]
>   %OUT23.y.1 = phi float [ %538, %else80 ], [ %436, %if76 ]
>   %OUT23.z.1 = phi float [ %572, %else80 ], [ %470, %if76 ]
>   %OUT24.x.1 = phi float [ %505, %else80 ], [ %403, %if76 ]
>   %OUT24.y.1 = phi float [ %539, %else80 ], [ %437, %if76 ]
>   %OUT24.z.1 = phi float [ %573, %else80 ], [ %471, %if76 ]
>   %OUT25.x.1 = phi float [ %506, %else80 ], [ %404, %if76 ]
>   %OUT25.y.1 = phi float [ %540, %else80 ], [ %438, %if76 ]
>   %OUT25.z.1 = phi float [ %574, %else80 ], [ %472, %if76 ]
>   %OUT26.x.1 = phi float [ %507, %else80 ], [ %405, %if76 ]
>   %OUT26.y.1 = phi float [ %541, %else80 ], [ %439, %if76 ]
>   %OUT26.z.1 = phi float [ %575, %else80 ], [ %473, %if76 ]
>   %OUT27.x.1 = phi float [ %508, %else80 ], [ %406, %if76 ]
>   %OUT27.y.1 = phi float [ %542, %else80 ], [ %440, %if76 ]
>   %OUT27.z.1 = phi float [ %576, %else80 ], [ %474, %if76 ]
>   %OUT28.x.1 = phi float [ %509, %else80 ], [ %407, %if76 ]
>   %OUT28.y.1 = phi float [ %543, %else80 ], [ %441, %if76 ]
>   %OUT28.z.1 = phi float [ %577, %else80 ], [ %475, %if76 ]
>   %OUT29.x.1 = phi float [ %510, %else80 ], [ %408, %if76 ]
>   %OUT29.y.1 = phi float [ %544, %else80 ], [ %442, %if76 ]
>   %OUT29.z.1 = phi float [ %578, %else80 ], [ %476, %if76 ]
>   %OUT30.x.1 = phi float [ %511, %else80 ], [ %409, %if76 ]
>   %OUT30.y.1 = phi float [ %545, %else80 ], [ %443, %if76 ]
>   %OUT30.z.1 = phi float [ %579, %else80 ], [ %477, %if76 ]
>   %OUT31.x.1 = phi float [ %512, %else80 ], [ %410, %if76 ]
>   %OUT31.y.1 = phi float [ %546, %else80 ], [ %444, %if76 ]
>   %OUT31.z.1 = phi float [ %580, %else80 ], [ %478, %if76 ]
>   %OUT32.x.1 = phi float [ %513, %else80 ], [ %411, %if76 ]
>   %OUT32.y.1 = phi float [ %547, %else80 ], [ %445, %if76 ]
>   %OUT32.z.1 = phi float [ %581, %else80 ], [ %479, %if76 ]
>   %OUT1.x.1 = phi float [ %482, %else80 ], [ %380, %if76 ]
>   %.sink885 = add i32 %377, 1
>   %array_vector495 = insertelement <33 x float> undef, float %242, i32 0
>   %array_vector527 = shufflevector <33 x float> %array_vector495, <33 x float> <float undef, float 0x3FF4CCCCC0000000, float 0x4002666660000000, float 0x400A666660000000, float 0x4011333340000000, float 0x4015333340000000, float 0x4019333340000000, float 0x401D333340000000, float 0x40209999A0000000, float 0x40229999A0000000, float 0x40249999A0000000, float 0x40269999A0000000, float 0x40289999A0000000, float 0x402A9999A0000000, float 0x402C9999A0000000, float 0x402E9999A0000000, float 0x40304CCCC0000000, float 0x40314CCCC0000000, float 0x40324CCCC0000000, float 0x40334CCCC0000000, float 0x40344CCCC0000000, float 0x40354CCCC0000000, float 0x40364CCCC0000000, float 0x40374CCCC0000000, float 0x40384CCCC0000000, float 0x40394CCCC0000000, float 0x403A4CCCC0000000, float 0x403B4CCCC0000000, float 0x403C4CCCC0000000, float 0x403D4CCCC0000000, float 0x403E4CCCC0000000, float 0x403F4CCCC0000000, float 0x4040266660000000>, <33 x i32> <i32 0, i32 34, i32 35, i32 36, i32 37, i32 38, i32 39, i32 40, i32 41, i32 42, i32 43, i32 44, i32 45, i32 46, i32 47, i32 48, i32 49, i32 50, i32 51, i32 52, i32 53, i32 54, i32 55, i32 56, i32 57, i32 58, i32 59, i32 60, i32 61, i32 62, i32 63, i32 64, i32 65>
>   %582 = insertelement <33 x float> %array_vector527, float 0x3FD3333340000000, i32 %.sink885
>   %583 = shl i32 %8, 2
>   %584 = call i32 @llvm.SI.buffer.load.dword.i32.i32(<16 x i8> %16, i32 %583, i32 0, i32 0, i32 1, i32 0, i32 1, i32 0, i32 0)
>   %585 = bitcast i32 %584 to float
>   %586 = shl i32 %8, 2
>   %587 = call i32 @llvm.SI.buffer.load.dword.i32.i32(<16 x i8> %16, i32 %586, i32 256, i32 0, i32 1, i32 0, i32 1, i32 0, i32 0)
>   %588 = bitcast i32 %587 to float
>   %589 = shl i32 %8, 2
>   %590 = call i32 @llvm.SI.buffer.load.dword.i32.i32(<16 x i8> %16, i32 %589, i32 512, i32 0, i32 1, i32 0, i32 1, i32 0, i32 0)
>   %591 = bitcast i32 %590 to float
>   %592 = shl i32 %8, 2
>   %593 = call i32 @llvm.SI.buffer.load.dword.i32.i32(<16 x i8> %16, i32 %592, i32 768, i32 0, i32 1, i32 0, i32 1, i32 0, i32 0)
>   %594 = bitcast i32 %593 to float
>   call void @llvm.AMDGPU.kill(float 1.000000e+00)
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %584, i32 1, i32 4, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %587, i32 1, i32 16, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %590, i32 1, i32 28, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %593, i32 1, i32 40, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %595 = bitcast float %OUT1.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %595, i32 1, i32 52, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %596 = bitcast float %OUT1.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %596, i32 1, i32 64, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %597 = bitcast float %OUT1.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %597, i32 1, i32 76, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc918 = bitcast <33 x float> %582 to <33 x i32>
>   %598 = extractelement <33 x i32> %bc918, i32 1
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %598, i32 1, i32 88, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %599 = bitcast float %OUT2.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %599, i32 1, i32 100, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %600 = bitcast float %OUT2.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %600, i32 1, i32 112, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %601 = bitcast float %OUT2.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %601, i32 1, i32 124, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc919 = bitcast <33 x float> %582 to <33 x i32>
>   %602 = extractelement <33 x i32> %bc919, i32 2
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %602, i32 1, i32 136, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %603 = bitcast float %OUT3.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %603, i32 1, i32 148, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %604 = bitcast float %OUT3.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %604, i32 1, i32 160, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %605 = bitcast float %OUT3.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %605, i32 1, i32 172, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc920 = bitcast <33 x float> %582 to <33 x i32>
>   %606 = extractelement <33 x i32> %bc920, i32 3
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %606, i32 1, i32 184, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %607 = bitcast float %OUT4.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %607, i32 1, i32 196, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %608 = bitcast float %OUT4.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %608, i32 1, i32 208, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %609 = bitcast float %OUT4.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %609, i32 1, i32 220, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc921 = bitcast <33 x float> %582 to <33 x i32>
>   %610 = extractelement <33 x i32> %bc921, i32 4
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %610, i32 1, i32 232, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %611 = bitcast float %OUT5.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %611, i32 1, i32 244, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %612 = bitcast float %OUT5.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %612, i32 1, i32 256, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %613 = bitcast float %OUT5.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %613, i32 1, i32 268, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc922 = bitcast <33 x float> %582 to <33 x i32>
>   %614 = extractelement <33 x i32> %bc922, i32 5
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %614, i32 1, i32 280, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %615 = bitcast float %OUT6.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %615, i32 1, i32 292, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %616 = bitcast float %OUT6.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %616, i32 1, i32 304, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %617 = bitcast float %OUT6.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %617, i32 1, i32 316, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc923 = bitcast <33 x float> %582 to <33 x i32>
>   %618 = extractelement <33 x i32> %bc923, i32 6
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %618, i32 1, i32 328, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %619 = bitcast float %OUT7.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %619, i32 1, i32 340, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %620 = bitcast float %OUT7.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %620, i32 1, i32 352, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %621 = bitcast float %OUT7.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %621, i32 1, i32 364, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc924 = bitcast <33 x float> %582 to <33 x i32>
>   %622 = extractelement <33 x i32> %bc924, i32 7
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %622, i32 1, i32 376, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %623 = bitcast float %OUT8.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %623, i32 1, i32 388, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %624 = bitcast float %OUT8.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %624, i32 1, i32 400, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %625 = bitcast float %OUT8.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %625, i32 1, i32 412, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc925 = bitcast <33 x float> %582 to <33 x i32>
>   %626 = extractelement <33 x i32> %bc925, i32 8
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %626, i32 1, i32 424, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %627 = bitcast float %OUT9.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %627, i32 1, i32 436, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %628 = bitcast float %OUT9.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %628, i32 1, i32 448, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %629 = bitcast float %OUT9.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %629, i32 1, i32 460, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc926 = bitcast <33 x float> %582 to <33 x i32>
>   %630 = extractelement <33 x i32> %bc926, i32 9
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %630, i32 1, i32 472, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %631 = bitcast float %OUT10.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %631, i32 1, i32 484, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %632 = bitcast float %OUT10.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %632, i32 1, i32 496, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %633 = bitcast float %OUT10.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %633, i32 1, i32 508, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc927 = bitcast <33 x float> %582 to <33 x i32>
>   %634 = extractelement <33 x i32> %bc927, i32 10
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %634, i32 1, i32 520, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %635 = bitcast float %OUT11.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %635, i32 1, i32 532, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %636 = bitcast float %OUT11.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %636, i32 1, i32 544, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %637 = bitcast float %OUT11.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %637, i32 1, i32 556, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc928 = bitcast <33 x float> %582 to <33 x i32>
>   %638 = extractelement <33 x i32> %bc928, i32 11
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %638, i32 1, i32 568, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %639 = bitcast float %OUT12.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %639, i32 1, i32 580, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %640 = bitcast float %OUT12.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %640, i32 1, i32 592, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %641 = bitcast float %OUT12.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %641, i32 1, i32 604, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc929 = bitcast <33 x float> %582 to <33 x i32>
>   %642 = extractelement <33 x i32> %bc929, i32 12
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %642, i32 1, i32 616, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %643 = bitcast float %OUT13.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %643, i32 1, i32 628, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %644 = bitcast float %OUT13.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %644, i32 1, i32 640, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %645 = bitcast float %OUT13.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %645, i32 1, i32 652, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc930 = bitcast <33 x float> %582 to <33 x i32>
>   %646 = extractelement <33 x i32> %bc930, i32 13
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %646, i32 1, i32 664, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %647 = bitcast float %OUT14.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %647, i32 1, i32 676, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %648 = bitcast float %OUT14.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %648, i32 1, i32 688, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %649 = bitcast float %OUT14.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %649, i32 1, i32 700, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc931 = bitcast <33 x float> %582 to <33 x i32>
>   %650 = extractelement <33 x i32> %bc931, i32 14
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %650, i32 1, i32 712, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %651 = bitcast float %OUT15.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %651, i32 1, i32 724, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %652 = bitcast float %OUT15.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %652, i32 1, i32 736, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %653 = bitcast float %OUT15.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %653, i32 1, i32 748, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc932 = bitcast <33 x float> %582 to <33 x i32>
>   %654 = extractelement <33 x i32> %bc932, i32 15
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %654, i32 1, i32 760, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %655 = bitcast float %OUT16.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %655, i32 1, i32 772, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %656 = bitcast float %OUT16.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %656, i32 1, i32 784, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %657 = bitcast float %OUT16.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %657, i32 1, i32 796, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc933 = bitcast <33 x float> %582 to <33 x i32>
>   %658 = extractelement <33 x i32> %bc933, i32 16
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %658, i32 1, i32 808, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %659 = bitcast float %OUT17.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %659, i32 1, i32 820, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %660 = bitcast float %OUT17.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %660, i32 1, i32 832, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %661 = bitcast float %OUT17.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %661, i32 1, i32 844, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc934 = bitcast <33 x float> %582 to <33 x i32>
>   %662 = extractelement <33 x i32> %bc934, i32 17
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %662, i32 1, i32 856, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %663 = bitcast float %OUT18.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %663, i32 1, i32 868, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %664 = bitcast float %OUT18.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %664, i32 1, i32 880, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %665 = bitcast float %OUT18.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %665, i32 1, i32 892, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc935 = bitcast <33 x float> %582 to <33 x i32>
>   %666 = extractelement <33 x i32> %bc935, i32 18
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %666, i32 1, i32 904, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %667 = bitcast float %OUT19.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %667, i32 1, i32 916, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %668 = bitcast float %OUT19.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %668, i32 1, i32 928, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %669 = bitcast float %OUT19.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %669, i32 1, i32 940, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc936 = bitcast <33 x float> %582 to <33 x i32>
>   %670 = extractelement <33 x i32> %bc936, i32 19
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %670, i32 1, i32 952, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %671 = bitcast float %OUT20.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %671, i32 1, i32 964, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %672 = bitcast float %OUT20.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %672, i32 1, i32 976, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %673 = bitcast float %OUT20.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %673, i32 1, i32 988, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc937 = bitcast <33 x float> %582 to <33 x i32>
>   %674 = extractelement <33 x i32> %bc937, i32 20
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %674, i32 1, i32 1000, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %675 = bitcast float %OUT21.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %675, i32 1, i32 1012, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %676 = bitcast float %OUT21.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %676, i32 1, i32 1024, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %677 = bitcast float %OUT21.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %677, i32 1, i32 1036, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc938 = bitcast <33 x float> %582 to <33 x i32>
>   %678 = extractelement <33 x i32> %bc938, i32 21
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %678, i32 1, i32 1048, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %679 = bitcast float %OUT22.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %679, i32 1, i32 1060, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %680 = bitcast float %OUT22.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %680, i32 1, i32 1072, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %681 = bitcast float %OUT22.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %681, i32 1, i32 1084, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc939 = bitcast <33 x float> %582 to <33 x i32>
>   %682 = extractelement <33 x i32> %bc939, i32 22
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %682, i32 1, i32 1096, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %683 = bitcast float %OUT23.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %683, i32 1, i32 1108, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %684 = bitcast float %OUT23.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %684, i32 1, i32 1120, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %685 = bitcast float %OUT23.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %685, i32 1, i32 1132, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc940 = bitcast <33 x float> %582 to <33 x i32>
>   %686 = extractelement <33 x i32> %bc940, i32 23
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %686, i32 1, i32 1144, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %687 = bitcast float %OUT24.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %687, i32 1, i32 1156, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %688 = bitcast float %OUT24.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %688, i32 1, i32 1168, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %689 = bitcast float %OUT24.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %689, i32 1, i32 1180, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc941 = bitcast <33 x float> %582 to <33 x i32>
>   %690 = extractelement <33 x i32> %bc941, i32 24
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %690, i32 1, i32 1192, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %691 = bitcast float %OUT25.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %691, i32 1, i32 1204, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %692 = bitcast float %OUT25.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %692, i32 1, i32 1216, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %693 = bitcast float %OUT25.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %693, i32 1, i32 1228, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc942 = bitcast <33 x float> %582 to <33 x i32>
>   %694 = extractelement <33 x i32> %bc942, i32 25
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %694, i32 1, i32 1240, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %695 = bitcast float %OUT26.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %695, i32 1, i32 1252, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %696 = bitcast float %OUT26.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %696, i32 1, i32 1264, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %697 = bitcast float %OUT26.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %697, i32 1, i32 1276, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc943 = bitcast <33 x float> %582 to <33 x i32>
>   %698 = extractelement <33 x i32> %bc943, i32 26
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %698, i32 1, i32 1288, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %699 = bitcast float %OUT27.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %699, i32 1, i32 1300, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %700 = bitcast float %OUT27.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %700, i32 1, i32 1312, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %701 = bitcast float %OUT27.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %701, i32 1, i32 1324, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc944 = bitcast <33 x float> %582 to <33 x i32>
>   %702 = extractelement <33 x i32> %bc944, i32 27
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %702, i32 1, i32 1336, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %703 = bitcast float %OUT28.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %703, i32 1, i32 1348, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %704 = bitcast float %OUT28.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %704, i32 1, i32 1360, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %705 = bitcast float %OUT28.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %705, i32 1, i32 1372, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc945 = bitcast <33 x float> %582 to <33 x i32>
>   %706 = extractelement <33 x i32> %bc945, i32 28
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %706, i32 1, i32 1384, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %707 = bitcast float %OUT29.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %707, i32 1, i32 1396, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %708 = bitcast float %OUT29.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %708, i32 1, i32 1408, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %709 = bitcast float %OUT29.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %709, i32 1, i32 1420, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc946 = bitcast <33 x float> %582 to <33 x i32>
>   %710 = extractelement <33 x i32> %bc946, i32 29
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %710, i32 1, i32 1432, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %711 = bitcast float %OUT30.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %711, i32 1, i32 1444, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %712 = bitcast float %OUT30.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %712, i32 1, i32 1456, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %713 = bitcast float %OUT30.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %713, i32 1, i32 1468, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc947 = bitcast <33 x float> %582 to <33 x i32>
>   %714 = extractelement <33 x i32> %bc947, i32 30
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %714, i32 1, i32 1480, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %715 = bitcast float %OUT31.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %715, i32 1, i32 1492, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %716 = bitcast float %OUT31.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %716, i32 1, i32 1504, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %717 = bitcast float %OUT31.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %717, i32 1, i32 1516, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc948 = bitcast <33 x float> %582 to <33 x i32>
>   %718 = extractelement <33 x i32> %bc948, i32 31
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %718, i32 1, i32 1528, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %719 = bitcast float %OUT32.x.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %719, i32 1, i32 1540, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %720 = bitcast float %OUT32.y.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %720, i32 1, i32 1552, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %721 = bitcast float %OUT32.z.1 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %721, i32 1, i32 1564, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc949 = bitcast <33 x float> %582 to <33 x i32>
>   %722 = extractelement <33 x i32> %bc949, i32 32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %722, i32 1, i32 1576, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   call void @llvm.SI.sendmsg(i32 34, i32 %6)
>   %723 = getelementptr [16 x <16 x i8>], [16 x <16 x i8>] addrspace(2)* %1, i64 0, i64 0, !amdgpu.uniform !0
>   %724 = load <16 x i8>, <16 x i8> addrspace(2)* %723, align 16, !invariant.load !0
>   %725 = call float @llvm.SI.load.const(<16 x i8> %724, i32 0)
>   %726 = bitcast float %725 to i32
>   %727 = icmp sgt i32 %726, 15
>   %728 = call float @llvm.SI.load.const(<16 x i8> %724, i32 0)
>   %729 = bitcast float %728 to i32
>   br i1 %727, label %if119, label %else123
>
> if119:                                            ; preds = %endif83
>   %730 = add i32 %729, 1
>   %array_vector528 = insertelement <33 x float> undef, float %585, i32 0
>   %array_vector560 = shufflevector <33 x float> %array_vector528, <33 x float> <float undef, float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+00, float 5.000000e+00, float 6.000000e+00, float 7.000000e+00, float 8.000000e+00, float 9.000000e+00, float 1.000000e+01, float 1.100000e+01, float 1.200000e+01, float 1.300000e+01, float 1.400000e+01, float 1.500000e+01, float 1.600000e+01, float 1.700000e+01, float 1.800000e+01, float 1.900000e+01, float 2.000000e+01, float 2.100000e+01, float 2.200000e+01, float 2.300000e+01, float 2.400000e+01, float 2.500000e+01, float 2.600000e+01, float 2.700000e+01, float 2.800000e+01, float 2.900000e+01, float 3.000000e+01, float 3.100000e+01, float 3.200000e+01>, <33 x i32> <i32 0, i32 34, i32 35, i32 36, i32 37, i32 38, i32 39, i32 40, i32 41, i32 42, i32 43, i32 44, i32 45, i32 46, i32 47, i32 48, i32 49, i32 50, i32 51, i32 52, i32 53, i32 54, i32 55, i32 56, i32 57, i32 58, i32 59, i32 60, i32 61, i32 62, i32 63, i32 64, i32 65>
>   %731 = insertelement <33 x float> %array_vector560, float 0.000000e+00, i32 %730
>   %732 = extractelement <33 x float> %731, i32 1
>   %733 = extractelement <33 x float> %731, i32 2
>   %734 = extractelement <33 x float> %731, i32 3
>   %735 = extractelement <33 x float> %731, i32 4
>   %736 = extractelement <33 x float> %731, i32 5
>   %737 = extractelement <33 x float> %731, i32 6
>   %738 = extractelement <33 x float> %731, i32 7
>   %739 = extractelement <33 x float> %731, i32 8
>   %740 = extractelement <33 x float> %731, i32 9
>   %741 = extractelement <33 x float> %731, i32 10
>   %742 = extractelement <33 x float> %731, i32 11
>   %743 = extractelement <33 x float> %731, i32 12
>   %744 = extractelement <33 x float> %731, i32 13
>   %745 = extractelement <33 x float> %731, i32 14
>   %746 = extractelement <33 x float> %731, i32 15
>   %747 = extractelement <33 x float> %731, i32 16
>   %748 = extractelement <33 x float> %731, i32 17
>   %749 = extractelement <33 x float> %731, i32 18
>   %750 = extractelement <33 x float> %731, i32 19
>   %751 = extractelement <33 x float> %731, i32 20
>   %752 = extractelement <33 x float> %731, i32 21
>   %753 = extractelement <33 x float> %731, i32 22
>   %754 = extractelement <33 x float> %731, i32 23
>   %755 = extractelement <33 x float> %731, i32 24
>   %756 = extractelement <33 x float> %731, i32 25
>   %757 = extractelement <33 x float> %731, i32 26
>   %758 = extractelement <33 x float> %731, i32 27
>   %759 = extractelement <33 x float> %731, i32 28
>   %760 = extractelement <33 x float> %731, i32 29
>   %761 = extractelement <33 x float> %731, i32 30
>   %762 = extractelement <33 x float> %731, i32 31
>   %763 = extractelement <33 x float> %731, i32 32
>   %764 = add i32 %729, 1
>   %array_vector561 = insertelement <33 x float> undef, float %588, i32 0
>   %array_vector593 = shufflevector <33 x float> %array_vector561, <33 x float> <float undef, float 0x3FF19999A0000000, float 0x4000CCCCC0000000, float 0x4008CCCCC0000000, float 0x4010666660000000, float 0x4014666660000000, float 0x4018666660000000, float 0x401C666660000000, float 0x4020333340000000, float 0x4022333340000000, float 0x4024333340000000, float 0x4026333340000000, float 0x4028333340000000, float 0x402A333340000000, float 0x402C333340000000, float 0x402E333340000000, float 0x40301999A0000000, float 0x40311999A0000000, float 0x40321999A0000000, float 0x40331999A0000000, float 0x40341999A0000000, float 0x40351999A0000000, float 0x40361999A0000000, float 0x40371999A0000000, float 0x40381999A0000000, float 0x40391999A0000000, float 0x403A1999A0000000, float 0x403B1999A0000000, float 0x403C1999A0000000, float 0x403D1999A0000000, float 0x403E1999A0000000, float 0x403F1999A0000000, float 0x40400CCCC0000000>, <33 x i32> <i32 0, i32 34, i32 35, i32 36, i32 37, i32 38, i32 39, i32 40, i32 41, i32 42, i32 43, i32 44, i32 45, i32 46, i32 47, i32 48, i32 49, i32 50, i32 51, i32 52, i32 53, i32 54, i32 55, i32 56, i32 57, i32 58, i32 59, i32 60, i32 61, i32 62, i32 63, i32 64, i32 65>
>   %765 = insertelement <33 x float> %array_vector593, float 0x3FB99999A0000000, i32 %764
>   %766 = extractelement <33 x float> %765, i32 1
>   %767 = extractelement <33 x float> %765, i32 2
>   %768 = extractelement <33 x float> %765, i32 3
>   %769 = extractelement <33 x float> %765, i32 4
>   %770 = extractelement <33 x float> %765, i32 5
>   %771 = extractelement <33 x float> %765, i32 6
>   %772 = extractelement <33 x float> %765, i32 7
>   %773 = extractelement <33 x float> %765, i32 8
>   %774 = extractelement <33 x float> %765, i32 9
>   %775 = extractelement <33 x float> %765, i32 10
>   %776 = extractelement <33 x float> %765, i32 11
>   %777 = extractelement <33 x float> %765, i32 12
>   %778 = extractelement <33 x float> %765, i32 13
>   %779 = extractelement <33 x float> %765, i32 14
>   %780 = extractelement <33 x float> %765, i32 15
>   %781 = extractelement <33 x float> %765, i32 16
>   %782 = extractelement <33 x float> %765, i32 17
>   %783 = extractelement <33 x float> %765, i32 18
>   %784 = extractelement <33 x float> %765, i32 19
>   %785 = extractelement <33 x float> %765, i32 20
>   %786 = extractelement <33 x float> %765, i32 21
>   %787 = extractelement <33 x float> %765, i32 22
>   %788 = extractelement <33 x float> %765, i32 23
>   %789 = extractelement <33 x float> %765, i32 24
>   %790 = extractelement <33 x float> %765, i32 25
>   %791 = extractelement <33 x float> %765, i32 26
>   %792 = extractelement <33 x float> %765, i32 27
>   %793 = extractelement <33 x float> %765, i32 28
>   %794 = extractelement <33 x float> %765, i32 29
>   %795 = extractelement <33 x float> %765, i32 30
>   %796 = extractelement <33 x float> %765, i32 31
>   %797 = extractelement <33 x float> %765, i32 32
>   %798 = add i32 %729, 1
>   %array_vector594 = insertelement <33 x float> undef, float %591, i32 0
>   %array_vector626 = shufflevector <33 x float> %array_vector594, <33 x float> <float undef, float 0x3FF3333340000000, float 0x40019999A0000000, float 0x40099999A0000000, float 0x4010CCCCC0000000, float 0x4014CCCCC0000000, float 0x4018CCCCC0000000, float 0x401CCCCCC0000000, float 0x4020666660000000, float 0x4022666660000000, float 0x4024666660000000, float 0x4026666660000000, float 0x4028666660000000, float 0x402A666660000000, float 0x402C666660000000, float 0x402E666660000000, float 0x4030333340000000, float 0x4031333340000000, float 0x4032333340000000, float 0x4033333340000000, float 0x4034333340000000, float 0x4035333340000000, float 0x4036333340000000, float 0x4037333340000000, float 0x4038333340000000, float 0x4039333340000000, float 0x403A333340000000, float 0x403B333340000000, float 0x403C333340000000, float 0x403D333340000000, float 0x403E333340000000, float 0x403F333340000000, float 0x40401999A0000000>, <33 x i32> <i32 0, i32 34, i32 35, i32 36, i32 37, i32 38, i32 39, i32 40, i32 41, i32 42, i32 43, i32 44, i32 45, i32 46, i32 47, i32 48, i32 49, i32 50, i32 51, i32 52, i32 53, i32 54, i32 55, i32 56, i32 57, i32 58, i32 59, i32 60, i32 61, i32 62, i32 63, i32 64, i32 65>
>   %799 = insertelement <33 x float> %array_vector626, float 0x3FC99999A0000000, i32 %798
>   %800 = extractelement <33 x float> %799, i32 1
>   %801 = extractelement <33 x float> %799, i32 2
>   %802 = extractelement <33 x float> %799, i32 3
>   %803 = extractelement <33 x float> %799, i32 4
>   %804 = extractelement <33 x float> %799, i32 5
>   %805 = extractelement <33 x float> %799, i32 6
>   %806 = extractelement <33 x float> %799, i32 7
>   %807 = extractelement <33 x float> %799, i32 8
>   %808 = extractelement <33 x float> %799, i32 9
>   %809 = extractelement <33 x float> %799, i32 10
>   %810 = extractelement <33 x float> %799, i32 11
>   %811 = extractelement <33 x float> %799, i32 12
>   %812 = extractelement <33 x float> %799, i32 13
>   %813 = extractelement <33 x float> %799, i32 14
>   %814 = extractelement <33 x float> %799, i32 15
>   %815 = extractelement <33 x float> %799, i32 16
>   %816 = extractelement <33 x float> %799, i32 17
>   %817 = extractelement <33 x float> %799, i32 18
>   %818 = extractelement <33 x float> %799, i32 19
>   %819 = extractelement <33 x float> %799, i32 20
>   %820 = extractelement <33 x float> %799, i32 21
>   %821 = extractelement <33 x float> %799, i32 22
>   %822 = extractelement <33 x float> %799, i32 23
>   %823 = extractelement <33 x float> %799, i32 24
>   %824 = extractelement <33 x float> %799, i32 25
>   %825 = extractelement <33 x float> %799, i32 26
>   %826 = extractelement <33 x float> %799, i32 27
>   %827 = extractelement <33 x float> %799, i32 28
>   %828 = extractelement <33 x float> %799, i32 29
>   %829 = extractelement <33 x float> %799, i32 30
>   %830 = extractelement <33 x float> %799, i32 31
>   %831 = extractelement <33 x float> %799, i32 32
>   br label %endif126
>
> else123:                                          ; preds = %endif83
>   %832 = add i32 %729, 1
>   %array_vector660 = insertelement <33 x float> undef, float %585, i32 0
>   %array_vector692 = shufflevector <33 x float> %array_vector660, <33 x float> <float undef, float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+00, float 5.000000e+00, float 6.000000e+00, float 7.000000e+00, float 8.000000e+00, float 9.000000e+00, float 1.000000e+01, float 1.100000e+01, float 1.200000e+01, float 1.300000e+01, float 1.400000e+01, float 1.500000e+01, float 1.600000e+01, float 1.700000e+01, float 1.800000e+01, float 1.900000e+01, float 2.000000e+01, float 2.100000e+01, float 2.200000e+01, float 2.300000e+01, float 2.400000e+01, float 2.500000e+01, float 2.600000e+01, float 2.700000e+01, float 2.800000e+01, float 2.900000e+01, float 3.000000e+01, float 3.100000e+01, float 3.200000e+01>, <33 x i32> <i32 0, i32 34, i32 35, i32 36, i32 37, i32 38, i32 39, i32 40, i32 41, i32 42, i32 43, i32 44, i32 45, i32 46, i32 47, i32 48, i32 49, i32 50, i32 51, i32 52, i32 53, i32 54, i32 55, i32 56, i32 57, i32 58, i32 59, i32 60, i32 61, i32 62, i32 63, i32 64, i32 65>
>   %833 = insertelement <33 x float> %array_vector692, float 0.000000e+00, i32 %832
>   %834 = extractelement <33 x float> %833, i32 1
>   %835 = extractelement <33 x float> %833, i32 2
>   %836 = extractelement <33 x float> %833, i32 3
>   %837 = extractelement <33 x float> %833, i32 4
>   %838 = extractelement <33 x float> %833, i32 5
>   %839 = extractelement <33 x float> %833, i32 6
>   %840 = extractelement <33 x float> %833, i32 7
>   %841 = extractelement <33 x float> %833, i32 8
>   %842 = extractelement <33 x float> %833, i32 9
>   %843 = extractelement <33 x float> %833, i32 10
>   %844 = extractelement <33 x float> %833, i32 11
>   %845 = extractelement <33 x float> %833, i32 12
>   %846 = extractelement <33 x float> %833, i32 13
>   %847 = extractelement <33 x float> %833, i32 14
>   %848 = extractelement <33 x float> %833, i32 15
>   %849 = extractelement <33 x float> %833, i32 16
>   %850 = extractelement <33 x float> %833, i32 17
>   %851 = extractelement <33 x float> %833, i32 18
>   %852 = extractelement <33 x float> %833, i32 19
>   %853 = extractelement <33 x float> %833, i32 20
>   %854 = extractelement <33 x float> %833, i32 21
>   %855 = extractelement <33 x float> %833, i32 22
>   %856 = extractelement <33 x float> %833, i32 23
>   %857 = extractelement <33 x float> %833, i32 24
>   %858 = extractelement <33 x float> %833, i32 25
>   %859 = extractelement <33 x float> %833, i32 26
>   %860 = extractelement <33 x float> %833, i32 27
>   %861 = extractelement <33 x float> %833, i32 28
>   %862 = extractelement <33 x float> %833, i32 29
>   %863 = extractelement <33 x float> %833, i32 30
>   %864 = extractelement <33 x float> %833, i32 31
>   %865 = extractelement <33 x float> %833, i32 32
>   %866 = add i32 %729, 1
>   %array_vector693 = insertelement <33 x float> undef, float %588, i32 0
>   %array_vector725 = shufflevector <33 x float> %array_vector693, <33 x float> <float undef, float 0x3FF19999A0000000, float 0x4000CCCCC0000000, float 0x4008CCCCC0000000, float 0x4010666660000000, float 0x4014666660000000, float 0x4018666660000000, float 0x401C666660000000, float 0x4020333340000000, float 0x4022333340000000, float 0x4024333340000000, float 0x4026333340000000, float 0x4028333340000000, float 0x402A333340000000, float 0x402C333340000000, float 0x402E333340000000, float 0x40301999A0000000, float 0x40311999A0000000, float 0x40321999A0000000, float 0x40331999A0000000, float 0x40341999A0000000, float 0x40351999A0000000, float 0x40361999A0000000, float 0x40371999A0000000, float 0x40381999A0000000, float 0x40391999A0000000, float 0x403A1999A0000000, float 0x403B1999A0000000, float 0x403C1999A0000000, float 0x403D1999A0000000, float 0x403E1999A0000000, float 0x403F1999A0000000, float 0x40400CCCC0000000>, <33 x i32> <i32 0, i32 34, i32 35, i32 36, i32 37, i32 38, i32 39, i32 40, i32 41, i32 42, i32 43, i32 44, i32 45, i32 46, i32 47, i32 48, i32 49, i32 50, i32 51, i32 52, i32 53, i32 54, i32 55, i32 56, i32 57, i32 58, i32 59, i32 60, i32 61, i32 62, i32 63, i32 64, i32 65>
>   %867 = insertelement <33 x float> %array_vector725, float 0x3FB99999A0000000, i32 %866
>   %868 = extractelement <33 x float> %867, i32 1
>   %869 = extractelement <33 x float> %867, i32 2
>   %870 = extractelement <33 x float> %867, i32 3
>   %871 = extractelement <33 x float> %867, i32 4
>   %872 = extractelement <33 x float> %867, i32 5
>   %873 = extractelement <33 x float> %867, i32 6
>   %874 = extractelement <33 x float> %867, i32 7
>   %875 = extractelement <33 x float> %867, i32 8
>   %876 = extractelement <33 x float> %867, i32 9
>   %877 = extractelement <33 x float> %867, i32 10
>   %878 = extractelement <33 x float> %867, i32 11
>   %879 = extractelement <33 x float> %867, i32 12
>   %880 = extractelement <33 x float> %867, i32 13
>   %881 = extractelement <33 x float> %867, i32 14
>   %882 = extractelement <33 x float> %867, i32 15
>   %883 = extractelement <33 x float> %867, i32 16
>   %884 = extractelement <33 x float> %867, i32 17
>   %885 = extractelement <33 x float> %867, i32 18
>   %886 = extractelement <33 x float> %867, i32 19
>   %887 = extractelement <33 x float> %867, i32 20
>   %888 = extractelement <33 x float> %867, i32 21
>   %889 = extractelement <33 x float> %867, i32 22
>   %890 = extractelement <33 x float> %867, i32 23
>   %891 = extractelement <33 x float> %867, i32 24
>   %892 = extractelement <33 x float> %867, i32 25
>   %893 = extractelement <33 x float> %867, i32 26
>   %894 = extractelement <33 x float> %867, i32 27
>   %895 = extractelement <33 x float> %867, i32 28
>   %896 = extractelement <33 x float> %867, i32 29
>   %897 = extractelement <33 x float> %867, i32 30
>   %898 = extractelement <33 x float> %867, i32 31
>   %899 = extractelement <33 x float> %867, i32 32
>   %900 = add i32 %729, 1
>   %array_vector726 = insertelement <33 x float> undef, float %591, i32 0
>   %array_vector758 = shufflevector <33 x float> %array_vector726, <33 x float> <float undef, float 0x3FF3333340000000, float 0x40019999A0000000, float 0x40099999A0000000, float 0x4010CCCCC0000000, float 0x4014CCCCC0000000, float 0x4018CCCCC0000000, float 0x401CCCCCC0000000, float 0x4020666660000000, float 0x4022666660000000, float 0x4024666660000000, float 0x4026666660000000, float 0x4028666660000000, float 0x402A666660000000, float 0x402C666660000000, float 0x402E666660000000, float 0x4030333340000000, float 0x4031333340000000, float 0x4032333340000000, float 0x4033333340000000, float 0x4034333340000000, float 0x4035333340000000, float 0x4036333340000000, float 0x4037333340000000, float 0x4038333340000000, float 0x4039333340000000, float 0x403A333340000000, float 0x403B333340000000, float 0x403C333340000000, float 0x403D333340000000, float 0x403E333340000000, float 0x403F333340000000, float 0x40401999A0000000>, <33 x i32> <i32 0, i32 34, i32 35, i32 36, i32 37, i32 38, i32 39, i32 40, i32 41, i32 42, i32 43, i32 44, i32 45, i32 46, i32 47, i32 48, i32 49, i32 50, i32 51, i32 52, i32 53, i32 54, i32 55, i32 56, i32 57, i32 58, i32 59, i32 60, i32 61, i32 62, i32 63, i32 64, i32 65>
>   %901 = insertelement <33 x float> %array_vector758, float 0x3FC99999A0000000, i32 %900
>   %902 = extractelement <33 x float> %901, i32 1
>   %903 = extractelement <33 x float> %901, i32 2
>   %904 = extractelement <33 x float> %901, i32 3
>   %905 = extractelement <33 x float> %901, i32 4
>   %906 = extractelement <33 x float> %901, i32 5
>   %907 = extractelement <33 x float> %901, i32 6
>   %908 = extractelement <33 x float> %901, i32 7
>   %909 = extractelement <33 x float> %901, i32 8
>   %910 = extractelement <33 x float> %901, i32 9
>   %911 = extractelement <33 x float> %901, i32 10
>   %912 = extractelement <33 x float> %901, i32 11
>   %913 = extractelement <33 x float> %901, i32 12
>   %914 = extractelement <33 x float> %901, i32 13
>   %915 = extractelement <33 x float> %901, i32 14
>   %916 = extractelement <33 x float> %901, i32 15
>   %917 = extractelement <33 x float> %901, i32 16
>   %918 = extractelement <33 x float> %901, i32 17
>   %919 = extractelement <33 x float> %901, i32 18
>   %920 = extractelement <33 x float> %901, i32 19
>   %921 = extractelement <33 x float> %901, i32 20
>   %922 = extractelement <33 x float> %901, i32 21
>   %923 = extractelement <33 x float> %901, i32 22
>   %924 = extractelement <33 x float> %901, i32 23
>   %925 = extractelement <33 x float> %901, i32 24
>   %926 = extractelement <33 x float> %901, i32 25
>   %927 = extractelement <33 x float> %901, i32 26
>   %928 = extractelement <33 x float> %901, i32 27
>   %929 = extractelement <33 x float> %901, i32 28
>   %930 = extractelement <33 x float> %901, i32 29
>   %931 = extractelement <33 x float> %901, i32 30
>   %932 = extractelement <33 x float> %901, i32 31
>   %933 = extractelement <33 x float> %901, i32 32
>   br label %endif126
>
> endif126:                                         ; preds = %else123, %if119
>   %OUT1.y.2 = phi float [ %868, %else123 ], [ %766, %if119 ]
>   %OUT1.z.2 = phi float [ %902, %else123 ], [ %800, %if119 ]
>   %OUT2.x.2 = phi float [ %835, %else123 ], [ %733, %if119 ]
>   %OUT2.y.2 = phi float [ %869, %else123 ], [ %767, %if119 ]
>   %OUT2.z.2 = phi float [ %903, %else123 ], [ %801, %if119 ]
>   %OUT3.x.2 = phi float [ %836, %else123 ], [ %734, %if119 ]
>   %OUT3.y.2 = phi float [ %870, %else123 ], [ %768, %if119 ]
>   %OUT3.z.2 = phi float [ %904, %else123 ], [ %802, %if119 ]
>   %OUT4.x.2 = phi float [ %837, %else123 ], [ %735, %if119 ]
>   %OUT4.y.2 = phi float [ %871, %else123 ], [ %769, %if119 ]
>   %OUT4.z.2 = phi float [ %905, %else123 ], [ %803, %if119 ]
>   %OUT5.x.2 = phi float [ %838, %else123 ], [ %736, %if119 ]
>   %OUT5.y.2 = phi float [ %872, %else123 ], [ %770, %if119 ]
>   %OUT5.z.2 = phi float [ %906, %else123 ], [ %804, %if119 ]
>   %OUT6.x.2 = phi float [ %839, %else123 ], [ %737, %if119 ]
>   %OUT6.y.2 = phi float [ %873, %else123 ], [ %771, %if119 ]
>   %OUT6.z.2 = phi float [ %907, %else123 ], [ %805, %if119 ]
>   %OUT7.x.2 = phi float [ %840, %else123 ], [ %738, %if119 ]
>   %OUT7.y.2 = phi float [ %874, %else123 ], [ %772, %if119 ]
>   %OUT7.z.2 = phi float [ %908, %else123 ], [ %806, %if119 ]
>   %OUT8.x.2 = phi float [ %841, %else123 ], [ %739, %if119 ]
>   %OUT8.y.2 = phi float [ %875, %else123 ], [ %773, %if119 ]
>   %OUT8.z.2 = phi float [ %909, %else123 ], [ %807, %if119 ]
>   %OUT9.x.2 = phi float [ %842, %else123 ], [ %740, %if119 ]
>   %OUT9.y.2 = phi float [ %876, %else123 ], [ %774, %if119 ]
>   %OUT9.z.2 = phi float [ %910, %else123 ], [ %808, %if119 ]
>   %OUT10.x.2 = phi float [ %843, %else123 ], [ %741, %if119 ]
>   %OUT10.y.2 = phi float [ %877, %else123 ], [ %775, %if119 ]
>   %OUT10.z.2 = phi float [ %911, %else123 ], [ %809, %if119 ]
>   %OUT11.x.2 = phi float [ %844, %else123 ], [ %742, %if119 ]
>   %OUT11.y.2 = phi float [ %878, %else123 ], [ %776, %if119 ]
>   %OUT11.z.2 = phi float [ %912, %else123 ], [ %810, %if119 ]
>   %OUT12.x.2 = phi float [ %845, %else123 ], [ %743, %if119 ]
>   %OUT12.y.2 = phi float [ %879, %else123 ], [ %777, %if119 ]
>   %OUT12.z.2 = phi float [ %913, %else123 ], [ %811, %if119 ]
>   %OUT13.x.2 = phi float [ %846, %else123 ], [ %744, %if119 ]
>   %OUT13.y.2 = phi float [ %880, %else123 ], [ %778, %if119 ]
>   %OUT13.z.2 = phi float [ %914, %else123 ], [ %812, %if119 ]
>   %OUT14.x.2 = phi float [ %847, %else123 ], [ %745, %if119 ]
>   %OUT14.y.2 = phi float [ %881, %else123 ], [ %779, %if119 ]
>   %OUT14.z.2 = phi float [ %915, %else123 ], [ %813, %if119 ]
>   %OUT15.x.2 = phi float [ %848, %else123 ], [ %746, %if119 ]
>   %OUT15.y.2 = phi float [ %882, %else123 ], [ %780, %if119 ]
>   %OUT15.z.2 = phi float [ %916, %else123 ], [ %814, %if119 ]
>   %OUT16.x.2 = phi float [ %849, %else123 ], [ %747, %if119 ]
>   %OUT16.y.2 = phi float [ %883, %else123 ], [ %781, %if119 ]
>   %OUT16.z.2 = phi float [ %917, %else123 ], [ %815, %if119 ]
>   %OUT17.x.2 = phi float [ %850, %else123 ], [ %748, %if119 ]
>   %OUT17.y.2 = phi float [ %884, %else123 ], [ %782, %if119 ]
>   %OUT17.z.2 = phi float [ %918, %else123 ], [ %816, %if119 ]
>   %OUT18.x.2 = phi float [ %851, %else123 ], [ %749, %if119 ]
>   %OUT18.y.2 = phi float [ %885, %else123 ], [ %783, %if119 ]
>   %OUT18.z.2 = phi float [ %919, %else123 ], [ %817, %if119 ]
>   %OUT19.x.2 = phi float [ %852, %else123 ], [ %750, %if119 ]
>   %OUT19.y.2 = phi float [ %886, %else123 ], [ %784, %if119 ]
>   %OUT19.z.2 = phi float [ %920, %else123 ], [ %818, %if119 ]
>   %OUT20.x.2 = phi float [ %853, %else123 ], [ %751, %if119 ]
>   %OUT20.y.2 = phi float [ %887, %else123 ], [ %785, %if119 ]
>   %OUT20.z.2 = phi float [ %921, %else123 ], [ %819, %if119 ]
>   %OUT21.x.2 = phi float [ %854, %else123 ], [ %752, %if119 ]
>   %OUT21.y.2 = phi float [ %888, %else123 ], [ %786, %if119 ]
>   %OUT21.z.2 = phi float [ %922, %else123 ], [ %820, %if119 ]
>   %OUT22.x.2 = phi float [ %855, %else123 ], [ %753, %if119 ]
>   %OUT22.y.2 = phi float [ %889, %else123 ], [ %787, %if119 ]
>   %OUT22.z.2 = phi float [ %923, %else123 ], [ %821, %if119 ]
>   %OUT23.x.2 = phi float [ %856, %else123 ], [ %754, %if119 ]
>   %OUT23.y.2 = phi float [ %890, %else123 ], [ %788, %if119 ]
>   %OUT23.z.2 = phi float [ %924, %else123 ], [ %822, %if119 ]
>   %OUT24.x.2 = phi float [ %857, %else123 ], [ %755, %if119 ]
>   %OUT24.y.2 = phi float [ %891, %else123 ], [ %789, %if119 ]
>   %OUT24.z.2 = phi float [ %925, %else123 ], [ %823, %if119 ]
>   %OUT25.x.2 = phi float [ %858, %else123 ], [ %756, %if119 ]
>   %OUT25.y.2 = phi float [ %892, %else123 ], [ %790, %if119 ]
>   %OUT25.z.2 = phi float [ %926, %else123 ], [ %824, %if119 ]
>   %OUT26.x.2 = phi float [ %859, %else123 ], [ %757, %if119 ]
>   %OUT26.y.2 = phi float [ %893, %else123 ], [ %791, %if119 ]
>   %OUT26.z.2 = phi float [ %927, %else123 ], [ %825, %if119 ]
>   %OUT27.x.2 = phi float [ %860, %else123 ], [ %758, %if119 ]
>   %OUT27.y.2 = phi float [ %894, %else123 ], [ %792, %if119 ]
>   %OUT27.z.2 = phi float [ %928, %else123 ], [ %826, %if119 ]
>   %OUT28.x.2 = phi float [ %861, %else123 ], [ %759, %if119 ]
>   %OUT28.y.2 = phi float [ %895, %else123 ], [ %793, %if119 ]
>   %OUT28.z.2 = phi float [ %929, %else123 ], [ %827, %if119 ]
>   %OUT29.x.2 = phi float [ %862, %else123 ], [ %760, %if119 ]
>   %OUT29.y.2 = phi float [ %896, %else123 ], [ %794, %if119 ]
>   %OUT29.z.2 = phi float [ %930, %else123 ], [ %828, %if119 ]
>   %OUT30.x.2 = phi float [ %863, %else123 ], [ %761, %if119 ]
>   %OUT30.y.2 = phi float [ %897, %else123 ], [ %795, %if119 ]
>   %OUT30.z.2 = phi float [ %931, %else123 ], [ %829, %if119 ]
>   %OUT31.x.2 = phi float [ %864, %else123 ], [ %762, %if119 ]
>   %OUT31.y.2 = phi float [ %898, %else123 ], [ %796, %if119 ]
>   %OUT31.z.2 = phi float [ %932, %else123 ], [ %830, %if119 ]
>   %OUT32.x.2 = phi float [ %865, %else123 ], [ %763, %if119 ]
>   %OUT32.y.2 = phi float [ %899, %else123 ], [ %797, %if119 ]
>   %OUT32.z.2 = phi float [ %933, %else123 ], [ %831, %if119 ]
>   %OUT1.x.2 = phi float [ %834, %else123 ], [ %732, %if119 ]
>   %.sink886 = add i32 %729, 1
>   %array_vector759 = insertelement <33 x float> undef, float %594, i32 0
>   %array_vector791 = shufflevector <33 x float> %array_vector759, <33 x float> <float undef, float 0x3FF4CCCCC0000000, float 0x4002666660000000, float 0x400A666660000000, float 0x4011333340000000, float 0x4015333340000000, float 0x4019333340000000, float 0x401D333340000000, float 0x40209999A0000000, float 0x40229999A0000000, float 0x40249999A0000000, float 0x40269999A0000000, float 0x40289999A0000000, float 0x402A9999A0000000, float 0x402C9999A0000000, float 0x402E9999A0000000, float 0x40304CCCC0000000, float 0x40314CCCC0000000, float 0x40324CCCC0000000, float 0x40334CCCC0000000, float 0x40344CCCC0000000, float 0x40354CCCC0000000, float 0x40364CCCC0000000, float 0x40374CCCC0000000, float 0x40384CCCC0000000, float 0x40394CCCC0000000, float 0x403A4CCCC0000000, float 0x403B4CCCC0000000, float 0x403C4CCCC0000000, float 0x403D4CCCC0000000, float 0x403E4CCCC0000000, float 0x403F4CCCC0000000, float 0x4040266660000000>, <33 x i32> <i32 0, i32 34, i32 35, i32 36, i32 37, i32 38, i32 39, i32 40, i32 41, i32 42, i32 43, i32 44, i32 45, i32 46, i32 47, i32 48, i32 49, i32 50, i32 51, i32 52, i32 53, i32 54, i32 55, i32 56, i32 57, i32 58, i32 59, i32 60, i32 61, i32 62, i32 63, i32 64, i32 65>
>   %934 = insertelement <33 x float> %array_vector791, float 0x3FD3333340000000, i32 %.sink886
>   %935 = shl i32 %10, 2
>   %936 = call i32 @llvm.SI.buffer.load.dword.i32.i32(<16 x i8> %16, i32 %935, i32 0, i32 0, i32 1, i32 0, i32 1, i32 0, i32 0)
>   %937 = shl i32 %10, 2
>   %938 = call i32 @llvm.SI.buffer.load.dword.i32.i32(<16 x i8> %16, i32 %937, i32 256, i32 0, i32 1, i32 0, i32 1, i32 0, i32 0)
>   %939 = shl i32 %10, 2
>   %940 = call i32 @llvm.SI.buffer.load.dword.i32.i32(<16 x i8> %16, i32 %939, i32 512, i32 0, i32 1, i32 0, i32 1, i32 0, i32 0)
>   %941 = shl i32 %10, 2
>   %942 = call i32 @llvm.SI.buffer.load.dword.i32.i32(<16 x i8> %16, i32 %941, i32 768, i32 0, i32 1, i32 0, i32 1, i32 0, i32 0)
>   call void @llvm.AMDGPU.kill(float 1.000000e+00)
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %936, i32 1, i32 8, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %938, i32 1, i32 20, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %940, i32 1, i32 32, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %942, i32 1, i32 44, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %943 = bitcast float %OUT1.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %943, i32 1, i32 56, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %944 = bitcast float %OUT1.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %944, i32 1, i32 68, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %945 = bitcast float %OUT1.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %945, i32 1, i32 80, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc950 = bitcast <33 x float> %934 to <33 x i32>
>   %946 = extractelement <33 x i32> %bc950, i32 1
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %946, i32 1, i32 92, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %947 = bitcast float %OUT2.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %947, i32 1, i32 104, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %948 = bitcast float %OUT2.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %948, i32 1, i32 116, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %949 = bitcast float %OUT2.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %949, i32 1, i32 128, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc951 = bitcast <33 x float> %934 to <33 x i32>
>   %950 = extractelement <33 x i32> %bc951, i32 2
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %950, i32 1, i32 140, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %951 = bitcast float %OUT3.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %951, i32 1, i32 152, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %952 = bitcast float %OUT3.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %952, i32 1, i32 164, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %953 = bitcast float %OUT3.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %953, i32 1, i32 176, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc952 = bitcast <33 x float> %934 to <33 x i32>
>   %954 = extractelement <33 x i32> %bc952, i32 3
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %954, i32 1, i32 188, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %955 = bitcast float %OUT4.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %955, i32 1, i32 200, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %956 = bitcast float %OUT4.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %956, i32 1, i32 212, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %957 = bitcast float %OUT4.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %957, i32 1, i32 224, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc953 = bitcast <33 x float> %934 to <33 x i32>
>   %958 = extractelement <33 x i32> %bc953, i32 4
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %958, i32 1, i32 236, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %959 = bitcast float %OUT5.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %959, i32 1, i32 248, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %960 = bitcast float %OUT5.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %960, i32 1, i32 260, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %961 = bitcast float %OUT5.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %961, i32 1, i32 272, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc954 = bitcast <33 x float> %934 to <33 x i32>
>   %962 = extractelement <33 x i32> %bc954, i32 5
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %962, i32 1, i32 284, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %963 = bitcast float %OUT6.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %963, i32 1, i32 296, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %964 = bitcast float %OUT6.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %964, i32 1, i32 308, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %965 = bitcast float %OUT6.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %965, i32 1, i32 320, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc955 = bitcast <33 x float> %934 to <33 x i32>
>   %966 = extractelement <33 x i32> %bc955, i32 6
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %966, i32 1, i32 332, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %967 = bitcast float %OUT7.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %967, i32 1, i32 344, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %968 = bitcast float %OUT7.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %968, i32 1, i32 356, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %969 = bitcast float %OUT7.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %969, i32 1, i32 368, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc956 = bitcast <33 x float> %934 to <33 x i32>
>   %970 = extractelement <33 x i32> %bc956, i32 7
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %970, i32 1, i32 380, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %971 = bitcast float %OUT8.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %971, i32 1, i32 392, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %972 = bitcast float %OUT8.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %972, i32 1, i32 404, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %973 = bitcast float %OUT8.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %973, i32 1, i32 416, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc957 = bitcast <33 x float> %934 to <33 x i32>
>   %974 = extractelement <33 x i32> %bc957, i32 8
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %974, i32 1, i32 428, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %975 = bitcast float %OUT9.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %975, i32 1, i32 440, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %976 = bitcast float %OUT9.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %976, i32 1, i32 452, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %977 = bitcast float %OUT9.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %977, i32 1, i32 464, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc958 = bitcast <33 x float> %934 to <33 x i32>
>   %978 = extractelement <33 x i32> %bc958, i32 9
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %978, i32 1, i32 476, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %979 = bitcast float %OUT10.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %979, i32 1, i32 488, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %980 = bitcast float %OUT10.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %980, i32 1, i32 500, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %981 = bitcast float %OUT10.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %981, i32 1, i32 512, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc959 = bitcast <33 x float> %934 to <33 x i32>
>   %982 = extractelement <33 x i32> %bc959, i32 10
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %982, i32 1, i32 524, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %983 = bitcast float %OUT11.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %983, i32 1, i32 536, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %984 = bitcast float %OUT11.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %984, i32 1, i32 548, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %985 = bitcast float %OUT11.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %985, i32 1, i32 560, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc960 = bitcast <33 x float> %934 to <33 x i32>
>   %986 = extractelement <33 x i32> %bc960, i32 11
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %986, i32 1, i32 572, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %987 = bitcast float %OUT12.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %987, i32 1, i32 584, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %988 = bitcast float %OUT12.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %988, i32 1, i32 596, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %989 = bitcast float %OUT12.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %989, i32 1, i32 608, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc961 = bitcast <33 x float> %934 to <33 x i32>
>   %990 = extractelement <33 x i32> %bc961, i32 12
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %990, i32 1, i32 620, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %991 = bitcast float %OUT13.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %991, i32 1, i32 632, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %992 = bitcast float %OUT13.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %992, i32 1, i32 644, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %993 = bitcast float %OUT13.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %993, i32 1, i32 656, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc962 = bitcast <33 x float> %934 to <33 x i32>
>   %994 = extractelement <33 x i32> %bc962, i32 13
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %994, i32 1, i32 668, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %995 = bitcast float %OUT14.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %995, i32 1, i32 680, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %996 = bitcast float %OUT14.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %996, i32 1, i32 692, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %997 = bitcast float %OUT14.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %997, i32 1, i32 704, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc963 = bitcast <33 x float> %934 to <33 x i32>
>   %998 = extractelement <33 x i32> %bc963, i32 14
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %998, i32 1, i32 716, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %999 = bitcast float %OUT15.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %999, i32 1, i32 728, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1000 = bitcast float %OUT15.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1000, i32 1, i32 740, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1001 = bitcast float %OUT15.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1001, i32 1, i32 752, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc964 = bitcast <33 x float> %934 to <33 x i32>
>   %1002 = extractelement <33 x i32> %bc964, i32 15
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1002, i32 1, i32 764, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1003 = bitcast float %OUT16.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1003, i32 1, i32 776, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1004 = bitcast float %OUT16.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1004, i32 1, i32 788, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1005 = bitcast float %OUT16.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1005, i32 1, i32 800, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc965 = bitcast <33 x float> %934 to <33 x i32>
>   %1006 = extractelement <33 x i32> %bc965, i32 16
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1006, i32 1, i32 812, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1007 = bitcast float %OUT17.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1007, i32 1, i32 824, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1008 = bitcast float %OUT17.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1008, i32 1, i32 836, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1009 = bitcast float %OUT17.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1009, i32 1, i32 848, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc966 = bitcast <33 x float> %934 to <33 x i32>
>   %1010 = extractelement <33 x i32> %bc966, i32 17
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1010, i32 1, i32 860, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1011 = bitcast float %OUT18.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1011, i32 1, i32 872, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1012 = bitcast float %OUT18.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1012, i32 1, i32 884, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1013 = bitcast float %OUT18.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1013, i32 1, i32 896, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc967 = bitcast <33 x float> %934 to <33 x i32>
>   %1014 = extractelement <33 x i32> %bc967, i32 18
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1014, i32 1, i32 908, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1015 = bitcast float %OUT19.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1015, i32 1, i32 920, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1016 = bitcast float %OUT19.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1016, i32 1, i32 932, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1017 = bitcast float %OUT19.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1017, i32 1, i32 944, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc968 = bitcast <33 x float> %934 to <33 x i32>
>   %1018 = extractelement <33 x i32> %bc968, i32 19
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1018, i32 1, i32 956, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1019 = bitcast float %OUT20.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1019, i32 1, i32 968, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1020 = bitcast float %OUT20.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1020, i32 1, i32 980, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1021 = bitcast float %OUT20.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1021, i32 1, i32 992, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc969 = bitcast <33 x float> %934 to <33 x i32>
>   %1022 = extractelement <33 x i32> %bc969, i32 20
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1022, i32 1, i32 1004, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1023 = bitcast float %OUT21.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1023, i32 1, i32 1016, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1024 = bitcast float %OUT21.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1024, i32 1, i32 1028, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1025 = bitcast float %OUT21.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1025, i32 1, i32 1040, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc970 = bitcast <33 x float> %934 to <33 x i32>
>   %1026 = extractelement <33 x i32> %bc970, i32 21
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1026, i32 1, i32 1052, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1027 = bitcast float %OUT22.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1027, i32 1, i32 1064, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1028 = bitcast float %OUT22.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1028, i32 1, i32 1076, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1029 = bitcast float %OUT22.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1029, i32 1, i32 1088, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc971 = bitcast <33 x float> %934 to <33 x i32>
>   %1030 = extractelement <33 x i32> %bc971, i32 22
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1030, i32 1, i32 1100, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1031 = bitcast float %OUT23.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1031, i32 1, i32 1112, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1032 = bitcast float %OUT23.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1032, i32 1, i32 1124, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1033 = bitcast float %OUT23.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1033, i32 1, i32 1136, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc972 = bitcast <33 x float> %934 to <33 x i32>
>   %1034 = extractelement <33 x i32> %bc972, i32 23
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1034, i32 1, i32 1148, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1035 = bitcast float %OUT24.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1035, i32 1, i32 1160, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1036 = bitcast float %OUT24.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1036, i32 1, i32 1172, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1037 = bitcast float %OUT24.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1037, i32 1, i32 1184, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc973 = bitcast <33 x float> %934 to <33 x i32>
>   %1038 = extractelement <33 x i32> %bc973, i32 24
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1038, i32 1, i32 1196, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1039 = bitcast float %OUT25.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1039, i32 1, i32 1208, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1040 = bitcast float %OUT25.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1040, i32 1, i32 1220, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1041 = bitcast float %OUT25.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1041, i32 1, i32 1232, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc974 = bitcast <33 x float> %934 to <33 x i32>
>   %1042 = extractelement <33 x i32> %bc974, i32 25
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1042, i32 1, i32 1244, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1043 = bitcast float %OUT26.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1043, i32 1, i32 1256, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1044 = bitcast float %OUT26.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1044, i32 1, i32 1268, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1045 = bitcast float %OUT26.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1045, i32 1, i32 1280, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc975 = bitcast <33 x float> %934 to <33 x i32>
>   %1046 = extractelement <33 x i32> %bc975, i32 26
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1046, i32 1, i32 1292, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1047 = bitcast float %OUT27.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1047, i32 1, i32 1304, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1048 = bitcast float %OUT27.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1048, i32 1, i32 1316, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1049 = bitcast float %OUT27.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1049, i32 1, i32 1328, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc976 = bitcast <33 x float> %934 to <33 x i32>
>   %1050 = extractelement <33 x i32> %bc976, i32 27
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1050, i32 1, i32 1340, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1051 = bitcast float %OUT28.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1051, i32 1, i32 1352, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1052 = bitcast float %OUT28.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1052, i32 1, i32 1364, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1053 = bitcast float %OUT28.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1053, i32 1, i32 1376, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc977 = bitcast <33 x float> %934 to <33 x i32>
>   %1054 = extractelement <33 x i32> %bc977, i32 28
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1054, i32 1, i32 1388, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1055 = bitcast float %OUT29.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1055, i32 1, i32 1400, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1056 = bitcast float %OUT29.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1056, i32 1, i32 1412, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1057 = bitcast float %OUT29.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1057, i32 1, i32 1424, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc978 = bitcast <33 x float> %934 to <33 x i32>
>   %1058 = extractelement <33 x i32> %bc978, i32 29
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1058, i32 1, i32 1436, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1059 = bitcast float %OUT30.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1059, i32 1, i32 1448, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1060 = bitcast float %OUT30.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1060, i32 1, i32 1460, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1061 = bitcast float %OUT30.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1061, i32 1, i32 1472, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc979 = bitcast <33 x float> %934 to <33 x i32>
>   %1062 = extractelement <33 x i32> %bc979, i32 30
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1062, i32 1, i32 1484, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1063 = bitcast float %OUT31.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1063, i32 1, i32 1496, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1064 = bitcast float %OUT31.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1064, i32 1, i32 1508, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1065 = bitcast float %OUT31.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1065, i32 1, i32 1520, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc980 = bitcast <33 x float> %934 to <33 x i32>
>   %1066 = extractelement <33 x i32> %bc980, i32 31
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1066, i32 1, i32 1532, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1067 = bitcast float %OUT32.x.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1067, i32 1, i32 1544, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1068 = bitcast float %OUT32.y.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1068, i32 1, i32 1556, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %1069 = bitcast float %OUT32.z.2 to i32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1069, i32 1, i32 1568, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   %bc981 = bitcast <33 x float> %934 to <33 x i32>
>   %1070 = extractelement <33 x i32> %bc981, i32 32
>   call void @llvm.SI.tbuffer.store.i32(<16 x i8> %18, i32 %1070, i32 1, i32 1580, i32 %5, i32 0, i32 4, i32 4, i32 1, i32 0, i32 1, i32 1, i32 0)
>   call void @llvm.SI.sendmsg(i32 34, i32 %6)
>   call void @llvm.SI.sendmsg(i32 3, i32 %6)
>   ret void
> }
>
> ; Function Attrs: nounwind readnone
> declare float @llvm.SI.load.const(<16 x i8>, i32) #0
>
> ; Function Attrs: nounwind readonly
> declare i32 @llvm.SI.buffer.load.dword.i32.i32(<16 x i8>, i32, i32, i32, i32, i32, i32, i32, i32) #1
>
> ; Function Attrs: nounwind
> declare void @llvm.AMDGPU.kill(float) #2
>
> ; Function Attrs: nounwind
> declare void @llvm.SI.tbuffer.store.i32(<16 x i8>, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32) #2
>
> ; Function Attrs: nounwind
> declare void @llvm.SI.sendmsg(i32, i32) #2
>
> attributes #0 = { nounwind readnone }
> attributes #1 = { nounwind readonly }
> attributes #2 = { nounwind }
>
> !0 = !{}
>
> shader_runner: /home/daenzer/src/llvm-git/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp:289: virtual void llvm::SIRegisterInfo::resolveFrameIndex(llvm::MachineInstr&, unsigned int, int64_t) const: Assertion `isUInt<12>(NewOffset) && "offset should be legal"' failed.
>
> Thread 1 "shader_runner" received signal SIGABRT, Aborted.
> __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:58
> 58	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
> (gdb) bt
> #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:58
> #1  0x00007ffff4fab40a in __GI_abort () at abort.c:89
> #2  0x00007ffff4fa2e47 in __assert_fail_base (fmt=<optimized out>, assertion=assertion@entry=0x7fffef848370 "isUInt<12>(NewOffset) && \"offset should be legal\"",
>     file=file@entry=0x7fffef8482c0 "/home/daenzer/src/llvm-git/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp", line=line@entry=289,
>     function=function@entry=0x7fffef849480 <llvm::SIRegisterInfo::resolveFrameIndex(llvm::MachineInstr&, unsigned int, long) const::__PRETTY_FUNCTION__> "virtual void llvm::SIRegisterInfo::resolveFrameIndex(llvm::MachineInstr&, unsigned int, int64_t) const")
>     at assert.c:92
> #3  0x00007ffff4fa2ef2 in __GI___assert_fail (assertion=assertion@entry=0x7fffef848370 "isUInt<12>(NewOffset) && \"offset should be legal\"", file=file@entry=0x7fffef8482c0 "/home/daenzer/src/llvm-git/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp",
>     line=line@entry=289,
>     function=function@entry=0x7fffef849480 <llvm::SIRegisterInfo::resolveFrameIndex(llvm::MachineInstr&, unsigned int, long) const::__PRETTY_FUNCTION__> "virtual void llvm::SIRegisterInfo::resolveFrameIndex(llvm::MachineInstr&, unsigned int, int64_t) const")
>     at assert.c:101
> #4  0x00007fffeeb2ea9e in llvm::SIRegisterInfo::resolveFrameIndex (this=<optimized out>, MI=..., BaseReg=2147485907, Offset=<optimized out>) at /home/daenzer/src/llvm-git/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp:289
> #5  0x00007fffedb5da68 in (anonymous namespace)::LocalStackSlotPass::insertFrameReferenceRegisters (this=this@entry=0x555555d7dba0, Fn=...) at /home/daenzer/src/llvm-git/llvm/lib/CodeGen/LocalStackSlotAllocation.cpp:425
> #6  0x00007fffedb61420 in (anonymous namespace)::LocalStackSlotPass::runOnMachineFunction (this=0x555555d7dba0, MF=...) at /home/daenzer/src/llvm-git/llvm/lib/CodeGen/LocalStackSlotAllocation.cpp:130
> #7  0x00007fffedba9414 in llvm::MachineFunctionPass::runOnFunction (this=0x555555d7dba0, F=...) at /home/daenzer/src/llvm-git/llvm/lib/CodeGen/MachineFunctionPass.cpp:62
> #8  0x00007fffeda0ca19 in llvm::FPPassManager::runOnFunction (this=0x555555eea210, F=...) at /home/daenzer/src/llvm-git/llvm/lib/IR/LegacyPassManager.cpp:1513
> #9  0x00007fffeda0cabc in llvm::FPPassManager::runOnModule (this=0x555555eea210, M=...) at /home/daenzer/src/llvm-git/llvm/lib/IR/LegacyPassManager.cpp:1534
> #10 0x00007fffeda0d5ec in (anonymous namespace)::MPPassManager::runOnModule (M=..., this=<optimized out>) at /home/daenzer/src/llvm-git/llvm/lib/IR/LegacyPassManager.cpp:1590
> #11 llvm::legacy::PassManagerImpl::run (this=0x555555d80760, M=...) at /home/daenzer/src/llvm-git/llvm/lib/IR/LegacyPassManager.cpp:1693
> #12 0x00007fffeda0d7de in llvm::legacy::PassManager::run (this=this@entry=0x7fffffff4dd0, M=...) at /home/daenzer/src/llvm-git/llvm/lib/IR/LegacyPassManager.cpp:1724
> #13 0x00007fffee8c4862 in LLVMTargetMachineEmit (T=T@entry=0x555555900940, M=M@entry=0x555555d7b780, OS=..., codegen=codegen@entry=LLVMObjectFile, ErrorMessage=ErrorMessage@entry=0x7fffffff50a0)
>     at /home/daenzer/src/llvm-git/llvm/lib/Target/TargetMachineC.cpp:204
> #14 0x00007fffee8c4a4c in LLVMTargetMachineEmitToMemoryBuffer (T=T@entry=0x555555900940, M=M@entry=0x555555d7b780, codegen=codegen@entry=LLVMObjectFile, ErrorMessage=ErrorMessage@entry=0x7fffffff50a0, OutMemBuf=OutMemBuf@entry=0x7fffffff50a8)
>     at /home/daenzer/src/llvm-git/llvm/lib/Target/TargetMachineC.cpp:228
> #15 0x00007ffff1063046 in si_llvm_compile (M=M@entry=0x555555d7b780, binary=binary@entry=0x555555dbd680, tm=tm@entry=0x555555900940, debug=debug@entry=0x555555d764b0) at ../../../../../src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c:224
> #16 0x00007ffff105c8d7 in si_compile_llvm (sscreen=sscreen@entry=0x5555557f13f0, binary=binary@entry=0x555555dbd680, conf=conf@entry=0x555555dbd6e0, tm=tm@entry=0x555555900940, mod=mod@entry=0x555555d7b780, debug=debug@entry=0x555555d764b0, processor=2,
>     name=0x7ffff12704ce "TGSI shader") at ../../../../../src/gallium/drivers/radeonsi/si_shader.c:6115
> #17 0x00007ffff105d590 in si_compile_tgsi_shader (sscreen=sscreen@entry=0x5555557f13f0, tm=tm@entry=0x555555900940, shader=shader@entry=0x555555dbd580, is_monolithic=is_monolithic@entry=false, debug=debug@entry=0x555555d764b0)
>     at ../../../../../src/gallium/drivers/radeonsi/si_shader.c:7261
> #18 0x00007ffff1076c53 in si_init_shader_selector_async (job=job@entry=0x555555d76440, thread_index=thread_index@entry=-1) at ../../../../../src/gallium/drivers/radeonsi/si_state_shaders.c:1328
> #19 0x00007ffff1077b14 in si_create_shader_selector (ctx=0x555555799720, state=<optimized out>) at ../../../../../src/gallium/drivers/radeonsi/si_state_shaders.c:1631
> #20 0x00007ffff0b7e714 in st_get_basic_variant (variants=0x555555de19c8, tgsi=0x555555de1788, pipe_shader=2, st=0x0) at ../../../src/mesa/state_tracker/st_program.c:1546
> #21 st_precompile_shader_variant (st=st@entry=0x555555935220, prog=prog@entry=0x555555de1320) at ../../../src/mesa/state_tracker/st_program.c:1927
> #22 0x00007ffff0b22b92 in st_program_string_notify (ctx=<optimized out>, target=<optimized out>, prog=0x555555de1320) at ../../../src/mesa/state_tracker/st_cb_program.c:280
> #23 0x00007ffff0b64229 in st_link_shader (ctx=<optimized out>, prog=<optimized out>) at ../../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:6893
> #24 0x00007ffff0b8fa45 in _mesa_glsl_link_shader (ctx=ctx@entry=0x555555901440, prog=prog@entry=0x555555daaf50) at ../../../src/mesa/program/ir_to_mesa.cpp:3066
> #25 0x00007ffff0a2f54d in _mesa_link_program (ctx=0x555555901440, shProg=0x555555daaf50) at ../../../src/mesa/main/shaderapi.c:1089
> #26 0x00007ffff7ac8765 in stub_glLinkProgram (program=4) at /home/daenzer/src/piglit-git/piglit/tests/util/piglit-dispatch-gen.c:33005
> #27 0x000055555555ce6c in link_and_use_shaders () at /home/daenzer/src/piglit-git/piglit/tests/shaders/shader_runner.c:1040
> #28 0x00005555555656cb in init_test (file=0x7fffffffe9d8 "/home/daenzer/src/piglit-git/piglit/tests/spec/glsl-1.50/execution/variable-indexing/gs-output-array-vec4-index-wr.shader_test") at /home/daenzer/src/piglit-git/piglit/tests/shaders/shader_runner.c:3686
> #29 0x0000555555566311 in piglit_init (argc=2, argv=0x7fffffffe698) at /home/daenzer/src/piglit-git/piglit/tests/shaders/shader_runner.c:4012
> #30 0x00007ffff7b39331 in run_test (gl_fw=0x555555780c20, argc=2, argv=0x7fffffffe698) at /home/daenzer/src/piglit-git/piglit/tests/util/piglit-framework-gl/piglit_winsys_framework.c:73
> #31 0x00007ffff7b1de6c in piglit_gl_test_run (argc=2, argv=0x7fffffffe698, config=0x7fffffffe560) at /home/daenzer/src/piglit-git/piglit/tests/util/piglit-framework-gl.c:203
> #32 0x000055555555b01c in main (argc=2, argv=0x7fffffffe698) at /home/daenzer/src/piglit-git/piglit/tests/shaders/shader_runner.c:60
>
>

