<html>
<head>
<base href="https://bugs.freedesktop.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - Compute shaders generate stupid divides"
href="https://bugs.freedesktop.org/show_bug.cgi?id=98299">98299</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Compute shaders generate stupid divides
</td>
</tr>
<tr>
<th>Product</th>
<td>Mesa
</td>
</tr>
<tr>
<th>Version</th>
<td>git
</td>
</tr>
<tr>
<th>Hardware</th>
<td>Other
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>medium
</td>
</tr>
<tr>
<th>Component</th>
<td>Drivers/DRI/i965
</td>
</tr>
<tr>
<th>Assignee</th>
<td>idr@freedesktop.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>idr@freedesktop.org
</td>
</tr>
<tr>
<th>QA Contact</th>
<td>intel-3d-bugs@lists.freedesktop.org
</td>
</tr></table>
<p>
<div>
<pre>While working on GL_ARB_gpu_shader_int64 support, I noticed that compute
shaders with local_size_x = 1 or local_size_y = 1 can generate dumb divides.
For example,
#extension GL_ARB_gpu_shader_int64 : require
writeonly uniform image2D tex;
layout(local_size_x = 9) in;
uniform uint64_t arg0;
uniform uint64_t arg1;
void main()
{
vec4 tmp_color;
if((arg0 >= arg1))
tmp_color = vec4(1.0, 1.0, 0.0, 1.0);
else
tmp_color = vec4(0.0, 0.0, 1.0, 1.0);
ivec2 coord = ivec2(gl_GlobalInvocationID.xy);
imageStore(tex, coord, tmp_color);
}
generates:
Native code for unnamed compute shader GLSL2
SIMD16 shader: 52 instructions. 0 loops. 326 cycles. 0:0 spills:fills. Promoted
0 constants. Compacted 832 to 592 bytes (29%)
START B0
mov(16) g16<1>UD g0.1<0,1,0>UD { align1 1H
compacted };
mov(16) g18<1>UD g0.6<0,1,0>UD { align1 1H };
mov(16) g2<1>UD 0x00000000UD { align1 1H
compacted };
cmp.ge.f0(16) null<1>UQ g1<0,1,0>UQ g1.1<0,1,0>UQ { align1 1H };
mov(16) g4<1>D 1065353216D { align1 1H };
mov(8) g6<1>UW 0x76543210V { align1 WE_all
1Q };
mov(8) g22<1>UD 0D { align1 WE_all
1Q };
mov(8) g28<1>F 1F { align1 1Q };
mul(16) g14<1>D g16<8,8,1>D 9D { align1 1H
compacted };
(-f0) sel(16) g16<1>UD g2<8,8,1>UD 0x3f800000UD { align1 1H };
(-f0) sel(16) g29<1>UD g4<8,8,1>UD 0x00000000UD { align1 1H };
add(8) g6.8<1>UW g6<8,8,1>UW 0x0008UW { align1 WE_all
1Q };
mov(1) g22.7<1>UD -1D { align1 WE_all
};
mov(8) g25<1>F g16<8,8,1>F { align1 1Q
compacted };
mov(8) g26<1>F g16<8,8,1>F { align1 1Q
compacted };
mov(8) g27<1>F g29<8,8,1>F { align1 1Q
compacted };
mov(16) g2<1>UD g6<8,8,1>UW { align1 1H };
add(16) g4<1>D g2<8,8,1>D g1.5<0,1,0>D { align1 1H
compacted };
math intdiv(8) g6<1>D g4<8,8,1>D 1D { align1 1Q
compacted };
math intdiv(8) g7<1>D g5<8,8,1>D 1D { align1 2Q
compacted };
math intdiv(8) g8<1>D g4<8,8,1>D 9D { align1 1Q
compacted };
math intdiv(8) g9<1>D g5<8,8,1>D 9D { align1 2Q
compacted };
math intmod(8) g10<1>D g6<8,8,1>D 9D { align1 1Q
compacted };
math intmod(8) g11<1>D g7<8,8,1>D 9D { align1 2Q
compacted };
math intmod(8) g12<1>D g8<8,8,1>D 1D { align1 1Q
compacted };
math intmod(8) g13<1>D g9<8,8,1>D 1D { align1 2Q
compacted };
mov.nz.f0(16) null<1>D g10<8,8,1>D { align1 1H };
(+f0) xor.l.f0(16) null<1>D g6<8,8,1>D 9D { align1 1H
compacted };
(+f0) add(16) g10<1>D g10<8,8,1>D 9D { align1 1H
compacted };
add(16) g31<1>D g14<8,8,1>D g10<8,8,1>D { align1 1H
compacted };
mov.nz.f0(16) null<1>D g12<8,8,1>D { align1 1H };
mov(8) g23<1>UD g31<8,8,1>UD { align1 1Q
compacted };
(+f0) xor.l.f0(16) null<1>D g8<8,8,1>D 1D { align1 1H
compacted };
(+f0) add(16) g12<1>D g12<8,8,1>D 1D { align1 1H
compacted };
add(16) g20<1>D g18<8,8,1>D g12<8,8,1>D { align1 1H
compacted };
mov(8) g24<1>UD g20<8,8,1>UD { align1 1Q
compacted };
and(1) a0<1>UD g1.4<0,1,0>UD 0x000000ffUD { align1 WE_all
compacted };
or(1) a0<1>UD a0<0,1,0>UD 0x0e0b5000UD { align1 WE_all
};
send(8) null<1>UW g22<8,8,1>UD a0<0,1,0>UD
dp data 1 indirect {
align1 1Q };
mov(8) g2<1>UD 0D { align1 WE_all
2Q };
mov(8) g3<1>UD g32<8,8,1>UD { align1 2Q
compacted };
mov(8) g4<1>UD g21<8,8,1>UD { align1 2Q
compacted };
mov(8) g5<1>F g17<8,8,1>F { align1 2Q
compacted };
mov(8) g6<1>F g17<8,8,1>F { align1 2Q
compacted };
mov(8) g7<1>F g30<8,8,1>F { align1 2Q
compacted };
mov(8) g8<1>F 1F { align1 2Q };
mov(1) g2.7<1>UD -1D { align1 WE_all
};
and(1) a0<1>UD g1.4<0,1,0>UD 0x000000ffUD { align1 WE_all
};
or(1) a0<1>UD a0<0,1,0>UD 0x0e0b6000UD { align1 WE_all
};
send(8) null<1>UW g2<8,8,1>UD a0<0,1,0>UD
dp data 1 indirect {
align1 2Q };
mov(8) g127<1>UD g0<8,8,1>UD { align1 WE_all
1Q compacted };
send(16) null<1>UW g127<8,8,1>UD
thread_spawner mlen 1 rlen 0 {
align1 WE_all 1H EOT };
END B0
I only spent about 1 minute tracking this down, and it wasn't instantly obvious
what is generating this code. I wanted to make a note of it before I forgot.
:)</pre>
</div>
</p>
<hr>
<span>You are receiving this mail because:</span>
<ul>
<li>You are the QA Contact for the bug.</li>
</ul>
</body>
</html>