<div dir="ltr"><div class="gmail_quote"><div dir="ltr">On Wed, Jan 2, 2019 at 9:43 AM Ilia Mirkin <<a href="mailto:imirkin@alum.mit.edu">imirkin@alum.mit.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Have a look at the first 4 patches in the series from Jonathan Marek<br> to address some of these issues:<br> <br> <a href="https://patchwork.freedesktop.org/series/54295/" rel="noreferrer" target="_blank">https://patchwork.freedesktop.org/series/54295/</a><br> <br> Not sure exactly what state that work is in, but I've added Jonathan<br> to CC, perhaps he can provide an update.<br> <br> Cheers,<br> <br> -ilia<br> <br> On Wed, Jan 2, 2019 at 6:28 AM Qiang Yu <<a href="mailto:yuq825@gmail.com" target="_blank">yuq825@gmail.com</a>> wrote:<br> ><br> > Hi guys,<br> ><br> > I found the problem with this test fragment shader when lima development:<br> > uniform int color;<br> > void main() {<br> > if (color > 1)<br> > gl_FragColor = vec4(1.0, 0.0, 0.0, 1);<br> > else<br> > gl_FragColor = vec4(0.0, 1.0, 0.0, 1);<br> > }<br> ><br> > nir_print_shader output:<br> > impl main {<br> > block block_0:<br> > /* preds: */<br> > vec1 32 ssa_0 = load_const (0x00000001 /* 0.000000 */)<br> > vec4 32 ssa_1 = load_const (0x3f800000 /* 1.000000 */,<br> > 0x00000000 /* 0.000000 */, 0x00000000 /* 0.000000 */, 0x3f800000 /*<br> > 1.000000 */)<br> > vec4 32 ssa_2 = load_const (0x00000000 /* 0.000000 */,<br> > 0x3f800000 /* 1.000000 */, 0x00000000 /* 0.000000 */, 0x3f800000 /*<br> > 1.000000 */)<br> > vec1 32 ssa_3 = load_const (0x00000000 /* 0.000000 */)<br> > vec1 32 ssa_4 = intrinsic load_uniform (ssa_3) (0, 1, 0) /*<br> > base=0 */ /* range=1 */ /* component=0 */ /* color */<br> > vec1 32 ssa_5 = slt ssa_0, ssa_4<br> > vec1 32 ssa_6 = fnot ssa_5<br> > vec4 32 ssa_7 = bcsel ssa_6.xxxx, ssa_2, ssa_1<br> > intrinsic store_output (ssa_7, ssa_3) (0, 15, 0) /* base=0 */<br> > /* wrmask=xyzw */ /* component=0 */ /* gl_FragColor */<br> > /* succs: block_1 */<br> > block block_1:<br> > }<br> ><br> > ssa0 is not converted to float when glsl to nir. I see glsl_to_nir.cpp<br> > will create flt/ilt/ult<br> > based on source type for gpu support native integer, but for gpu not<br> > support native<br> > integer, just create slt for all source type. And in<br> > nir_lower_constant_initializers,<br> > there's also no type conversion for integer constant.<br></blockquote><div><br></div><div>This is a generally sticky issue. In NIR, we have no concept of types on SSA values which has proven perfectly reasonable and actually very powerful in a world where integers are supported natively. Unfortunately, it causes significant problems for float-only architectures. There are two general possible solutions:</div><div><br></div><div> 1. convert all integers to floats in glsl_to_nir and prog_to_nir and adjust various lowering/optimization passes to handle nir_compiler_options::supports_native_integers == false.</div><div><br></div><div> 2. Just allow integers all the way through until you get very close to the end and then lower integers to floats at the last possible moment.</div><div><br></div><div>Both of these come with significant issues. With the first approach, there are potentially a lot of passes that will need to be adjusted and it's not 100% clear what to do with UBO offsets and indirect addressing; fortunately, you should be able to disable most of those optimizations to get going so it shouldn't be too bad. The second would be less invasive to NIR because it doesn't require modifying as many passes. However, doing such a lowering would be very tricky to get right primarily because of constants. With everything else, you can just sort of assume that inputs are floats (you lowered, right?) and lower to produce a float output. With constants, however, you don't know whether or not it's an integer that needs lowering. We could, in theory, add an extra bit to load_const to solve this problem but there are also significant problems with doing that so it's not clear it's a good idea.</div><div><br></div><div>I think the patches from Marek (as indicated by ilia) attempt the first approach. If we can do it practically, my suspicion is that the first will work better than the second. However, it will take some experimentation in order to actually determine that.</div><div><br></div><div>--Jason<br></div></div></div>