<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Jan 22, 2016 at 5:35 PM, Matt Turner <span dir="ltr"><<a href="mailto:mattst88@gmail.com" target="_blank">mattst88@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Thu, Jan 21, 2016 at 4:37 PM, Kenneth Graunke <<a href="mailto:kenneth@whitecape.org">kenneth@whitecape.org</a>> wrote:<br>
> nir_lower_alu_to_scalar() and nir_lower_load_const_to_scalar()<br>
> handle most cases quite well. They also create nir_ssa_defs rather<br>
> than ir_variables, which are much less memory intensive.<br>
><br>
> This can mean losing out on a few GLSL IR optimizations, however.<br>
> In most cases, this is fine. But a few cases still benefit:<br>
><br>
> - add/mul/dot still benefit from opt_algebraic()'s constant<br>
> reassociation capabilities.<br>
><br>
> - min/max still benefit from opt_minmax().<br>
><br>
> - comparisons seem to still benefit from opt_algebraic(), even<br>
> though we also do most of them in nir_opt_algebraic_late().<br>
><br>
> With this change, shader-db statistics on Skylake are:<br>
><br>
> total instructions in shared programs: 9107924 -> 9107347 (-0.01%)<br>
> instructions in affected programs: 188830 -> 188253 (-0.31%)<br>
> helped: 572<br>
> HURT: 154<br>
<br>
</span>I tried looking at the hurt programs. The first was<br>
guacamelee/368.shader_test, which went from 40 -> 43 instructions. It<br>
has a multiplication tree of (x * (y * 0.2) * 15.0), which we are able<br>
to convert into x * y * 3.0 after channel expressions.<br>
<br>
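As a rough sketch of the reassociation described above (a toy model in Python, not Mesa's opt_algebraic(); the tuple encoding and the names flatten_mul and reassociate_mul are made up for illustration), flattening the multiply tree lets the constants be folded together:<br>
<br>
<pre>
```python
from math import prod


def flatten_mul(expr):
    """Collect the leaves of a nested ("mul", lhs, rhs) tree, left to right."""
    if isinstance(expr, tuple) and expr[0] == "mul":
        return flatten_mul(expr[1]) + flatten_mul(expr[2])
    return [expr]


def reassociate_mul(expr):
    """Rebuild a multiply tree with all float constants folded into one."""
    leaves = flatten_mul(expr)
    consts = [leaf for leaf in leaves if isinstance(leaf, float)]
    others = [leaf for leaf in leaves if not isinstance(leaf, float)]

    # Chain the non-constant operands back together in their original order.
    tree = None
    for leaf in others:
        tree = leaf if tree is None else ("mul", tree, leaf)

    if not consts:
        return tree
    folded = prod(consts)
    return folded if tree is None else ("mul", tree, folded)


# (x * (y * 0.2)) * 15.0, the shape seen in guacamelee/368.shader_test
expr = ("mul", ("mul", "x", ("mul", "y", 0.2)), 15.0)
result = reassociate_mul(expr)
print(result)
```
</pre>
This only works when the whole chain is visible as a single expression tree, which is the situation after channel expressions. Reordering floating-point multiplies can also change rounding in general, so treat this as an illustration of the idea only.<br>
<br>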
The shader looks like<br>
<br>
star_1 = (thing0 * 0.2) * thing1<br>
star_1 = star_1 * 15.0<br>
<br>
star_2 = (thing2 * 0.2) * thing3<br>
star_2 = star_2 * 15.0<br>
<br>
star_3 = (thing4 * 0.2) * thing5<br>
star_3 = star_3 * 15.0<br>
<br>
stars = sat(star_1) + sat(star_2) + sat(star_3)<br>
<br>
Before linking and channel expressions, we don't have appropriate<br>
trees to see that we can reassociate constants. Tree grafting won't do<br>
its job on, for instance, star_1, because it sees that star_1 is<br>
assigned more than once -- even though it's just reassigning itself.<br>
<br>
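A simplified model of that limitation (Python, with a hypothetical helper graftable_vars; real tree grafting checks more than this, and the star_1a/star_1b names are invented for the example) counts assignments per variable and only considers variables assigned exactly once:<br>
<br>
<pre>
```python
from collections import Counter


def graftable_vars(assignments):
    """Return the variables assigned exactly once in a list of (lhs, rhs) pairs."""
    counts = Counter(lhs for lhs, _rhs in assignments)
    return {lhs for lhs, n in counts.items() if n == 1}


# star_1 is written twice, so this model refuses to graft it.
original = [
    ("star_1", "(thing0 * 0.2) * thing1"),
    ("star_1", "star_1 * 15.0"),
]

# The same computation with one temporary per assignment.
renamed = [
    ("star_1a", "(thing0 * 0.2) * thing1"),
    ("star_1b", "star_1a * 15.0"),
]

blocked = graftable_vars(original)
allowed = graftable_vars(renamed)
print(blocked, allowed)
```
</pre>
With the self-reassignment, nothing is eligible; with one name per assignment, both temporaries are, and the 0.2 and 15.0 could end up in the same tree.<br>
<br>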
Adding ir_unop_saturate back to channel_expressions solves the issue.<br>
I continued looking at the regressions and adding things back until I<br>
was down to 25 hurt programs (but also down to only 10 helped) after<br>
I'd readded sign, lrp, neg, f2i, i2f, exp2, log2, f2u, u2f.<br></blockquote><div><br></div><div>Adding those back in gets rid of hurt programs, but does adding them back in get rid of helped programs?<br></div><div>--Jason<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
I think that by the time I got down to zero regressions, the only<br>
opcodes I would not have re-added are the ones shader-db never uses.<br>
<br>
I think I want to hold off on this patch.<br>
<div class="HOEnZb"><div class="h5">
</div></div></blockquote></div><br></div></div>