[Mesa-dev] [PATCH 2/2] i965: Do channel expressions on significantly fewer opcodes.

Fri Jan 22 17:35:44 PST 2016

On Thu, Jan 21, 2016 at 4:37 PM, Kenneth Graunke <kenneth at whitecape.org> wrote:
> nir_lower_alu_to_scalar() and nir_lower_load_const_to_scalar()
> handle most cases quite well.  They also create nir_ssa_defs rather
> than ir_variables, which are much less memory intensive.
>
> This can mean losing out on a few GLSL IR optimizations, however.
> In most cases, this is fine.  But a few cases still benefit:
>
> - add/mul/dot still benefit from opt_algebraic()'s constant
>   reassociation capabilities.
>
> - min/max still benefit from opt_minmax().
>
> - comparisons seem to still benefit from opt_algebraic(), even
>   though we also do most of them in nir_opt_algebraic_late().
>
> With this change, shader-db statistics on Skylake are:
>
> total instructions in shared programs: 9107924 -> 9107347 (-0.01%)
> instructions in affected programs: 188830 -> 188253 (-0.31%)
> helped: 572
> HURT: 154

I tried looking at the hurt programs. The first was
guacamelee/368.shader_test, which went from 40 to 43 instructions. It
has a multiplication tree of (x * (y * 0.2) * 15.0), which we are able
to convert into x * y * 3.0 after channel expressions.
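
To make the reassociation concrete, here is a rough sketch in Python (not Mesa code). The Var/Const/Mul classes and the reassociate_mul() helper are made up for illustration; the real opt_algebraic() pass works on GLSL IR and handles far more than this. It only shows how flattening a multiply chain lets the two constants fold into one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    value: float


@dataclass(frozen=True)
class Mul:
    lhs: object
    rhs: object


def flatten_mul(expr):
    """Collect the leaf operands of a nested multiplication tree."""
    if isinstance(expr, Mul):
        return flatten_mul(expr.lhs) + flatten_mul(expr.rhs)
    return [expr]


def reassociate_mul(expr):
    """Fold all constants in a multiply chain into a single constant.

    Note: floating-point multiplication is not strictly associative, so
    this can change rounding. The sketch ignores that question.
    """
    if not isinstance(expr, Mul):
        return expr
    folded = 1.0
    result = None
    for leaf in flatten_mul(expr):
        if isinstance(leaf, Const):
            folded *= leaf.value
        else:
            result = leaf if result is None else Mul(result, leaf)
    if result is None:
        return Const(folded)
    return Mul(result, Const(folded))


def count_muls(expr):
    """Number of multiply nodes in the tree."""
    if isinstance(expr, Mul):
        return 1 + count_muls(expr.lhs) + count_muls(expr.rhs)
    return 0


# ((x * 0.2) * y) * 15.0, the shape seen once the whole tree is visible.
before = Mul(Mul(Mul(Var("x"), Const(0.2)), Var("y")), Const(15.0))
after = reassociate_mul(before)
print(count_muls(before), "->", count_muls(after))
```

The multiply count drops from three to two, with the constants combined into a single factor of about 3.0.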

The shader looks like

star_1 = (thing0 * 0.2) * thing1
star_1 = star_1 * 15.0

star_2 = (thing2 * 0.2) * thing3
star_2 = star_2 * 15.0

star_3 = (thing4 * 0.2) * thing5
star_3 = star_3 * 15.0

stars = sat(star_1) + sat(star_2) + sat(star_3)

Before linking and channel expressions, we don't have appropriate
trees to see that we can reassociate constants. Tree grafting won't do
its job on, for instance, star_1, because it sees that star_1 is
assigned more than once -- even though it's just reassigning itself.
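
As a rough illustration of that restriction, here is a Python sketch (again, not Mesa code). It assumes the only rule is "graft a variable only if it is assigned exactly once"; the real tree grafting pass checks more than this. The program list and the graftable() helper are hypothetical.

```python
from collections import Counter


def graftable(program):
    """Return the destinations that are assigned exactly once.

    program is a list of (destination, expression_text) pairs.
    """
    counts = Counter(dest for dest, _ in program)
    return {dest for dest, n in counts.items() if n == 1}


# The shader as written: each star_N is assigned twice.
program = [
    ("star_1", "(thing0 * 0.2) * thing1"),
    ("star_1", "star_1 * 15.0"),
    ("star_2", "(thing2 * 0.2) * thing3"),
    ("star_2", "star_2 * 15.0"),
    ("star_3", "(thing4 * 0.2) * thing5"),
    ("star_3", "star_3 * 15.0"),
    ("stars", "sat(star_1) + sat(star_2) + sat(star_3)"),
]

# Hypothetical variant where the first write goes to a fresh temporary,
# so every name is assigned once (only star_1 shown for brevity).
renamed = [
    ("tmp_1", "(thing0 * 0.2) * thing1"),
    ("star_1", "tmp_1 * 15.0"),
]

print(sorted(graftable(program)))
print(sorted(graftable(renamed)))
```

Under this simplified rule none of the star_N variables qualifies in the original form, so the multiply by 0.2 and the multiply by 15.0 never end up in one expression tree.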

Adding ir_unop_saturate back to channel_expressions solves the issue.
I continued looking at the regressions and adding things back until I
was down to 25 hurt programs (but also down to only 10 helped) after
I'd re-added sign, lrp, neg, f2i, i2f, exp2, log2, f2u, u2f.

I think that by the time I was down to zero regressions, the only
things I would not have re-added would be opcodes unused in shader-db.

I think I want to hold off on this patch.