[Mesa-dev] [PATCH 1/2] i965: Use nir_lower_load_const_to_scalar().
Kenneth Graunke
kenneth at whitecape.org
Sat Jan 23 00:19:50 PST 2016
On Friday, January 22, 2016 11:54:22 PM PST Jason Ekstrand wrote:
> On Jan 21, 2016 4:37 PM, "Kenneth Graunke" <kenneth at whitecape.org> wrote:
> >
> > I don't know why, but we never hooked up this pass Eric wrote.
> > Otherwise, you can end up with stupid scalarized code such as:
> >
> > vec4 ssa_7 = load_const (0.0, 0.0, 0.0, 0.0)
> > vec4 ssa_8 = ...
> > vec1 ssa_9 = feq ssa_8, ssa_7
> > vec1 ssa_10 = feq ssa_8.y, ssa_7.y
> > vec1 ssa_11 = feq ssa_8, ssa_7.z
> > vec1 ssa_12 = feq ssa_8.y, ssa_7.w
> >
> > ssa_8.xyxy == <0, 0, 0, 0> should only take two feq instructions.
> >
> > shader-db on Skylake:
> >
> > total instructions in shared programs: 9111788 -> 9111384 (-0.00%)
> > instructions in affected programs: 32421 -> 32017 (-1.25%)
> > helped: 277
> > HURT: 69
> >
> > total cycles in shared programs: 69221226 -> 69219394 (-0.00%)
> > cycles in affected programs: 917796 -> 915964 (-0.20%)
> > helped: 317
> > HURT: 408
> >
> > This also prevents regressions when disabling channel expressions.
> >
> > Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> > ---
> > src/mesa/drivers/dri/i965/brw_nir.c | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c
> > index 935529a..ce9b9db 100644
> > --- a/src/mesa/drivers/dri/i965/brw_nir.c
> > +++ b/src/mesa/drivers/dri/i965/brw_nir.c
> > @@ -482,6 +482,11 @@ brw_preprocess_nir(nir_shader *nir, bool is_scalar)
> >
> >     nir = nir_optimize(nir, is_scalar);
> >
> > +   if (is_scalar) {
> > +      OPT_V(nir_lower_load_const_to_scalar);
> > +      OPT(nir_opt_cse);
> > +   }
>
> Why did you choose to put this *after* the opt loop? It seems like we
> would want it before so that we can use better. As long as alu_to_scalar
> is run before constant folding (so we don't end up re-vectorizing them),
> before should be fine.

It's after the first invocation of the optimization loop, but before the
second (which isn't obvious from the diff).

I wanted it after nir_lower_vars_to_ssa so that constant initializers
for variables actually get turned into load_const instructions first.
Otherwise, we miss splitting them altogether.