[Mesa-dev] [PATCH 4/4] nir: link time opt duplicate varyings

Jason Ekstrand jason at jlekstrand.net
Tue Dec 18 04:38:39 UTC 2018


On Mon, Dec 10, 2018 at 10:28 PM Timothy Arceri <tarceri at itsqueeze.com>
wrote:

> If we are outputting the same value to more than one output
> component rewrite the inputs to read from a single component.
>
> This will allow the duplicate varying components to be optimised
> away by the existing opts.
>
> shader-db results i965 (SKL):
>
> total instructions in shared programs: 12869230 -> 12860886 (-0.06%)
> instructions in affected programs: 322601 -> 314257 (-2.59%)
> helped: 3080
> HURT: 8
>
> total cycles in shared programs: 317792574 -> 317730593 (-0.02%)
> cycles in affected programs: 2584925 -> 2522944 (-2.40%)
> helped: 2975
> HURT: 477
>
> shader-db results radeonsi (VEGA):
>
> Totals from affected shaders:
> SGPRS: 30960 -> 31056 (0.31 %)
> VGPRS: 17052 -> 16672 (-2.23 %)
> Spilled SGPRs: 184 -> 167 (-9.24 %)
> Spilled VGPRs: 0 -> 0 (0.00 %)
> Private memory VGPRs: 0 -> 0 (0.00 %)
> Scratch size: 0 -> 0 (0.00 %) dwords per thread
> Code Size: 562532 -> 549404 (-2.33 %) bytes
> LDS: 0 -> 0 (0.00 %) blocks
> Max Waves: 6011 -> 6110 (1.65 %)
> Wait states: 0 -> 0 (0.00 %)
>
> vkpipeline-db results RADV (VEGA):
>
> Totals from affected shaders:
> SGPRS: 14880 -> 15080 (1.34 %)
> VGPRS: 10872 -> 10888 (0.15 %)
> Spilled SGPRs: 0 -> 0 (0.00 %)
> Spilled VGPRs: 0 -> 0 (0.00 %)
> Private memory VGPRs: 0 -> 0 (0.00 %)
> Scratch size: 0 -> 0 (0.00 %) dwords per thread
> Code Size: 674016 -> 668396 (-0.83 %) bytes
> LDS: 0 -> 0 (0.00 %) blocks
> Max Waves: 2708 -> 2704 (-0.15 %)
> Wait states: 0 -> 0 (0.00 %
> ---
>  src/compiler/nir/nir_linking_helpers.c | 95 ++++++++++++++++++++++++++
>  1 file changed, 95 insertions(+)
>
> diff --git a/src/compiler/nir/nir_linking_helpers.c
> b/src/compiler/nir/nir_linking_helpers.c
> index 37644d339f..bdfa7b8c4d 100644
> --- a/src/compiler/nir/nir_linking_helpers.c
> +++ b/src/compiler/nir/nir_linking_helpers.c
> @@ -700,6 +700,27 @@ nir_link_xfb_varyings(nir_shader *producer,
> nir_shader *consumer)
>     }
>  }
>
> +static nir_variable *
> +get_matching_input_var(nir_shader *consumer, nir_variable *out_var)
> +{
> +   nir_variable *input_var = NULL;
> +   nir_foreach_variable(var, &consumer->inputs) {
> +      if (var->data.location >= VARYING_SLOT_VAR0 &&
> +          var->data.location - VARYING_SLOT_VAR0 < MAX_VARYING) {
>

Why is this not redundant with the checks below?  If it's not redundant,
I'd add a comment and make this one a continue because it's obviously
skipping over something.  Looking at patch 3 and the current code, I'm very
sure this isn't actually needed.  Maybe just assert that the above is true
of out_var->data.location and just let the check below handle it for var?


> +
> +         if (var->data.location == out_var->data.location &&
> +             var->data.location_frac == out_var->data.location_frac &&
> +             var->data.interpolation == out_var->data.interpolation &&
> +             get_interp_loc(var) == get_interp_loc(out_var)) {
> +            input_var = var;
>

return var; ?


> +            break;
> +         }
> +      }
> +   }
> +
> +   return input_var;
> +}
> +
>  static bool
>  can_replace_varying(nir_variable *out_var, nir_intrinsic_instr
> *store_intr)
>  {
> @@ -782,6 +803,57 @@ try_replace_constant_input(nir_shader *shader,
>     return progress;
>  }
>
> +static bool
> +try_replace_duplicate_input(nir_shader *shader, nir_variable *input_var,
> +                            nir_intrinsic_instr *dup_store_intr)
> +{
> +   assert(input_var);
> +
> +   nir_variable *dup_out_var =
> +
> nir_deref_instr_get_variable(nir_src_as_deref(dup_store_intr->src[0]));
> +
> +   if (!can_replace_varying(dup_out_var, dup_store_intr))
> +      return false;
>

This isn't much of a "try" function if it's just got one single check at
the top.


> +
> +   nir_function_impl *impl = nir_shader_get_entrypoint(shader);
> +
> +   nir_builder b;
> +   nir_builder_init(&b, impl);
> +
> +   bool progress = false;
> +   nir_foreach_block(block, impl) {
> +      nir_foreach_instr(instr, block) {
> +         if (instr->type != nir_instr_type_intrinsic)
> +            continue;
> +
> +         nir_intrinsic_instr *intr = nir_instr_as_intrinsic(instr);
> +         if (intr->intrinsic != nir_intrinsic_load_deref)
> +            continue;
> +
> +         nir_variable *in_var =
> +            nir_deref_instr_get_variable(nir_src_as_deref(intr->src[0]));
>

Please do something like this instead:

nir_deref_instr *in_deref = nir_src_as_deref(instr->src[0]);
if (in_deref->mode != nir_var_shader_in)
   continue;

nir_variable *in_var = nir_deref_instr_get_variable(in_deref);

By checking the mode on the deref instead of the variable, the pass will
work even in the presence of derefs which you can't chase back to the
variable as long as they aren't on an input.  For some of my UBO/SSBO deref
stuff, I've got patches that make exactly this change for about four other
linking passes.


> +
> +         if (in_var->data.mode != nir_var_shader_in)
> +            continue;
> +
> +         if (in_var->data.location != dup_out_var->data.location ||
> +             in_var->data.location_frac !=
> dup_out_var->data.location_frac ||
> +             in_var->data.interpolation != input_var->data.interpolation
> ||
> +             get_interp_loc(in_var) != get_interp_loc(input_var))
> +            continue;
> +
> +         b.cursor = nir_before_instr(instr);
> +
> +         nir_ssa_def *load = nir_load_var(&b, input_var);
> +         nir_ssa_def_rewrite_uses(&intr->dest.ssa, nir_src_for_ssa(load));
> +
> +         progress = true;
> +      }
> +   }
> +
> +   return progress;
> +}
> +
>  bool
>  nir_link_opt_varyings(nir_shader *producer, nir_shader *consumer)
>  {
> @@ -795,6 +867,10 @@ nir_link_opt_varyings(nir_shader *producer,
> nir_shader *consumer)
>
>     nir_function_impl *impl = nir_shader_get_entrypoint(producer);
>
> +   struct hash_table *varying_values =
> +      _mesa_hash_table_create(NULL,  _mesa_hash_pointer,
> +                              _mesa_key_pointer_equal);
> +
>     /* If we find a store in the last block of the producer we can be sure
> this
>      * is the only possible value for this output.
>      */
> @@ -809,11 +885,30 @@ nir_link_opt_varyings(nir_shader *producer,
> nir_shader *consumer)
>           continue;
>

In order to sort out the deref problems, I think you want something like
this here:

nir_deref_instr *out_deref = nir_src_as_deref(instr->src[0]);
if (out_deref->mode != nir_var_shader_out)
   continue;

I also can't help the feeling that this whole block below feels a bit
sloppy to me.  I think the whole mess would be greatly improved if, now
that we have can_replace_varying, we just did this here:

nir_variable *out_var = nir_deref_instr_get_variable(out_deref);
if (!can_replace_varying(out_var))
   continue;

Then we could get rid of the can_replace_varying call below;
replace_constant_input is no longer a "try" and replace_duplicate_input
doesn't have to be a "try" either.


>        if (intr->src[1].ssa->parent_instr->type !=
> nir_instr_type_load_const) {
> +         struct hash_entry *entry =
> +               _mesa_hash_table_search(varying_values, intr->src[1].ssa);
> +         if (entry) {
> +            progress |=
> +               try_replace_duplicate_input(consumer,
> +                                           (nir_variable *) entry->data,
> +                                           intr);
> +         } else {
> +            nir_variable *out_var =
> +
>  nir_deref_instr_get_variable(nir_src_as_deref(intr->src[0]));
>

This is also going to blow up if nir_deref_instr_get_variable returns
NULL.  See just above.


> +            nir_variable *in_var = get_matching_input_var(consumer,
> out_var);
> +            if (in_var && can_replace_varying(out_var, intr)) {
> +               _mesa_hash_table_insert(varying_values, intr->src[1].ssa,
> +                                       in_var);
> +            }
> +         }
> +
>           continue;
>        }
>
>        progress |= try_replace_constant_input(consumer, intr);
>

Assuming the can_replace_varying() continue above, I think this would be
much cleaner as:

if (nir_src_is_const(instr->src[1])) {
   replace_constant_input(consumer, intr);
   progress = true;
   continue;
}

entry = _mesa_hash_table_search(varying_values, intr->src[1].ssa);
if (entry) {
   replace_duplicate_input(consumer, entry->data, intr);
   progress = true;
   continue;
} else {
   nir_variable *in_var = get_matching_input_var(consumer, out_var);
   if (in_var)
      _mesa_hash_table_insert(varying_values, intr->src[1].ssa, in_var);
}

Is all this actually better or am I just making hash of things for no
reason?  I seriously did not intend to leave that many comments.  I'm
sorry. :-(

--Jason


>     }
>
> +   _mesa_hash_table_destroy(varying_values, NULL);
> +
>     return progress;
>  }
> --
> 2.19.2
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20181217/128d6f02/attachment-0001.html>


More information about the mesa-dev mailing list