[Mesa-dev] [PATCH 5/7] i965: Move load_interpolated_input/barycentric_* intrinsics to the top.

Jason Ekstrand jason at jlekstrand.net
Tue Jul 19 20:12:51 UTC 2016


On Mon, Jul 18, 2016 at 1:26 PM, Kenneth Graunke <kenneth at whitecape.org>
wrote:

> Currently, i965 interpolates all FS inputs at the top of the program.
> This has advantages and disadvantages, but I'd like to keep that policy
> while reworking this code.  We can consider changing it independently.
>
> The next patch will make the compiler generate PLN instructions "on the
> fly", when it encounters an input load intrinsic, rather than doing it
> for all inputs at the start of the program.
>
> To emulate this behavior, we introduce an ugly pass to move all NIR
> load_interpolated_input and payload-based (not interpolator message)
> load_barycentric_* intrinsics to the shader's start block.
>
> This helps avoid regressions in shader-db for cases such as:
>
>    if (...) {
>       ...load some input...
>    } else {
>       ...load that same input...
>    }
>
> which CSE can't handle, because there's no dominance relationship
> between the two loads.  Because the start block dominates all others,
> we can CSE all inputs and emit PLNs exactly once, as we did before.
>
> Ideally, global value numbering would eliminate these redundant loads,
> while not forcing them all the way to the start block.  When that lands,
> we should consider dropping this hacky pass.
>

Ugh... You're probably right that we need to do this but it's ugly...  I
look forward to deleting this code :)


> Again, this pass currently does nothing, as i965 doesn't generate these
> intrinsics yet.  But it will shortly, and I figured I'd separate this
> code as it's relatively self-contained.
>
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 78
> ++++++++++++++++++++++++++++++++++++
>  1 file changed, 78 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index ea6616b..94127bc 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -6400,6 +6400,83 @@ computed_depth_mode(const nir_shader *shader)
>  }
>
>  /**
> + * Move load_interpolated_input with simple (payload-based) barycentric
> modes
> + * to the top of the program so we don't emit multiple PLNs for the same
> input.
> + *
> + * This works around CSE not being able to handle non-dominating cases
> + * such as:
> + *
> + *    if (...) {
> + *       interpolate input
> + *    } else {
> + *       interpolate the same exact input
> + *    }
> + *
> + * This should be replaced by global value numbering someday.
> + */
> +void
> +move_interpolation_to_top(nir_shader *nir)
> +{
> +   nir_foreach_function(f, nir) {
> +      if (!f->impl)
> +         continue;
> +
> +      nir_builder b;
> +      nir_builder_init(&b, f->impl);
> +      b.cursor = nir_before_block(nir_start_block(f->impl));
> +
> +      nir_foreach_block(block, f->impl) {
> +         nir_foreach_instr_safe(instr, block) {
> +            if (instr->type != nir_instr_type_intrinsic)
> +               continue;
> +
> +            nir_intrinsic_instr *load = nir_instr_as_intrinsic(instr);
> +            if (load->intrinsic != nir_intrinsic_load_interpolated_input)
> +               continue;
> +
> +            nir_intrinsic_instr *bary =
> +               nir_instr_as_intrinsic(load->src[0].ssa->parent_instr);
> +
> +            /* Leave interpolateAtSample/Offset() where it is. */
> +            if (bary->intrinsic ==
> nir_intrinsic_load_barycentric_at_sample ||
> +                bary->intrinsic ==
> nir_intrinsic_load_barycentric_at_offset)
> +               continue;
> +
> +            /* Make a new load_barycentric_* intrinsic at the top */
> +            nir_ssa_def *top_bary =
> +               nir_load_barycentric(&b, bary->intrinsic,
> +                                    nir_intrinsic_interp_mode(bary));
> +
> +            /* Make a new load_intrinsic_input at the top */
> +            nir_intrinsic_instr *top_load =
> nir_intrinsic_instr_create(nir,
> +               nir_intrinsic_load_interpolated_input);
> +            top_load->num_components = load->num_components;
> +            top_load->src[0] = nir_src_for_ssa(top_bary);
> +            /* We don't support indirects today - otherwise we might not
> +             * be able to move this to the top. add_const_offset_to_base
> +             * guarantees the offset will be 0.
> +             */
> +            assert(nir_src_as_const_value(load->src[1]) &&
> +                   nir_src_as_const_value(load->src[1])->u32[0] == 0);
> +            top_load->src[1] = nir_src_for_ssa(nir_imm_int(&b, 0));
> +            top_load->const_index[0] = load->const_index[0];
> +            top_load->const_index[1] = load->const_index[1];
> +            nir_ssa_dest_init(&top_load->instr, &top_load->dest,
> +                              load->dest.ssa.num_components,
> +                              load->dest.ssa.bit_size, NULL);
> +
> +            nir_ssa_def_rewrite_uses(&load->dest.ssa,
> +
>  nir_src_for_ssa(&top_load->dest.ssa));
> +            nir_builder_instr_insert(&b, &top_load->instr);
>

You're doing way more work than you need to here.  Just save off the start
block at the top of the pass and do exec_node_remove and exec_list_insert
to put it in the top block.

--Jason


> +         }
> +      }
> +      nir_metadata_preserve(f->impl, (nir_metadata)
> +                            ((unsigned) nir_metadata_block_index |
> +                             (unsigned) nir_metadata_dominance));
> +   }
> +}
> +
> +/**
>   * Apply default interpolation settings to FS inputs which don't specify
> any.
>   */
>  static void
> @@ -6506,6 +6583,7 @@ brw_compile_fs(const struct brw_compiler *compiler,
> void *log_data,
>     brw_nir_lower_fs_outputs(shader);
>     if (!key->multisample_fbo)
>        NIR_PASS_V(shader, demote_sample_qualifiers);
> +   NIR_PASS_V(shader, move_interpolation_to_top);
>     shader = brw_postprocess_nir(shader, compiler->devinfo, true);
>
>     /* key->alpha_test_func means simulating alpha testing via discards,
> --
> 2.9.0
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160719/7b97fde7/attachment.html>


More information about the mesa-dev mailing list