[Mesa-dev] [PATCH 4/4] nir: detect more induction variables
Thomas Helland
thomashelland90 at gmail.com
Wed Nov 28 07:52:47 UTC 2018
Den ons. 28. nov. 2018 kl. 04:26 skrev Timothy Arceri <tarceri at itsqueeze.com>:
>
> This adds allows loop analysis to detect inductions varibales that
> are incremented in both branches of an if rather than in a main
> loop block. For example:
>
> loop {
> block block_1:
> /* preds: block_0 block_7 */
> vec1 32 ssa_8 = phi block_0: ssa_4, block_7: ssa_20
> vec1 32 ssa_9 = phi block_0: ssa_0, block_7: ssa_4
> vec1 32 ssa_10 = phi block_0: ssa_1, block_7: ssa_4
> vec1 32 ssa_11 = phi block_0: ssa_2, block_7: ssa_21
> vec1 32 ssa_12 = phi block_0: ssa_3, block_7: ssa_22
> vec4 32 ssa_13 = vec4 ssa_12, ssa_11, ssa_10, ssa_9
> vec1 32 ssa_14 = ige ssa_8, ssa_5
> /* succs: block_2 block_3 */
> if ssa_14 {
> block block_2:
> /* preds: block_1 */
> break
> /* succs: block_8 */
> } else {
> block block_3:
> /* preds: block_1 */
> /* succs: block_4 */
> }
> block block_4:
> /* preds: block_3 */
> vec1 32 ssa_15 = ilt ssa_6, ssa_8
> /* succs: block_5 block_6 */
> if ssa_15 {
> block block_5:
> /* preds: block_4 */
> vec1 32 ssa_16 = iadd ssa_8, ssa_7
> vec1 32 ssa_17 = load_const (0x3f800000 /* 1.000000*/)
> /* succs: block_7 */
> } else {
> block block_6:
> /* preds: block_4 */
> vec1 32 ssa_18 = iadd ssa_8, ssa_7
> vec1 32 ssa_19 = load_const (0x3f800000 /* 1.000000*/)
> /* succs: block_7 */
> }
> block block_7:
> /* preds: block_5 block_6 */
> vec1 32 ssa_20 = phi block_5: ssa_16, block_6: ssa_18
> vec1 32 ssa_21 = phi block_5: ssa_17, block_6: ssa_4
> vec1 32 ssa_22 = phi block_5: ssa_4, block_6: ssa_19
> /* succs: block_1 */
> }
>
> Unfortunatly GCM could move the addition out of the if for us
> (making this patch unrequired) but we still cannot enable the GCM
> pass without regressions.
>
Just some questions / suggestions from my side for now.
I'll try to take a closer look at the patch later today.
While GCM would be nice, to me it seems that adding an
if-opt instead, that pulls common code from both branches
of an if out of the if on a more general basis, would get us
this, plus a bunch of other benefits? As far as I can see there
should never be negative impacts from pulling common code
out like that, but I might be wrong. Did you look into that?
I bet out did, I'm just interested in how that worked out.
Since GCM is not yet where we want it to be, maybe we'd
want to implement LICM? That obviously does not come
into play with what this patch adresses, but it might help
get a more accurate estimate of the cost/benefit of unrolling?
(Invariant computations that will be CSE'd will not be
counted multiple times). This might already be accounted
for by counting the invariant computations only once?
> This unrolls a loop in Rise of The Tomb Raider.
>
> vkpipeline-db results (VEGA):
>
> Totals from affected shaders:
> SGPRS: 88 -> 96 (9.09 %)
> VGPRS: 56 -> 52 (-7.14 %)
> Spilled SGPRs: 0 -> 0 (0.00 %)
> Spilled VGPRs: 0 -> 0 (0.00 %)
> Private memory VGPRs: 0 -> 0 (0.00 %)
> Scratch size: 0 -> 0 (0.00 %) dwords per thread
> Code Size: 2168 -> 4560 (110.33 %) bytes
> LDS: 0 -> 0 (0.00 %) blocks
> Max Waves: 4 -> 4 (0.00 %)
> Wait states: 0 -> 0 (0.00 %)
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32211
> ---
> src/compiler/nir/nir_loop_analyze.c | 36 +++++++++++++++++++++++++++++
> 1 file changed, 36 insertions(+)
>
> diff --git a/src/compiler/nir/nir_loop_analyze.c b/src/compiler/nir/nir_loop_analyze.c
> index 8903e15105..cf97d6bf06 100644
> --- a/src/compiler/nir/nir_loop_analyze.c
> +++ b/src/compiler/nir/nir_loop_analyze.c
> @@ -245,6 +245,42 @@ compute_induction_information(loop_info_state *state)
> if (src_var->in_if_branch || src_var->in_nested_loop)
> break;
>
> + /* Detect inductions varibales that are incremented in both branches
> + * of an unnested if rather than in a loop block.
> + */
> + if (is_var_phi(src_var)) {
> + nir_phi_instr *src_phi =
> + nir_instr_as_phi(src_var->def->parent_instr);
> +
> + nir_op alu_op;
> + nir_ssa_def *alu_srcs[2] = {0};
> + nir_foreach_phi_src(src2, src_phi) {
> + nir_loop_variable *src_var2 =
> + get_loop_var(src2->src.ssa, state);
> +
> + if (!src_var2->in_if_branch || !is_var_alu(src_var2))
> + break;
> +
> + nir_alu_instr *alu =
> + nir_instr_as_alu(src_var2->def->parent_instr);
> + if (nir_op_infos[alu->op].num_inputs != 2)
> + break;
> +
> + if (alu->src[0].src.ssa == alu_srcs[0] &&
> + alu->src[1].src.ssa == alu_srcs[1] &&
> + alu->op == alu_op) {
> + /* Both branches perform the same calculation so we can use
> + * one of them to find the induction variable.
> + */
> + src_var = src_var2;
> + } else {
> + alu_srcs[0] = alu->src[0].src.ssa;
> + alu_srcs[1] = alu->src[1].src.ssa;
> + alu_op = alu->op;
> + }
> + }
> + }
> +
> if (!src_var->in_loop) {
> biv->def_outside_loop = src_var;
> } else if (is_var_alu(src_var)) {
> --
> 2.19.1
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
More information about the mesa-dev
mailing list