[Mesa-dev] [PATCH 08/10] nir: add a loop unrolling pass

Timothy Arceri t_arceri at yahoo.com.au
Sat Dec 10 04:54:03 UTC 2016


On Fri, 2016-12-09 at 17:23 -0800, Jason Ekstrand wrote:
> Wow!  This is way better than the last time I read through it.  Good
> work!
> 
> Overall, I'm much happier with the code now.  The structure is
> better, some of the crazy phi logic is gone, and clone_cf_list is
> helping a lot.  That said... I still have a pile of comments.  Most
> of them are cosmetic, one is a bug, and one is a suggestion for how
> to make things simpler.  I think fixing some of the cosmetic stuff,
> especially in unroll_complex, will help with readability a lot.
> 
> The suggestion, which I'd like to highlight here, was to take
> advantage of nir_repair_ssa and see if we can git rid of a bunch of
> the phi node pain from unroll_complex.  Most of the pain in that part
> of the pass appears to be in trying to deal with the phi nodes at the
> end of the pass and put everything back where it belongs.  The
> repair_ssa pass (which I think I wrote after you started on this
> project) is a tiny pass that takes a shader that is in "mostly" ssa
> form and makes it into "proper" NIR ssa.  The only requirement on the
> shader is tha the sources of the phi nodes point to the correct SSA
> values.  Dominance doesn't have to hold and you don't have to have
> the whole ladder of phi nodes so long as the data flow is correct.  I
> *think* based on my sketchy reading of things, that we should be able
> to just apply the final remap table to each of the phis after the
> loop and then just run repair_ssa on the shader once we're done
> unrolling.
> 
> On Mon, Dec 5, 2016 at 5:12 PM, Timothy Arceri <timothy.arceri at collab
> ora.com> wrote:
> > V2:
> > - tidy ups suggested by Connor.
> > - tidy up cloning logic and handle copy propagation
> >  based of suggestion by Connor.
> > - use nir_ssa_def_rewrite_uses to fix up lcssa phis
> >   suggested by Connor.
> > - add support for complex loop unrolling (two terminators)
> > - handle case were the ssa defs use outside the loop is already a
> > phi
> > - support unrolling loops with multiple terminators when trip count
> >   is know for each terminator
> > 
> > V3:
> > - set correct num_components when creating phi in complex unroll
> > - rewrite update remap table based on Jasons suggestions.
> > - remove unrequired extract_loop_body() helper as suggested by
> > Jason.
> > - simplify the lcssa phi fix up code for simple loops as per Jasons
> > suggestions.
> > - use mem context to keep track of hash table memory as suggested
> > by Jason.
> > - move is_{complex,simple}_loop helpers to the unroll code
> > - require nir_metadata_block_index
> > - partially rewrote complex unroll to be simpler and easier to
> > follow.
> > 
> > V4:
> > - use rzalloc() when creating nir_phi_src but not setting pred
> > right away
> >  fixes regression cause by ralloc() no longer zeroing memory.
> > ---
> >  src/compiler/Makefile.sources          |   1 +
> >  src/compiler/nir/nir.h                 |   2 +
> >  src/compiler/nir/nir_opt_loop_unroll.c | 729
> > +++++++++++++++++++++++++++++++++
> >  3 files changed, 732 insertions(+)
> >  create mode 100644 src/compiler/nir/nir_opt_loop_unroll.c
> > 
> > diff --git a/src/compiler/Makefile.sources
> > b/src/compiler/Makefile.sources
> > index d3e158a..799fb38 100644
> > --- a/src/compiler/Makefile.sources
> > +++ b/src/compiler/Makefile.sources
> > @@ -238,6 +238,7 @@ NIR_FILES = \
> >         nir/nir_opt_dead_cf.c \
> >         nir/nir_opt_gcm.c \
> >         nir/nir_opt_global_to_local.c \
> > +       nir/nir_opt_loop_unroll.c \
> >         nir/nir_opt_peephole_select.c \
> >         nir/nir_opt_remove_phis.c \
> >         nir/nir_opt_undef.c \
> > diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> > index d948e97..b8813e4 100644
> > --- a/src/compiler/nir/nir.h
> > +++ b/src/compiler/nir/nir.h
> > @@ -2573,6 +2573,8 @@ bool nir_opt_dead_cf(nir_shader *shader);
> > 
> >  bool nir_opt_gcm(nir_shader *shader, bool value_number);
> > 
> > +bool nir_opt_loop_unroll(nir_shader *shader, nir_variable_mode
> > indirect_mask);
> > +
> >  bool nir_opt_peephole_select(nir_shader *shader, unsigned limit);
> > 
> >  bool nir_opt_remove_phis(nir_shader *shader);
> > diff --git a/src/compiler/nir/nir_opt_loop_unroll.c
> > b/src/compiler/nir/nir_opt_loop_unroll.c
> > new file mode 100644
> > index 0000000..8715757
> > --- /dev/null
> > +++ b/src/compiler/nir/nir_opt_loop_unroll.c
> > @@ -0,0 +1,729 @@
> > +/*
> > + * Copyright © 2016 Intel Corporation
> > + *
> > + * Permission is hereby granted, free of charge, to any person
> > obtaining a
> > + * copy of this software and associated documentation files (the
> > "Software"),
> > + * to deal in the Software without restriction, including without
> > limitation
> > + * the rights to use, copy, modify, merge, publish, distribute,
> > sublicense,
> > + * and/or sell copies of the Software, and to permit persons to
> > whom the
> > + * Software is furnished to do so, subject to the following
> > conditions:
> > + *
> > + * The above copyright notice and this permission notice
> > (including the next
> > + * paragraph) shall be included in all copies or substantial
> > portions of the
> > + * Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> > EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> > MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO
> > EVENT SHALL
> > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
> > DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> > ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> > OTHER
> > + * DEALINGS IN THE SOFTWARE.
> > + */
> > +
> > +#include "nir.h"
> > +#include "nir_builder.h"
> > +#include "nir_control_flow.h"
> > +
> > +static bool
> > +is_loop_small_enough_to_unroll(nir_shader *shader, nir_loop_info
> > *li)
> > +{
> > +   unsigned max_iter = shader->options->max_unroll_iterations;
> > +
> > +   if (li->trip_count > max_iter)
> > +      return false;
> > +
> > +   if (li->force_unroll)
> > +      return true;
> > +
> > +   bool loop_not_too_large =
> > +      li->num_instructions * li->trip_count <= max_iter * 25;
> > +
> > +   return loop_not_too_large;
> > +}
> > +
> > +static bool
> > +is_complex_loop(nir_shader *shader, nir_loop_info *li)
> 
> is_complex_loop is kind-of a lie here since it also decides whether
> or not to unroll...
>  
> > +{
> > +   unsigned num_lt = list_length(&li->loop_terminator_list);
> > +   return is_loop_small_enough_to_unroll(shader, li) && num_lt ==
> > 2;
> > +}
> > +
> > +static bool
> > +is_simple_loop(nir_shader *shader, nir_loop_info *li)
> 
> Same here.
>  
> > +{
> > +   return li->is_trip_count_known &&
> > +          is_loop_small_enough_to_unroll(shader, li);
> > +}
> > +
> > +static void
> > +move_cf_list_into_if(nir_cf_list *lst, nir_cf_node *if_node,
> > +                     nir_block *last_blk, bool
> > continue_from_then_branch)
> > +{
> > +   nir_if *if_stmt = nir_cf_node_as_if(if_node);
> > +   if (continue_from_then_branch) {
> > +      /* Move the rest of the loop inside the then */
> > +      nir_cf_reinsert(lst,
> > nir_after_block(nir_if_last_then_block(if_stmt))); 
> > +   } else {
> > +      /* Move the rest of the loop inside the else */
> > +      nir_cf_reinsert(lst,
> > nir_after_block(nir_if_last_else_block(if_stmt)));
> > +   }
> > +
> > +   /* Remove the break */
> > +   nir_instr_remove(nir_block_last_instr(last_blk));
> > +}
> > +
> > +static void
> > +create_remap_tables(void *mem_ctx, nir_loop *loop, nir_block
> > *loop_header_blk,
> > +                    struct hash_table **remap_table,
> > +                    struct hash_table **phi_remap)
> > +{
> > +   *remap_table = _mesa_hash_table_create(mem_ctx,
> > _mesa_hash_pointer,
> > +                                         
> > _mesa_key_pointer_equal);
> > +   *phi_remap = _mesa_hash_table_create(mem_ctx,
> > _mesa_hash_pointer,
> > +                                        _mesa_key_pointer_equal);
> > +
> > +   /* Build table to be used for remapping as we unroll. */
> > +   nir_foreach_instr(instr, loop_header_blk) {
> > +      if (instr->type != nir_instr_type_phi)
> > +         break;
> > +
> > +      nir_phi_instr *phi = nir_instr_as_phi(instr);
> > +      nir_foreach_phi_src(src, phi) {
> > +         /* Add src to remap table if it's pred is from before the
> > loop, these
> > +          * srcs will be used in the first iteration of cloning
> > the loop.
> > +          */
> > +         if (src->pred->index < phi->instr.block->index) {
> > +            _mesa_hash_table_insert(*remap_table, &phi->dest.ssa,
> > +                                    src->src.ssa);
> > +         }
> > +      }
> > +   }
> > +}
> > +
> > +static void
> > +update_remap_table(nir_loop *loop, struct hash_table *remap_table,
> > +                   struct hash_table *phi_remap)
> > +{
> > +   nir_block *header_block = nir_loop_first_block(loop);
> > +   nir_foreach_instr(instr, header_block) {
> > +
> 
> Unneeded newline
>  
> > +      if (instr->type != nir_instr_type_phi)
> > +         break;
> > +
> > +      nir_phi_instr *phi = nir_instr_as_phi(instr);
> > +      nir_foreach_phi_src(src, phi) {
> > +         if (src->pred->index < phi->instr.block->index)
> > +            continue;
> > +
> > +         struct hash_entry *remap_he =
> > +            _mesa_hash_table_search(remap_table, src->src.ssa);
> > +         if (remap_he) {
> > +            _mesa_hash_table_insert(phi_remap, &phi->dest.ssa,
> > remap_he->data);
> > +         } else {
> > +            _mesa_hash_table_insert(phi_remap, &phi->dest.ssa,
> > src->src.ssa);
> 
> I wonder if this case is ever anything other than a no-op...
>  
> > +         }
> > +      }
> > +   }
> > +
> > +   struct hash_entry *phi_entry;
> > +   hash_table_foreach(phi_remap, phi_entry)
> > +      _mesa_hash_table_insert(remap_table, phi_entry->key,
> > phi_entry->data);
> > +}
> > +
> > +static void
> > +insert_phi_and_set_block_on_uses(nir_builder *b, nir_phi_instr
> > *phi_instr)
> > +{
> > +   nir_instr_insert(b->cursor, &phi_instr->instr);
> > +
> > +   /* Now that we have inserted the phi fix up the block for its
> > uses. */
> > +   nir_foreach_use_safe(use_src, &phi_instr->dest.ssa) {
> > +      nir_phi_instr *use_phi = nir_instr_as_phi(use_src-
> > >parent_instr);
> > +
> > +      foreach_list_typed(nir_phi_src, src, node, &use_phi->srcs) {
> 
> We have a nir_foreach_phi_src helper you could use here.
>  
> > +         if (!src->pred)
> > +            src->pred = phi_instr->dest.ssa.parent_instr->block;
> > +      }
> > +   }
> > +}
> > +
> > +static nir_phi_instr *
> > +create_complex_unroll_phi(nir_shader *ns, nir_phi_instr
> > *prev_phi_instr)
> > +{
> > +   nir_phi_instr *new_phi = nir_phi_instr_create(ns);
> > +   nir_ssa_dest_init(&new_phi->instr, &new_phi->dest,
> > +                     prev_phi_instr->dest.ssa.num_components,
> > +                     prev_phi_instr->dest.ssa.bit_size, NULL);
> > +
> > +   /* Add the new phi as a src to the phi from the previous
> > iteration */
> > +   nir_phi_src *new_src = rzalloc(prev_phi_instr, nir_phi_src);
> > +   new_src->src = nir_src_for_ssa(&new_phi->dest.ssa);
> > +   new_src->src.parent_instr = &prev_phi_instr->instr;
> > +   exec_list_push_tail(&prev_phi_instr->srcs, &new_src->node);
> > +   list_addtail(&new_src->src.use_link, &new_src->src.ssa->uses);
> > +
> > +   return new_phi;
> > +}
> > +
> > +static void
> > +add_complex_unroll_phi_src(nir_ssa_def *phi_src, nir_phi_instr
> > *phi_instr,
> > +                           struct hash_table *remap_table,
> > nir_block *blk)
> > +{
> > +   struct hash_entry *hte =
> > +      _mesa_hash_table_search(remap_table, phi_src);
> > +
> > +   nir_phi_src *new_src = ralloc(phi_instr, nir_phi_src);
> > +   nir_ssa_def *ssa_def = hte ? (nir_ssa_def *) hte->data :
> > phi_src;
> > +   new_src->pred = blk;
> > +   new_src->src = nir_src_for_ssa(ssa_def);
> > +   new_src->src.parent_instr = &phi_instr->instr;
> > +   list_addtail(&new_src->src.use_link, &new_src->src.ssa->uses);
> > +
> > +   exec_list_push_tail(&phi_instr->srcs, &new_src->node);
> > +}
> > +
> > +static void
> > +simple_loop_fix_lcssa_phis(nir_cf_node *loop, struct hash_table
> > *remap_table)
> > +{
> > +   nir_block *prev_block =
> > nir_cf_node_as_block(nir_cf_node_prev(loop));
> > +   nir_cf_node *cf_node = nir_cf_node_next(loop);
> > +   assert(cf_node->type == nir_cf_node_block);
> > +
> > +   nir_block *block = nir_cf_node_as_block(cf_node);
> > +   nir_foreach_instr_safe(instr, block) {
> > +      if (instr->type != nir_instr_type_phi) {
> > +         /* There should be no more phis */
> > +         break;
> > +      }
> 
> I don't think you really need the comment here.  We do enough "if
> (!is_phi) break" that readers should be used to it.
>  
> > +
> > +      /* Simple loops should only ever have a single exit point so
> > we can
> > +       * simply rewrite the phis uses to whatever is in the remap
> > table or
> > +       * the src before the loop (if it had 0 iterations) and be
> > done.
> > +       */
> > +      nir_phi_instr *phi = nir_instr_as_phi(instr);
> 
> Should we assert phi->is_lcssa here?  Or maybe list_is_singular(phi-
> >srcs)?
>  
> > +      nir_foreach_phi_src(src, phi) {
> > +         /* Update predecessor */
> > +         src->pred = prev_block;
> > +
> > +         struct hash_entry *hte =
> > +            _mesa_hash_table_search(remap_table, src->src.ssa);
> > +         if (hte) {
> > +            nir_ssa_def_rewrite_uses(&phi->dest.ssa,
> > +                                     nir_src_for_ssa((nir_ssa_def
> > *) hte->data));
> > +         } else {
> > +            nir_ssa_def_rewrite_uses(&phi->dest.ssa, src->src);
> > +         }
> > +      }
> > +      assert(list_empty(&phi->dest.ssa.uses));
> > +   }
> > +}
> > +
> > +static bool
> > +ends_in_break(nir_block *block)
> > +{
> > +   if (exec_list_is_empty(&block->instr_list))
> > +      return false;
> > +
> > +   nir_instr *instr = nir_block_last_instr(block);
> > +   return instr->type == nir_instr_type_jump &&
> > +      nir_instr_as_jump(instr)->type == nir_jump_break;
> > +}
> 
> It would make sense to move this below simple_unroll since it's not
> used for simple loops and it's siting between simple_unroll and
> simple_loop_fix_lcssa_phis
>  
> > +
> > +/**
> > + * Unroll a loop which does not contain any jumps.  For example,
> > if the input
> > + * is:
> > + *
> > + *     (loop (...) ...instrs...)
> > + *
> > + * And the iteration count is 3, the output will be:
> > + *
> > + *     ...instrs... ...instrs... ...instrs...
> > + */
> > +static void
> > +simple_unroll(nir_loop *loop, nir_builder *b)
> > +{
> > +   void *mem_ctx = ralloc_context(NULL);
> > +
> > +   nir_convert_loop_to_lcssa(loop);
> > +
> > +   /* Get the loop header this contains a bunch of phis and the
> > loops
> > +    * conditional.
> > +    */
> > +   nir_block *header_blk = nir_loop_first_block(loop);
> > +
> > +   struct hash_table *remap_table;
> > +   struct hash_table *phi_remap;
> > +   create_remap_tables(mem_ctx, loop, header_blk, &remap_table,
> > &phi_remap);
> > +
> > +   /* Skip over loop terminator and get the loop body. */
> > +   nir_cf_node *if_node = &loop->info->limiting_terminator->nif-
> > >cf_node;
> > +   list_for_each_entry(nir_loop_terminator, terminator,
> > +                       &loop->info->loop_terminator_list,
> > loop_terminator_link) {
> > +       nir_cf_node *loop_node = &terminator->nif->cf_node;
> 
> loop_node is an odd name here.  It makes me think it's &loop-
> >cf_node.  How about term_node?  Or, for that matter, it looks like
> what you realyly want to compare below is the if's and not the
> cf_nodes' so maybe term_if?
>  
> > +
> > +      /* Remove all but the limiting terminator as we know the
> > other exit
> > +       * conditions can never be met.
> > +       */
> > +      if (loop_node != &loop->info->limiting_terminator->nif-
> > >cf_node) {
> > +         nir_cf_node_remove(loop_node);
> > +      }
> > +   }
> > +
> 
> It might be worth putting a comment in here about how the analysis
> pass bails on anything other than simple "if (foo) break"
> conditions.  If it's pratical to do so, I'd kind-of like to see an
> assert.
>  
> > +   /* Pluck out the loop header */
> > +   nir_cf_list lp_header;
> > +   nir_cf_extract(&lp_header, nir_before_block(header_blk),
> > +                  nir_before_cf_node(if_node));
> > +
> > +   /* Pluck out the loop body */
> > +   nir_cf_list loop_body;
> > +   nir_cf_extract(&loop_body, nir_after_cf_node(if_node),
> > +                  nir_after_block(nir_loop_last_block(loop)));
> > +
> > +   /* Clone the loop header */
> > +   nir_cf_list cloned_header;
> > +   nir_cf_list_clone(&cloned_header, &lp_header, loop-
> > >cf_node.parent,
> > +                     remap_table);
> > +
> > +   /* Insert cloned loop header before the loop */
> > +   b->cursor = nir_before_cf_node(&loop->cf_node);
> > +   nir_cf_reinsert(&cloned_header, b->cursor);
> > +
> > +   /* Temp list to store the cloned loop body as we unroll */
> > +   nir_cf_list unrolled_lp_body;
> > +
> > +   /* Clone loop header and append to the loop body */
> > +   for (unsigned i = 0; i < loop->info->trip_count; i++) {
> > +      /* Clone loop body */
> > +      nir_cf_list_clone(&unrolled_lp_body, &loop_body, loop-
> > >cf_node.parent,
> > +                        remap_table);
> > +
> > +      update_remap_table(loop, remap_table, phi_remap);
> > +
> > +      /* Insert unrolled loop body before the loop */
> > +      b->cursor = nir_before_cf_node(&loop->cf_node);
> > +      nir_cf_reinsert(&unrolled_lp_body, b->cursor);
> > +
> > +      /* Clone loop header */
> > +      nir_cf_list_clone(&cloned_header, &lp_header, loop-
> > >cf_node.parent,
> > +                        remap_table);
> > +
> > +      /* Insert loop header after loop body */
> > +      b->cursor = nir_before_cf_node(&loop->cf_node);
> > +      nir_cf_reinsert(&cloned_header, b->cursor);
> > +   }
> > +
> > +   /* The loop has been unrolled so remove it. */
> > +   simple_loop_fix_lcssa_phis(&loop->cf_node, remap_table);
> > +
> > +   /* Remove the loop */
> > +   nir_cf_node_remove(&loop->cf_node);
> > +
> > +   /* Delete the original loop body & header */
> > +   nir_cf_delete(&lp_header);
> > +   nir_cf_delete(&loop_body);
> > +
> > +   ralloc_free(mem_ctx);
> > +}
> > +
> > +typedef struct {
> > +   /* This field is updated at each interation as we unroll. */
> > +   nir_phi_instr *phi;
> > +
> > +   /* This is the phi src of the loop terminator with an unknown
> > trip count */
> > +   nir_ssa_def *phi_src_def;
> > +
> > +   /* These fields are used to fix things up after we have
> > unrolled. */
> > +   nir_phi_instr *old_loop_phi;
> > +   nir_ssa_def *outermost_phi_def;
> > +   nir_ssa_def *innermost_phi_src_def;
> > +
> > +   struct list_head list_link;
> > +} loop_term_phi_info;
> 
> You seem to be going through a lot of pain to keep track of those
> pesky phi nodes after the loop.  I think this may be a case where we
> just want to let them be what they are and then run nir_repair_ssa
> after loop unrolling.
>  
> > +
> > +/**
> > + * Unroll a loop with two exists when the trip count of one of the
> > exits is
> > + * unknown.  If continue_from_then_branch is true, the loop is
> > repeated only
> > + * when the "then" branch of the if is taken; otherwise it is
> > repeated only
> > + * when the "else" branch of the if is taken.
> > + *
> > + * For example, if the input is:
> > + *
> > + *     (loop (...)
> > + *      ...body...
> > + *      (if (cond)
> > + *          (...then_instrs...)
> > + *        (...else_instrs...)))
> > + *
> > + * And the iteration count is 3, and \c continue_from_then_branch
> > is true,
> > + * then the output will be:
> > + *
> > + *     ...body...
> > + *     (if (cond)
> > + *         (...then_instrs...
> > + *          ...body...
> > + *          (if (cond)
> > + *              (...then_instrs...
> > + *               ...body...
> > + *               (if (cond)
> > + *                   (...then_instrs...)
> > + *                 (...else_instrs...)))
> > + *            (...else_instrs...)))
> > + *       (...else_instrs))
> 
> Mind writing this in nir_print style?
>  
> > + */
> > +static void
> > +complex_unroll(nir_function *fn, nir_loop *loop, nir_builder *b,
> > +               nir_cf_node *if_node, nir_block *last_blk,
> 
> Please replace if_node here with a nir_if that's called unknown_if or
> unlimit_if or something like that.
>  
> > +               bool continue_from_then_branch, bool
> > limiting_term_second)
> > +{
> > +   void *mem_ctx = ralloc_context(NULL);
> > +
> > +   nir_convert_loop_to_lcssa(loop);
> > +
> > +   nir_shader *ns = fn->shader;
> > +   nir_cf_node *limiting_trm = &loop->info->limiting_terminator-
> > >nif->cf_node;
> 
> And make this a nir_if *limit_if
> 
> That way you can keep track of what is what as you read the code. 
> This is code hard enough to dig through without having to chase
> variable names around.
>  
> > +   nir_block *header_blk = nir_loop_first_block(loop);
> > +
> > +   struct hash_table *remap_table;
> > +   struct hash_table *phi_remap;
> > +   create_remap_tables(mem_ctx, loop, header_blk, &remap_table,
> > &phi_remap);
> > +
> > +   struct hash_table *final_term_phis =
> > +      _mesa_hash_table_create(mem_ctx, _mesa_hash_pointer,
> > +                              _mesa_key_pointer_equal);
> > +
> > +   struct list_head loop_term_phis;
> > +   list_inithead(&loop_term_phis);
> > +
> > +   /* Build a list of phis */
> > +   nir_if *loop_term_if = loop->info->limiting_terminator->nif;
> > +   nir_cf_node *after_loop = nir_cf_node_next(&loop->cf_node);
> > +   nir_foreach_instr(instr, nir_cf_node_as_block(after_loop)) {
> > +      if (instr->type != nir_instr_type_phi)
> > +         break;
> > +
> > +      nir_phi_instr *phi = nir_instr_as_phi(instr);
> > +      nir_foreach_phi_src(src, phi) {
> > +         nir_block *then_blk =
> > nir_if_last_then_block(loop_term_if);
> > +         nir_block *else_blk =
> > nir_if_last_else_block(loop_term_if);
> > +
> > +         if (src->pred == then_blk || src->pred == else_blk) {
> > +            _mesa_hash_table_insert(final_term_phis, phi, src-
> > >src.ssa);
> > +         } else {
> > +            /* Add to list of phis with an unknown iteration count
> > */
> > +            loop_term_phi_info *ltpi = ralloc(mem_ctx,
> > loop_term_phi_info);
> > +            ltpi->phi = phi;
> > +            ltpi->old_loop_phi = phi;
> > +            ltpi->phi_src_def = src->src.ssa;
> > +            list_addtail(&ltpi->list_link, &loop_term_phis);
> > +         }
> > +      }
> > +   }
> > +
> > +   if (limiting_term_second) {
> > +      /* We need some special handling when its the second
> > terminator causing
> > +       * us to exit the loop for example:
> > +       *
> > +       *   for (int i = 0; i < uniform_lp_count; i++) {
> > +       *      colour = vec4(0.0, 1.0, 0.0, 1.0);
> > +       *
> > +       *      if (i == 1)
> 
> Missing a "{" to match the "}" below
>  
> > +       *         break;
> > +       *      }
> > +       *      ... any further code is unreachable after i == 1 ...
> > +       *   }
> > +       *
> > +       * Bump the trip count by one so we actually clone
> > something. Also
> > +       * extract everything after the limiting terminator and
> > insert it into
> > +       * the branch we will continue from.
> > +       */
> > +      loop->info->trip_count++;
> 
> I get the rest of this but not why we're bumping the trip count.  I
> mean, the way that complex unrolling works, bumping the trip count
> doesn't hurt but I don't see what it's gaining us either.

Because the trip count is the number of times we pass over the extire
loop before hitting a break. When its the second terminator that exits
we can actually execute code inside the loop when trip count == 0 e.g.
the code above the break so we need to bump the trip_count in order for
the code below to clone anything. When trip count == 1 we execute the
code above the loop twice and the code below it once. So we need trip
count to be bumped to clone things twice.

Maybe I should use a temp here "num_times_to_clone" ?

>  
> > +
> > +      nir_cf_list after_lt;
> > +      nir_cf_extract(&after_lt, nir_after_cf_node(limiting_trm),
> > +                     nir_after_block(nir_loop_last_block(loop)));
> 
> Are we guaranteed that the ifs aren't nested?
>  
> > +
> > +      nir_if *if_stmt = loop->info->limiting_terminator->nif;
> 
> No.  We now have if_stmt and if_node and they point to two completely
> different ifs.  Let's name things better. :-)
>  
> > +      nir_block *last_then = nir_if_last_then_block(if_stmt);
> > +      if (ends_in_break(last_then)) {
> > +         move_cf_list_into_if(&after_lt, limiting_trm, last_then,
> > false);
> > +      } else {
> > +         nir_block *last_else = nir_if_last_else_block(if_stmt);
> > +         if (ends_in_break(last_else)) {
> > +            move_cf_list_into_if(&after_lt, limiting_trm,
> > last_else, true);
> > +         }
> 
> And if neither of them contains the break we throw it away???  I
> think something needs to be asserted here.

Sure the if (ends_in_break(last_else)) should just be an assert() will
change.

>  
> > +      }
> > +   } else {
> > +      /* Remove the limiting terminator.  Loop analysis will only
> > find a
> > +       * terminator for trival if statments (then only contains
> > break, else
> > +       * is empty) so its safe to remove the whole thing.
> > +       */
> > +      nir_cf_node_remove(limiting_trm);
> > +   }
> > +
> > +   list_for_each_entry(loop_term_phi_info, ltpi, &loop_term_phis,
> > list_link) {
> > +      nir_phi_instr *new_phi = create_complex_unroll_phi(ns, ltpi-
> > >phi);
> > +
> > +      /* Update outermost_phi_def to point to the replacement phi
> > to be used
> > +       * when rewritting the loops existing phis.
> > +       */
> > +      ltpi->outermost_phi_def = &new_phi->dest.ssa;
> > +
> > +      /* Get the src for when exiting by the loop terminator */
> > +      struct hash_entry *final_term_hte =
> > +         _mesa_hash_table_search(final_term_phis, ltpi->phi);
> > +      if (final_term_hte) {
> > +         ltpi->innermost_phi_src_def = (nir_ssa_def *)
> > final_term_hte->data;
> > +      } else {
> > +         ltpi->innermost_phi_src_def = NULL;
> > +      }
> > +
> > +      /* Update phi to be used in next iteration */
> > +      ltpi->phi = new_phi;
> > +   }
> > +
> > +   /* Move everything after the terminator we don't have a trip
> > count for
> > +    * inside the if.
> > +    */
> > +   nir_cf_list loop_end;
> > +   nir_cf_extract(&loop_end, nir_after_cf_node(if_node),
> > +                  nir_after_block(nir_loop_last_block(loop)));
> > +   nir_if *if_stmt = nir_cf_node_as_if(if_node);
> > +   move_cf_list_into_if(&loop_end, if_node, last_blk,
> > +                        continue_from_then_branch);
> > +
> > +   /* Pluck out the loop body. Unlike the simple unroll pass there
> > are no
> > +    * breaks remaining in the loop so we do not have the concept
> > of a loop
> > +    * header and a loop body, instead we just extract everything.
> > +    */
> > +   nir_cf_list loop_body;
> > +   nir_cf_extract(&loop_body, nir_before_cf_node(&header_blk-
> > >cf_node),
> > +                  nir_after_block(nir_loop_last_block(loop)));
> > +
> > +   /* Temp list to store the cloned loop body as we unroll */
> > +   nir_cf_list unrolled_lp_body;
> > +
> > +   /* Set the cursor to before the loop */
> > +   b->cursor = nir_before_cf_node(&loop->cf_node);
> > +
> > +   nir_block *continue_from_blk = NULL;
> > +   for (unsigned i = 0; i < loop->info->trip_count; i++) {
> > +      /* Clone loop body */
> > +      nir_cf_list_clone(&unrolled_lp_body, &loop_body, loop-
> > >cf_node.parent,
> > +                        remap_table);
> > +
> > +      nir_cf_node *last_node =
> 
> This sort-of aliases "last_block".  I think it's probably
> "last_block" that needs a new name, but one of them does especially
> given that this is also a block.
>  
> > +         exec_node_data(nir_cf_node,
> > +                       
> > exec_list_get_tail(&unrolled_lp_body.list), node);
> > +      assert(last_node->type == nir_cf_node_block &&
> > +             exec_list_is_empty(&nir_cf_node_as_block(last_node)-
> > >instr_list));
> > +
> > +      /* Insert unrolled loop body */
> > +      nir_cf_reinsert(&unrolled_lp_body, b->cursor);
> > +
> > +      nir_cf_node *if_node = nir_cf_node_prev(last_node);
> 
> This aliases the if_node argument.  And, no, changing it to last_if
> doesn't work either because that would alias a different argument. :-
> )
>  
> > +      assert(if_node->type == nir_cf_node_if);
> > +      if_stmt = nir_cf_node_as_if(if_node);
> > +
> > +      nir_block *exit_from_blk;
> > +      if (continue_from_then_branch) {
> > +         continue_from_blk = nir_if_last_then_block(if_stmt);
> > +         exit_from_blk = nir_if_last_else_block(if_stmt);
> > +      } else {
> > +         exit_from_blk = nir_if_last_then_block(if_stmt);
> > +         continue_from_blk = nir_if_last_else_block(if_stmt);
> > +      }
> > +
> > +      b->cursor = nir_after_cf_node(if_node);
> > +      if (i < loop->info->trip_count - 1) {
> > +         list_for_each_entry(loop_term_phi_info, ltpi,
> > +                             &loop_term_phis, list_link) {
> > +            /* Insert phi created in previous iteration */
> > +            insert_phi_and_set_block_on_uses(b, ltpi->phi);
> > +            add_complex_unroll_phi_src(ltpi->phi_src_def, ltpi-
> > >phi,
> > +                                       remap_table,
> > exit_from_blk);
> > +
> > +            /* Create phi to be fixed up by next iteration */
> > +            nir_phi_instr *new_phi = create_complex_unroll_phi(ns,
> > ltpi->phi);
> > +            ltpi->phi = new_phi;
> > +         }
> > +      } else {
> > +         list_for_each_entry(loop_term_phi_info, ltpi,
> > +                             &loop_term_phis, list_link) {
> > +            /* Insert phi created in previous iteration */
> > +            insert_phi_and_set_block_on_uses(b, ltpi->phi);
> > +            add_complex_unroll_phi_src(ltpi->phi_src_def, ltpi-
> > >phi,
> > +                                       remap_table,
> > exit_from_blk);
> > +         }
> > +      }
> > +
> > +      /* Ready the remap table for the next iteration */
> > +      update_remap_table(loop, remap_table, phi_remap);
> > +
> > +      /* Set the cursor to the last if in the loop body we just
> > unrolled ready
> > +       * for the next iteration.
> > +       */
> > +      b->cursor = nir_after_block(continue_from_blk);
> > +   }
> > +
> > +
> > +   list_for_each_entry(loop_term_phi_info, ltpi, &loop_term_phis,
> > list_link) {
> > +      /* Now that the remap table is updated add the second src to
> > the
> > +       * innermost phis.
> > +       */
> > +      nir_ssa_def *phi_src = ltpi->innermost_phi_src_def ?
> > +         ltpi->innermost_phi_src_def : ltpi->phi_src_def;
> > +
> > +      assert(exec_list_length(&ltpi->phi->srcs) == 1);
> > +
> > +      add_complex_unroll_phi_src(phi_src, ltpi->phi, remap_table,
> > +                                 continue_from_blk);
> > +
> > +      /* Rewrite the uses of the old loop phis */
> > +      nir_foreach_use_safe(use_src, &ltpi->old_loop_phi->dest.ssa) 
> > {
> > +         nir_src new_src = nir_src_for_ssa(ltpi-
> > >outermost_phi_def);
> > +         nir_instr_rewrite_src(use_src->parent_instr, use_src,
> > new_src);
> > +      }
> > +
> > +      nir_foreach_if_use_safe(use_src, &ltpi->old_loop_phi-
> > >dest.ssa) {
> > +         nir_src new_src = nir_src_for_ssa(ltpi-
> > >outermost_phi_def);
> > +         nir_if_rewrite_condition(use_src->parent_if, new_src);
> > +      }
> > +   }
> > +
> > +   /* The loop has been unrolled so remove it. */
> > +   nir_cf_node_remove(&loop->cf_node);
> > +
> > +   /* Delete the original loop body */
> > +   nir_cf_delete(&loop_body);
> > +
> > +   ralloc_free(mem_ctx);
> > +}
> > +
> > +static bool
> > +process_loops(nir_cf_node *cf_node, nir_builder *b, bool
> > *innermost_loop)
> > +{
> > +   bool progress = false;
> > +   nir_loop *loop;
> > +
> > +   switch (cf_node->type) {
> > +   case nir_cf_node_block:
> > +      return progress;
> > +   case nir_cf_node_if: {
> > +      nir_if *if_stmt = nir_cf_node_as_if(cf_node);
> > +      foreach_list_typed_safe(nir_cf_node, nested_node, node,
> > &if_stmt->then_list)
> > +         progress |= process_loops(nested_node, b,
> > innermost_loop);
> > +      foreach_list_typed_safe(nir_cf_node, nested_node, node,
> > &if_stmt->else_list)
> > +         progress |= process_loops(nested_node, b,
> > innermost_loop);
> > +      return progress;
> > +   }
> > +   case nir_cf_node_loop: {
> > +      loop = nir_cf_node_as_loop(cf_node);
> > +      foreach_list_typed_safe(nir_cf_node, nested_node, node,
> > &loop->body)
> > +         progress |= process_loops(nested_node, b,
> > innermost_loop);
> > +      break;
> > +   }
> > +   default:
> > +      unreachable("unknown cf node type");
> > +   }
> > +
> > +   if (*innermost_loop) {
> > +      nir_function *fn = nir_cf_node_get_function(&loop->cf_node)-
> > >function;
> > +
> > +      /* Don't attempt to unroll outer loops or a second inner
> > loop in
> > +       * this pass wait until the next pass as we have altered the
> > cf.
> > +       */
> > +      *innermost_loop = false;
> > +
> > +      if (loop->info->limiting_terminator == NULL) {
> > +         return progress;
> > +      }
> > +
> > +      if (is_simple_loop(fn->shader, loop->info)) {
> > +         simple_unroll(loop, b);
> > +         progress = true;
> > +      } else {
> > +         /* Attempt to unroll loops with two terminators. */
> > +         if (is_complex_loop(fn->shader, loop->info)) {
> > +            bool first_terminator = true;
> > +            list_for_each_entry(nir_loop_terminator, terminator,
> > +                                &loop->info->loop_terminator_list,
> > +                                loop_terminator_link) {
> > +
> > +               nir_cf_node *if_node = &terminator->nif->cf_node;
> 
> Let's just store the nir_if here so that we don't have to convert it
> back later.  Also, caling it term_if would make things a bit easier
> to read.
>  
> > +
> > +               if (if_node == &loop->info->limiting_terminator-
> > >nif->cf_node) {
> 
> Why not just "if terminator == loop->info->limiting_terminator"?

Will change.

>  
> > +                  first_terminator = false;
> 
> Ok... This is very confusing and I think there's a bug.  You have
> this varibale here called "first_terminator" which is a 1-bit loop
> counter for this loop.  If the first terminator in the list is the
> limiting terminator, we set first_terminator to false and continue to
> the next terminator.  We then pass "!first_terminator" into
> complex_unroll as "limiting_term_second" and, if I'm doing my boolean
> logic correctly, it means the opposite of what the variable name
> implies.

Yeah I think your right. I think to begin with I might have had
something like:

  if (first_terminator &&
      if_node == &loop->info->limiting_terminator->nif->cf_node)

But thats a bug too and one that causes piglit problems with the tests
I wrote.

> 
> If this is passing piglit then maybe we don't need
> limiting_term_second at all?  Either that or we need more tests. :-)

We do need it. I think what is happening is when dealing with the
second_terminator we just end up with some extra if staments so it
probably doesn't matter code for execution correctness just extra code
paths that will never get used.

I'll try come up with something a little less confusing and more
correct.


>  
> > +                  continue;
> > +               }
> > +
> > +               /* If the first terminator has a trip count of zero
> > just do a
> > +                * simple unroll as the second terminator can never
> > be reached.
> > +                */
> > +               if (loop->info->trip_count == 0 &&
> > first_terminator) {
> > +                  simple_unroll(loop, b);
> > +                  progress = true;
> > +                  break;
> > +               }
> > +
> > +               nir_if *if_stmt = nir_cf_node_as_if(if_node);
> > +
> > +               /* Determine which if-statement branch, if any,
> > ends with a
> > +                * break. Note that since predicted_num_loop_jumps
> > == 1, it is
> > +                * impossible for both branches to end with a
> > break.
> > +                */
> > +               nir_block *last_then =
> > nir_if_last_then_block(if_stmt);
> > +               if (ends_in_break(last_then)) {
> > +                  complex_unroll(fn, loop, b, if_node, last_then,
> > false,
> > +                                 !first_terminator);
> > +
> > +                  progress = true;
> > +                  break;
> > +               } else {
> > +                  nir_block *last_else =
> > nir_if_last_else_block(if_stmt);
> 
> This would be more clear if you moved the cast up and made this an
> else if
>  
> > +                  if (ends_in_break(last_else)) {
> > +                     complex_unroll(fn, loop, b, if_node,
> > last_else, true,
> > +                                    !first_terminator);
> > +
> > +                     progress = true;
> > +                     break;
> > +                  }
> > +               }
> > +            }
> > +         }
> > +      }
> > +   }
> > +
> > +   return progress;
> > +}
> > +
> > +static bool
> > +nir_opt_loop_unroll_impl(nir_function_impl *impl,
> > +                         nir_variable_mode indirect_mask)
> > +{
> > +   bool progress = false;
> > +   nir_metadata_require(impl, nir_metadata_loop_analysis,
> > indirect_mask);
> > +   nir_metadata_require(impl, nir_metadata_block_index);
> > +
> > +   nir_builder b;
> > +   nir_builder_init(&b, impl);
> > +
> > +   foreach_list_typed_safe(nir_cf_node, node, node, &impl->body) {
> > +      bool innermost_loop = true;
> > +      progress |= process_loops(node, &b, &innermost_loop);
> > +   }
> > +
> > +   return progress;
> > +}
> > +
> > +bool
> > +nir_opt_loop_unroll(nir_shader *shader, nir_variable_mode
> > indirect_mask)
> > +{
> > +   bool progress = false;
> > +
> > +   nir_foreach_function(function, shader) {
> > +      if (function->impl) {
> > +         progress |= nir_opt_loop_unroll_impl(function->impl,
> > indirect_mask);
> > +      }
> > +   }
> > +   return false;
> > +}
> > --
> > 2.7.4
> > 
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > 
> 
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


More information about the mesa-dev mailing list