<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">Wow!  This is way better than the last time I read through it.  Good work!<br><br></div><div class="gmail_quote">Overall, I'm much happier with the code now.  The structure is better, some of the crazy phi logic is gone, and clone_cf_list is helping a lot.  That said... I still have a pile of comments.  Most of them are cosmetic, one is a bug, and one is a suggestion for how to make things simpler.  I think fixing some of the cosmetic stuff, especially in unroll_complex, will help with readability a lot.<br><br></div><div class="gmail_quote">The suggestion, which I'd like to highlight here, was to take advantage of nir_repair_ssa and see if we can git rid of a bunch of the phi node pain from unroll_complex.  Most of the pain in that part of the pass appears to be in trying to deal with the phi nodes at the end of the pass and put everything back where it belongs.  The repair_ssa pass (which I think I wrote after you started on this project) is a tiny pass that takes a shader that is in "mostly" ssa form and makes it into "proper" NIR ssa.  The only requirement on the shader is tha the sources of the phi nodes point to the correct SSA values.  Dominance doesn't have to hold and you don't have to have the whole ladder of phi nodes so long as the data flow is correct.  I *think* based on my sketchy reading of things, that we should be able to just apply the final remap table to each of the phis after the loop and then just run repair_ssa on the shader once we're done unrolling.<br></div><div class="gmail_quote"><br>On Mon, Dec 5, 2016 at 5:12 PM, Timothy Arceri <span dir="ltr"><<a href="mailto:timothy.arceri@collabora.com" target="_blank">timothy.arceri@collabora.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">V2:<br>
- tidy ups suggested by Connor.<br>
- tidy up cloning logic and handle copy propagation<br>
 based of suggestion by Connor.<br>
- use nir_ssa_def_rewrite_uses to fix up lcssa phis<br>
  suggested by Connor.<br>
- add support for complex loop unrolling (two terminators)<br>
- handle case were the ssa defs use outside the loop is already a phi<br>
- support unrolling loops with multiple terminators when trip count<br>
  is know for each terminator<br>
<br>
V3:<br>
- set correct num_components when creating phi in complex unroll<br>
- rewrite update remap table based on Jasons suggestions.<br>
- remove unrequired extract_loop_body() helper as suggested by Jason.<br>
- simplify the lcssa phi fix up code for simple loops as per Jasons suggestions.<br>
- use mem context to keep track of hash table memory as suggested by Jason.<br>
- move is_{complex,simple}_loop helpers to the unroll code<br>
- require nir_metadata_block_index<br>
- partially rewrote complex unroll to be simpler and easier to follow.<br>
<br>
V4:<br>
- use rzalloc() when creating nir_phi_src but not setting pred right away<br>
 fixes regression cause by ralloc() no longer zeroing memory.<br>
---<br>
 src/compiler/Makefile.sources          |   1 +<br>
 src/compiler/nir/nir.h                 |   2 +<br>
 src/compiler/nir/nir_opt_loop_<wbr>unroll.c | 729 ++++++++++++++++++++++++++++++<wbr>+++<br>
 3 files changed, 732 insertions(+)<br>
 create mode 100644 src/compiler/nir/nir_opt_loop_<wbr>unroll.c<br>
<br>
diff --git a/src/compiler/Makefile.<wbr>sources b/src/compiler/Makefile.<wbr>sources<br>
index d3e158a..799fb38 100644<br>
--- a/src/compiler/Makefile.<wbr>sources<br>
+++ b/src/compiler/Makefile.<wbr>sources<br>
@@ -238,6 +238,7 @@ NIR_FILES = \<br>
        nir/nir_opt_dead_cf.c \<br>
        nir/nir_opt_gcm.c \<br>
        nir/nir_opt_global_to_local.c \<br>
+       nir/nir_opt_loop_unroll.c \<br>
        nir/nir_opt_peephole_select.c \<br>
        nir/nir_opt_remove_phis.c \<br>
        nir/nir_opt_undef.c \<br>
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h<br>
index d948e97..b8813e4 100644<br>
--- a/src/compiler/nir/nir.h<br>
+++ b/src/compiler/nir/nir.h<br>
@@ -2573,6 +2573,8 @@ bool nir_opt_dead_cf(nir_shader *shader);<br>
<br>
 bool nir_opt_gcm(nir_shader *shader, bool value_number);<br>
<br>
+bool nir_opt_loop_unroll(nir_shader *shader, nir_variable_mode indirect_mask);<br>
+<br>
 bool nir_opt_peephole_select(nir_<wbr>shader *shader, unsigned limit);<br>
<br>
 bool nir_opt_remove_phis(nir_shader *shader);<br>
diff --git a/src/compiler/nir/nir_opt_<wbr>loop_unroll.c b/src/compiler/nir/nir_opt_<wbr>loop_unroll.c<br>
new file mode 100644<br>
index 0000000..8715757<br>
--- /dev/null<br>
+++ b/src/compiler/nir/nir_opt_<wbr>loop_unroll.c<br>
@@ -0,0 +1,729 @@<br>
+/*<br>
+ * Copyright © 2016 Intel Corporation<br>
+ *<br>
+ * Permission is hereby granted, free of charge, to any person obtaining a<br>
+ * copy of this software and associated documentation files (the "Software"),<br>
+ * to deal in the Software without restriction, including without limitation<br>
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,<br>
+ * and/or sell copies of the Software, and to permit persons to whom the<br>
+ * Software is furnished to do so, subject to the following conditions:<br>
+ *<br>
+ * The above copyright notice and this permission notice (including the next<br>
+ * paragraph) shall be included in all copies or substantial portions of the<br>
+ * Software.<br>
+ *<br>
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR<br>
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,<br>
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL<br>
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER<br>
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING<br>
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER<br>
+ * DEALINGS IN THE SOFTWARE.<br>
+ */<br>
+<br>
+#include "nir.h"<br>
+#include "nir_builder.h"<br>
+#include "nir_control_flow.h"<br>
+<br>
+static bool<br>
+is_loop_small_enough_to_<wbr>unroll(nir_shader *shader, nir_loop_info *li)<br>
+{<br>
+   unsigned max_iter = shader->options->max_unroll_<wbr>iterations;<br>
+<br>
+   if (li->trip_count > max_iter)<br>
+      return false;<br>
+<br>
+   if (li->force_unroll)<br>
+      return true;<br>
+<br>
+   bool loop_not_too_large =<br>
+      li->num_instructions * li->trip_count <= max_iter * 25;<br>
+<br>
+   return loop_not_too_large;<br>
+}<br>
+<br>
+static bool<br>
+is_complex_loop(nir_shader *shader, nir_loop_info *li)<br></blockquote><div><br></div><div>is_complex_loop is kind-of a lie here since it also decides whether or not to unroll...<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+{<br>
+   unsigned num_lt = list_length(&li->loop_<wbr>terminator_list);<br>
+   return is_loop_small_enough_to_<wbr>unroll(shader, li) && num_lt == 2;<br>
+}<br>
+<br>
+static bool<br>
+is_simple_loop(nir_shader *shader, nir_loop_info *li)<br></blockquote><div><br></div><div>Same here.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+{<br>
+   return li->is_trip_count_known &&<br>
+          is_loop_small_enough_to_<wbr>unroll(shader, li);<br>
+}<br>
+<br>
+static void<br>
+move_cf_list_into_if(nir_cf_<wbr>list *lst, nir_cf_node *if_node,<br>
+                     nir_block *last_blk, bool continue_from_then_branch)<br>
+{<br>
+   nir_if *if_stmt = nir_cf_node_as_if(if_node);<br>
+   if (continue_from_then_branch) {<br>
+      /* Move the rest of the loop inside the then */<br>
+      nir_cf_reinsert(lst, nir_after_block(nir_if_last_<wbr>then_block(if_stmt))); <br></blockquote><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+   } else {<br>
+      /* Move the rest of the loop inside the else */<br>
+      nir_cf_reinsert(lst, nir_after_block(nir_if_last_<wbr>else_block(if_stmt)));<br>
+   }<br>
+<br>
+   /* Remove the break */<br>
+   nir_instr_remove(nir_block_<wbr>last_instr(last_blk));<br>
+}<br>
+<br>
+static void<br>
+create_remap_tables(void *mem_ctx, nir_loop *loop, nir_block *loop_header_blk,<br>
+                    struct hash_table **remap_table,<br>
+                    struct hash_table **phi_remap)<br>
+{<br>
+   *remap_table = _mesa_hash_table_create(mem_<wbr>ctx, _mesa_hash_pointer,<br>
+                                          _mesa_key_pointer_equal);<br>
+   *phi_remap = _mesa_hash_table_create(mem_<wbr>ctx, _mesa_hash_pointer,<br>
+                                        _mesa_key_pointer_equal);<br>
+<br>
+   /* Build table to be used for remapping as we unroll. */<br>
+   nir_foreach_instr(instr, loop_header_blk) {<br>
+      if (instr->type != nir_instr_type_phi)<br>
+         break;<br>
+<br>
+      nir_phi_instr *phi = nir_instr_as_phi(instr);<br>
+      nir_foreach_phi_src(src, phi) {<br>
+         /* Add src to remap table if it's pred is from before the loop, these<br>
+          * srcs will be used in the first iteration of cloning the loop.<br>
+          */<br>
+         if (src->pred->index < phi->instr.block->index) {<br>
+            _mesa_hash_table_insert(*<wbr>remap_table, &phi->dest.ssa,<br>
+                                    src->src.ssa);<br>
+         }<br>
+      }<br>
+   }<br>
+}<br>
+<br>
+static void<br>
+update_remap_table(nir_loop *loop, struct hash_table *remap_table,<br>
+                   struct hash_table *phi_remap)<br>
+{<br>
+   nir_block *header_block = nir_loop_first_block(loop);<br>
+   nir_foreach_instr(instr, header_block) {<br>
+<br></blockquote><div><br></div><div>Unneeded newline<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+      if (instr->type != nir_instr_type_phi)<br>
+         break;<br>
+<br>
+      nir_phi_instr *phi = nir_instr_as_phi(instr);<br>
+      nir_foreach_phi_src(src, phi) {<br>
+         if (src->pred->index < phi->instr.block->index)<br>
+            continue;<br>
+<br>
+         struct hash_entry *remap_he =<br>
+            _mesa_hash_table_search(remap_<wbr>table, src->src.ssa);<br>
+         if (remap_he) {<br>
+            _mesa_hash_table_insert(phi_<wbr>remap, &phi->dest.ssa, remap_he->data);<br>
+         } else {<br>
+            _mesa_hash_table_insert(phi_<wbr>remap, &phi->dest.ssa, src->src.ssa);</blockquote><div><br></div><div>I wonder if this case is ever anything other than a no-op...<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+         }<br>
+      }<br>
+   }<br>
+<br>
+   struct hash_entry *phi_entry;<br>
+   hash_table_foreach(phi_remap, phi_entry)<br>
+      _mesa_hash_table_insert(remap_<wbr>table, phi_entry->key, phi_entry->data);<br>
+}<br>
+<br>
+static void<br>
+insert_phi_and_set_block_on_<wbr>uses(nir_builder *b, nir_phi_instr *phi_instr)<br>
+{<br>
+   nir_instr_insert(b->cursor, &phi_instr->instr);<br>
+<br>
+   /* Now that we have inserted the phi fix up the block for its uses. */<br>
+   nir_foreach_use_safe(use_src, &phi_instr->dest.ssa) {<br>
+      nir_phi_instr *use_phi = nir_instr_as_phi(use_src-><wbr>parent_instr);<br>
+<br>
+      foreach_list_typed(nir_phi_<wbr>src, src, node, &use_phi->srcs) {<br></blockquote><div><br></div><div>We have a nir_foreach_phi_src helper you could use here.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+         if (!src->pred)<br>
+            src->pred = phi_instr->dest.ssa.parent_<wbr>instr->block;<br>
+      }<br>
+   }<br>
+}<br>
+<br>
+static nir_phi_instr *<br>
+create_complex_unroll_phi(<wbr>nir_shader *ns, nir_phi_instr *prev_phi_instr)<br>
+{<br>
+   nir_phi_instr *new_phi = nir_phi_instr_create(ns);<br>
+   nir_ssa_dest_init(&new_phi-><wbr>instr, &new_phi->dest,<br>
+                     prev_phi_instr->dest.ssa.num_<wbr>components,<br>
+                     prev_phi_instr->dest.ssa.bit_<wbr>size, NULL);<br>
+<br>
+   /* Add the new phi as a src to the phi from the previous iteration */<br>
+   nir_phi_src *new_src = rzalloc(prev_phi_instr, nir_phi_src);<br>
+   new_src->src = nir_src_for_ssa(&new_phi-><wbr>dest.ssa);<br>
+   new_src->src.parent_instr = &prev_phi_instr->instr;<br>
+   exec_list_push_tail(&prev_phi_<wbr>instr->srcs, &new_src->node);<br>
+   list_addtail(&new_src->src.<wbr>use_link, &new_src->src.ssa->uses);<br>
+<br>
+   return new_phi;<br>
+}<br>
+<br>
+static void<br>
+add_complex_unroll_phi_src(<wbr>nir_ssa_def *phi_src, nir_phi_instr *phi_instr,<br>
+                           struct hash_table *remap_table, nir_block *blk)<br>
+{<br>
+   struct hash_entry *hte =<br>
+      _mesa_hash_table_search(remap_<wbr>table, phi_src);<br>
+<br>
+   nir_phi_src *new_src = ralloc(phi_instr, nir_phi_src);<br>
+   nir_ssa_def *ssa_def = hte ? (nir_ssa_def *) hte->data : phi_src;<br>
+   new_src->pred = blk;<br>
+   new_src->src = nir_src_for_ssa(ssa_def);<br>
+   new_src->src.parent_instr = &phi_instr->instr;<br>
+   list_addtail(&new_src->src.<wbr>use_link, &new_src->src.ssa->uses);<br>
+<br>
+   exec_list_push_tail(&phi_<wbr>instr->srcs, &new_src->node);<br>
+}<br>
+<br>
+static void<br>
+simple_loop_fix_lcssa_phis(<wbr>nir_cf_node *loop, struct hash_table *remap_table)<br>
+{<br>
+   nir_block *prev_block = nir_cf_node_as_block(nir_cf_<wbr>node_prev(loop));<br>
+   nir_cf_node *cf_node = nir_cf_node_next(loop);<br>
+   assert(cf_node->type == nir_cf_node_block);<br>
+<br>
+   nir_block *block = nir_cf_node_as_block(cf_node);<br>
+   nir_foreach_instr_safe(instr, block) {<br>
+      if (instr->type != nir_instr_type_phi) {<br>
+         /* There should be no more phis */<br>
+         break;<br>
+      }<br></blockquote><div><br></div><div>I don't think you really need the comment here.  We do enough "if (!is_phi) break" that readers should be used to it.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+<br>
+      /* Simple loops should only ever have a single exit point so we can<br>
+       * simply rewrite the phis uses to whatever is in the remap table or<br>
+       * the src before the loop (if it had 0 iterations) and be done.<br>
+       */<br>
+      nir_phi_instr *phi = nir_instr_as_phi(instr);<br></blockquote><div><br></div><div>Should we assert phi->is_lcssa here?  Or maybe list_is_singular(phi->srcs)?<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+      nir_foreach_phi_src(src, phi) {<br>
+         /* Update predecessor */<br>
+         src->pred = prev_block;<br>
+<br>
+         struct hash_entry *hte =<br>
+            _mesa_hash_table_search(remap_<wbr>table, src->src.ssa);<br>
+         if (hte) {<br>
+            nir_ssa_def_rewrite_uses(&phi-<wbr>>dest.ssa,<br>
+                                     nir_src_for_ssa((nir_ssa_def *) hte->data));<br>
+         } else {<br>
+            nir_ssa_def_rewrite_uses(&phi-<wbr>>dest.ssa, src->src);<br>
+         }<br>
+      }<br>
+      assert(list_empty(&phi->dest.<wbr>ssa.uses));<br>
+   }<br>
+}<br>
+<br>
+static bool<br>
+ends_in_break(nir_block *block)<br>
+{<br>
+   if (exec_list_is_empty(&block-><wbr>instr_list))<br>
+      return false;<br>
+<br>
+   nir_instr *instr = nir_block_last_instr(block);<br>
+   return instr->type == nir_instr_type_jump &&<br>
+      nir_instr_as_jump(instr)->type == nir_jump_break;<br>
+}<br></blockquote><div><br></div><div>It would make sense to move this below simple_unroll since it's not used for simple loops and it's siting between simple_unroll and simple_loop_fix_lcssa_phis<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+<br>
+/**<br>
+ * Unroll a loop which does not contain any jumps.  For example, if the input<br>
+ * is:<br>
+ *<br>
+ *     (loop (...) ...instrs...)<br>
+ *<br>
+ * And the iteration count is 3, the output will be:<br>
+ *<br>
+ *     ...instrs... ...instrs... ...instrs...<br>
+ */<br>
+static void<br>
+simple_unroll(nir_loop *loop, nir_builder *b)<br>
+{<br>
+   void *mem_ctx = ralloc_context(NULL);<br>
+<br>
+   nir_convert_loop_to_lcssa(<wbr>loop);<br>
+<br>
+   /* Get the loop header this contains a bunch of phis and the loops<br>
+    * conditional.<br>
+    */<br>
+   nir_block *header_blk = nir_loop_first_block(loop);<br>
+<br>
+   struct hash_table *remap_table;<br>
+   struct hash_table *phi_remap;<br>
+   create_remap_tables(mem_ctx, loop, header_blk, &remap_table, &phi_remap);<br>
+<br>
+   /* Skip over loop terminator and get the loop body. */<br>
+   nir_cf_node *if_node = &loop->info->limiting_<wbr>terminator->nif->cf_node;<br>
+   list_for_each_entry(nir_loop_<wbr>terminator, terminator,<br>
+                       &loop->info->loop_terminator_<wbr>list, loop_terminator_link) {<br>
+       nir_cf_node *loop_node = &terminator->nif->cf_node;<br></blockquote><div><br></div><div>loop_node is an odd name here.  It makes me think it's &loop->cf_node.  How about term_node?  Or, for that matter, it looks like what you realyly want to compare below is the if's and not the cf_nodes' so maybe term_if?<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+<br>
+      /* Remove all but the limiting terminator as we know the other exit<br>
+       * conditions can never be met.<br>
+       */<br>
+      if (loop_node != &loop->info->limiting_<wbr>terminator->nif->cf_node) {<br>
+         nir_cf_node_remove(loop_node);<br>
+      }<br>
+   }<br>
+<br></blockquote><div><br></div><div>It might be worth putting a comment in here about how the analysis pass bails on anything other than simple "if (foo) break" conditions.  If it's pratical to do so, I'd kind-of like to see an assert.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+   /* Pluck out the loop header */<br>
+   nir_cf_list lp_header;<br>
+   nir_cf_extract(&lp_header, nir_before_block(header_blk),<br>
+                  nir_before_cf_node(if_node));<br>
+<br>
+   /* Pluck out the loop body */<br>
+   nir_cf_list loop_body;<br>
+   nir_cf_extract(&loop_body, nir_after_cf_node(if_node),<br>
+                  nir_after_block(nir_loop_last_<wbr>block(loop)));<br>
+<br>
+   /* Clone the loop header */<br>
+   nir_cf_list cloned_header;<br>
+   nir_cf_list_clone(&cloned_<wbr>header, &lp_header, loop->cf_node.parent,<br>
+                     remap_table);<br>
+<br>
+   /* Insert cloned loop header before the loop */<br>
+   b->cursor = nir_before_cf_node(&loop->cf_<wbr>node);<br>
+   nir_cf_reinsert(&cloned_<wbr>header, b->cursor);<br>
+<br>
+   /* Temp list to store the cloned loop body as we unroll */<br>
+   nir_cf_list unrolled_lp_body;<br>
+<br>
+   /* Clone loop header and append to the loop body */<br>
+   for (unsigned i = 0; i < loop->info->trip_count; i++) {<br>
+      /* Clone loop body */<br>
+      nir_cf_list_clone(&unrolled_<wbr>lp_body, &loop_body, loop->cf_node.parent,<br>
+                        remap_table);<br>
+<br>
+      update_remap_table(loop, remap_table, phi_remap);<br>
+<br>
+      /* Insert unrolled loop body before the loop */<br>
+      b->cursor = nir_before_cf_node(&loop->cf_<wbr>node);<br>
+      nir_cf_reinsert(&unrolled_lp_<wbr>body, b->cursor);<br>
+<br>
+      /* Clone loop header */<br>
+      nir_cf_list_clone(&cloned_<wbr>header, &lp_header, loop->cf_node.parent,<br>
+                        remap_table);<br>
+<br>
+      /* Insert loop header after loop body */<br>
+      b->cursor = nir_before_cf_node(&loop->cf_<wbr>node);<br>
+      nir_cf_reinsert(&cloned_<wbr>header, b->cursor);<br>
+   }<br>
+<br>
+   /* The loop has been unrolled so remove it. */<br>
+   simple_loop_fix_lcssa_phis(&<wbr>loop->cf_node, remap_table);<br>
+<br>
+   /* Remove the loop */<br>
+   nir_cf_node_remove(&loop->cf_<wbr>node);<br>
+<br>
+   /* Delete the original loop body & header */<br>
+   nir_cf_delete(&lp_header);<br>
+   nir_cf_delete(&loop_body);<br>
+<br>
+   ralloc_free(mem_ctx);<br>
+}<br>
+<br>
+typedef struct {<br>
+   /* This field is updated at each interation as we unroll. */<br>
+   nir_phi_instr *phi;<br>
+<br>
+   /* This is the phi src of the loop terminator with an unknown trip count */<br>
+   nir_ssa_def *phi_src_def;<br>
+<br>
+   /* These fields are used to fix things up after we have unrolled. */<br>
+   nir_phi_instr *old_loop_phi;<br>
+   nir_ssa_def *outermost_phi_def;<br>
+   nir_ssa_def *innermost_phi_src_def;<br>
+<br>
+   struct list_head list_link;<br>
+} loop_term_phi_info;<br></blockquote><div><br></div><div>You seem to be going through a lot of pain to keep track of those pesky phi nodes after the loop.  I think this may be a case where we just want to let them be what they are and then run nir_repair_ssa after loop unrolling.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+<br>
+/**<br>
+ * Unroll a loop with two exists when the trip count of one of the exits is<br>
+ * unknown.  If continue_from_then_branch is true, the loop is repeated only<br>
+ * when the "then" branch of the if is taken; otherwise it is repeated only<br>
+ * when the "else" branch of the if is taken.<br>
+ *<br>
+ * For example, if the input is:<br>
+ *<br>
+ *     (loop (...)<br>
+ *      ...body...<br>
+ *      (if (cond)<br>
+ *          (...then_instrs...)<br>
+ *        (...else_instrs...)))<br>
+ *<br>
+ * And the iteration count is 3, and \c continue_from_then_branch is true,<br>
+ * then the output will be:<br>
+ *<br>
+ *     ...body...<br>
+ *     (if (cond)<br>
+ *         (...then_instrs...<br>
+ *          ...body...<br>
+ *          (if (cond)<br>
+ *              (...then_instrs...<br>
+ *               ...body...<br>
+ *               (if (cond)<br>
+ *                   (...then_instrs...)<br>
+ *                 (...else_instrs...)))<br>
+ *            (...else_instrs...)))<br>
+ *       (...else_instrs))<br></blockquote><div><br></div><div>Mind writing this in nir_print style?<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+ */<br>
+static void<br>
+complex_unroll(nir_function *fn, nir_loop *loop, nir_builder *b,<br>
+               nir_cf_node *if_node, nir_block *last_blk,<br></blockquote><div><br></div><div>Please replace if_node here with a nir_if that's called unknown_if or unlimit_if or something like that.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+               bool continue_from_then_branch, bool limiting_term_second)<br>
+{<br>
+   void *mem_ctx = ralloc_context(NULL);<br>
+<br>
+   nir_convert_loop_to_lcssa(<wbr>loop);<br>
+<br>
+   nir_shader *ns = fn->shader;<br>
+   nir_cf_node *limiting_trm = &loop->info->limiting_<wbr>terminator->nif->cf_node;<br></blockquote><div><br></div><div>And make this a nir_if *limit_if<br><br></div><div>That way you can keep track of what is what as you read the code.  This is code hard enough to dig through without having to chase variable names around.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+   nir_block *header_blk = nir_loop_first_block(loop);<br>
+<br>
+   struct hash_table *remap_table;<br>
+   struct hash_table *phi_remap;<br>
+   create_remap_tables(mem_ctx, loop, header_blk, &remap_table, &phi_remap);<br>
+<br>
+   struct hash_table *final_term_phis =<br>
+      _mesa_hash_table_create(mem_<wbr>ctx, _mesa_hash_pointer,<br>
+                              _mesa_key_pointer_equal);<br>
+<br>
+   struct list_head loop_term_phis;<br>
+   list_inithead(&loop_term_phis)<wbr>;<br>
+<br>
+   /* Build a list of phis */<br>
+   nir_if *loop_term_if = loop->info->limiting_<wbr>terminator->nif;<br>
+   nir_cf_node *after_loop = nir_cf_node_next(&loop->cf_<wbr>node);<br>
+   nir_foreach_instr(instr, nir_cf_node_as_block(after_<wbr>loop)) {<br>
+      if (instr->type != nir_instr_type_phi)<br>
+         break;<br>
+<br>
+      nir_phi_instr *phi = nir_instr_as_phi(instr);<br>
+      nir_foreach_phi_src(src, phi) {<br>
+         nir_block *then_blk = nir_if_last_then_block(loop_<wbr>term_if);<br>
+         nir_block *else_blk = nir_if_last_else_block(loop_<wbr>term_if);<br>
+<br>
+         if (src->pred == then_blk || src->pred == else_blk) {<br>
+            _mesa_hash_table_insert(final_<wbr>term_phis, phi, src->src.ssa);<br>
+         } else {<br>
+            /* Add to list of phis with an unknown iteration count */<br>
+            loop_term_phi_info *ltpi = ralloc(mem_ctx, loop_term_phi_info);<br>
+            ltpi->phi = phi;<br>
+            ltpi->old_loop_phi = phi;<br>
+            ltpi->phi_src_def = src->src.ssa;<br>
+            list_addtail(&ltpi->list_link, &loop_term_phis);<br>
+         }<br>
+      }<br>
+   }<br>
+<br>
+   if (limiting_term_second) {<br>
+      /* We need some special handling when its the second terminator causing<br>
+       * us to exit the loop for example:<br>
+       *<br>
+       *   for (int i = 0; i < uniform_lp_count; i++) {<br>
+       *      colour = vec4(0.0, 1.0, 0.0, 1.0);<br>
+       *<br>
+       *      if (i == 1)<br></blockquote><div><br></div><div>Missing a "{" to match the "}" below<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+       *         break;<br>
+       *      }<br>
+       *      ... any further code is unreachable after i == 1 ...<br>
+       *   }<br>
+       *<br>
+       * Bump the trip count by one so we actually clone something. Also<br>
+       * extract everything after the limiting terminator and insert it into<br>
+       * the branch we will continue from.<br>
+       */<br>
+      loop->info->trip_count++;<br></blockquote><div><br></div><div>I get the rest of this but not why we're bumping the trip count.  I mean, the way that complex unrolling works, bumping the trip count doesn't hurt but I don't see what it's gaining us either.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+<br>
+      nir_cf_list after_lt;<br>
+      nir_cf_extract(&after_lt, nir_after_cf_node(limiting_<wbr>trm),<br>
+                     nir_after_block(nir_loop_last_<wbr>block(loop)));<br></blockquote><div><br></div><div>Are we guaranteed that the ifs aren't nested?<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+<br>
+      nir_if *if_stmt = loop->info->limiting_<wbr>terminator->nif;<br></blockquote><div><br></div><div>No.  We now have if_stmt and if_node and they point to two completely different ifs.  Let's name things better. :-)<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+      nir_block *last_then = nir_if_last_then_block(if_<wbr>stmt);<br>
+      if (ends_in_break(last_then)) {<br>
+         move_cf_list_into_if(&after_<wbr>lt, limiting_trm, last_then, false);<br>
+      } else {<br>
+         nir_block *last_else = nir_if_last_else_block(if_<wbr>stmt);<br>
+         if (ends_in_break(last_else)) {<br>
+            move_cf_list_into_if(&after_<wbr>lt, limiting_trm, last_else, true);<br>
+         }<br></blockquote><div><br></div><div>And if neither of them contains the break we throw it away???  I think something needs to be asserted here.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+      }<br>
+   } else {<br>
+      /* Remove the limiting terminator.  Loop analysis will only find a<br>
+       * terminator for trival if statments (then only contains break, else<br>
+       * is empty) so its safe to remove the whole thing.<br>
+       */<br>
+      nir_cf_node_remove(limiting_<wbr>trm);<br>
+   }<br>
+<br>
+   list_for_each_entry(loop_term_<wbr>phi_info, ltpi, &loop_term_phis, list_link) {<br>
+      nir_phi_instr *new_phi = create_complex_unroll_phi(ns, ltpi->phi);<br>
+<br>
+      /* Update outermost_phi_def to point to the replacement phi to be used<br>
+       * when rewritting the loops existing phis.<br>
+       */<br>
+      ltpi->outermost_phi_def = &new_phi->dest.ssa;<br>
+<br>
+      /* Get the src for when exiting by the loop terminator */<br>
+      struct hash_entry *final_term_hte =<br>
+         _mesa_hash_table_search(final_<wbr>term_phis, ltpi->phi);<br>
+      if (final_term_hte) {<br>
+         ltpi->innermost_phi_src_def = (nir_ssa_def *) final_term_hte->data;<br>
+      } else {<br>
+         ltpi->innermost_phi_src_def = NULL;<br>
+      }<br>
+<br>
+      /* Update phi to be used in next iteration */<br>
+      ltpi->phi = new_phi;<br>
+   }<br>
+<br>
+   /* Move everything after the terminator we don't have a trip count for<br>
+    * inside the if.<br>
+    */<br>
+   nir_cf_list loop_end;<br>
+   nir_cf_extract(&loop_end, nir_after_cf_node(if_node),<br>
+                  nir_after_block(nir_loop_last_<wbr>block(loop)));<br>
+   nir_if *if_stmt = nir_cf_node_as_if(if_node);<br>
+   move_cf_list_into_if(&loop_<wbr>end, if_node, last_blk,<br>
+                        continue_from_then_branch);<br>
+<br>
+   /* Pluck out the loop body. Unlike the simple unroll pass there are no<br>
+    * breaks remaining in the loop so we do not have the concept of a loop<br>
+    * header and a loop body, instead we just extract everything.<br>
+    */<br>
+   nir_cf_list loop_body;<br>
+   nir_cf_extract(&loop_body, nir_before_cf_node(&header_<wbr>blk->cf_node),<br>
+                  nir_after_block(nir_loop_last_<wbr>block(loop)));<br>
+<br>
+   /* Temp list to store the cloned loop body as we unroll */<br>
+   nir_cf_list unrolled_lp_body;<br>
+<br>
+   /* Set the cursor to before the loop */<br>
+   b->cursor = nir_before_cf_node(&loop->cf_<wbr>node);<br>
+<br>
+   nir_block *continue_from_blk = NULL;<br>
+   for (unsigned i = 0; i < loop->info->trip_count; i++) {<br>
+      /* Clone loop body */<br>
+      nir_cf_list_clone(&unrolled_<wbr>lp_body, &loop_body, loop->cf_node.parent,<br>
+                        remap_table);<br>
+<br>
+      nir_cf_node *last_node =<br></blockquote><div><br></div><div>This sort-of aliases "last_block".  I think it's probably "last_block" that needs a new name, but one of them does especially given that this is also a block.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+         exec_node_data(nir_cf_node,<br>
+                        exec_list_get_tail(&unrolled_<wbr>lp_body.list), node);<br>
+      assert(last_node->type == nir_cf_node_block &&<br>
+             exec_list_is_empty(&nir_cf_<wbr>node_as_block(last_node)-><wbr>instr_list));<br>
+<br>
+      /* Insert unrolled loop body */<br>
+      nir_cf_reinsert(&unrolled_lp_<wbr>body, b->cursor);<br>
+<br>
+      nir_cf_node *if_node = nir_cf_node_prev(last_node);<br></blockquote><div><br></div><div>This aliases the if_node argument.  And, no, changing it to last_if doesn't work either because that would alias a different argument. :-)<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+      assert(if_node->type == nir_cf_node_if);<br>
+      if_stmt = nir_cf_node_as_if(if_node);<br>
+<br>
+      nir_block *exit_from_blk;<br>
+      if (continue_from_then_branch) {<br>
+         continue_from_blk = nir_if_last_then_block(if_<wbr>stmt);<br>
+         exit_from_blk = nir_if_last_else_block(if_<wbr>stmt);<br>
+      } else {<br>
+         exit_from_blk = nir_if_last_then_block(if_<wbr>stmt);<br>
+         continue_from_blk = nir_if_last_else_block(if_<wbr>stmt);<br>
+      }<br>
+<br>
+      b->cursor = nir_after_cf_node(if_node);<br>
+      if (i < loop->info->trip_count - 1) {<br>
+         list_for_each_entry(loop_term_<wbr>phi_info, ltpi,<br>
+                             &loop_term_phis, list_link) {<br>
+            /* Insert phi created in previous iteration */<br>
+            insert_phi_and_set_block_on_<wbr>uses(b, ltpi->phi);<br>
+            add_complex_unroll_phi_src(<wbr>ltpi->phi_src_def, ltpi->phi,<br>
+                                       remap_table, exit_from_blk);<br>
+<br>
+            /* Create phi to be fixed up by next iteration */<br>
+            nir_phi_instr *new_phi = create_complex_unroll_phi(ns, ltpi->phi);<br>
+            ltpi->phi = new_phi;<br>
+         }<br>
+      } else {<br>
+         list_for_each_entry(loop_term_<wbr>phi_info, ltpi,<br>
+                             &loop_term_phis, list_link) {<br>
+            /* Insert phi created in previous iteration */<br>
+            insert_phi_and_set_block_on_<wbr>uses(b, ltpi->phi);<br>
+            add_complex_unroll_phi_src(<wbr>ltpi->phi_src_def, ltpi->phi,<br>
+                                       remap_table, exit_from_blk);<br>
+         }<br>
+      }<br>
+<br>
+      /* Ready the remap table for the next iteration */<br>
+      update_remap_table(loop, remap_table, phi_remap);<br>
+<br>
+      /* Set the cursor to the last if in the loop body we just unrolled ready<br>
+       * for the next iteration.<br>
+       */<br>
+      b->cursor = nir_after_block(continue_from_<wbr>blk);<br>
+   }<br>
+<br>
+<br>
+   list_for_each_entry(loop_term_<wbr>phi_info, ltpi, &loop_term_phis, list_link) {<br>
+      /* Now that the remap table is updated add the second src to the<br>
+       * innermost phis.<br>
+       */<br>
+      nir_ssa_def *phi_src = ltpi->innermost_phi_src_def ?<br>
+         ltpi->innermost_phi_src_def : ltpi->phi_src_def;<br>
+<br>
+      assert(exec_list_length(&ltpi-<wbr>>phi->srcs) == 1);<br>
+<br>
+      add_complex_unroll_phi_src(<wbr>phi_src, ltpi->phi, remap_table,<br>
+                                 continue_from_blk);<br>
+<br>
+      /* Rewrite the uses of the old loop phis */<br>
+      nir_foreach_use_safe(use_src, &ltpi->old_loop_phi->dest.ssa) {<br>
+         nir_src new_src = nir_src_for_ssa(ltpi-><wbr>outermost_phi_def);<br>
+         nir_instr_rewrite_src(use_src-<wbr>>parent_instr, use_src, new_src);<br>
+      }<br>
+<br>
+      nir_foreach_if_use_safe(use_<wbr>src, &ltpi->old_loop_phi->dest.ssa) {<br>
+         nir_src new_src = nir_src_for_ssa(ltpi-><wbr>outermost_phi_def);<br>
+         nir_if_rewrite_condition(use_<wbr>src->parent_if, new_src);<br>
+      }<br>
+   }<br>
+<br>
+   /* The loop has been unrolled so remove it. */<br>
+   nir_cf_node_remove(&loop->cf_<wbr>node);<br>
+<br>
+   /* Delete the original loop body */<br>
+   nir_cf_delete(&loop_body);<br>
+<br>
+   ralloc_free(mem_ctx);<br>
+}<br>
+<br>
+static bool<br>
+process_loops(nir_cf_node *cf_node, nir_builder *b, bool *innermost_loop)<br>
+{<br>
+   bool progress = false;<br>
+   nir_loop *loop;<br>
+<br>
+   switch (cf_node->type) {<br>
+   case nir_cf_node_block:<br>
+      return progress;<br>
+   case nir_cf_node_if: {<br>
+      nir_if *if_stmt = nir_cf_node_as_if(cf_node);<br>
+      foreach_list_typed_safe(nir_<wbr>cf_node, nested_node, node, &if_stmt->then_list)<br>
+         progress |= process_loops(nested_node, b, innermost_loop);<br>
+      foreach_list_typed_safe(nir_<wbr>cf_node, nested_node, node, &if_stmt->else_list)<br>
+         progress |= process_loops(nested_node, b, innermost_loop);<br>
+      return progress;<br>
+   }<br>
+   case nir_cf_node_loop: {<br>
+      loop = nir_cf_node_as_loop(cf_node);<br>
+      foreach_list_typed_safe(nir_<wbr>cf_node, nested_node, node, &loop->body)<br>
+         progress |= process_loops(nested_node, b, innermost_loop);<br>
+      break;<br>
+   }<br>
+   default:<br>
+      unreachable("unknown cf node type");<br>
+   }<br>
+<br>
+   if (*innermost_loop) {<br>
+      nir_function *fn = nir_cf_node_get_function(&<wbr>loop->cf_node)->function;<br>
+<br>
+      /* Don't attempt to unroll outer loops or a second inner loop in<br>
+       * this pass wait until the next pass as we have altered the cf.<br>
+       */<br>
+      *innermost_loop = false;<br>
+<br>
+      if (loop->info->limiting_<wbr>terminator == NULL) {<br>
+         return progress;<br>
+      }<br>
+<br>
+      if (is_simple_loop(fn->shader, loop->info)) {<br>
+         simple_unroll(loop, b);<br>
+         progress = true;<br>
+      } else {<br>
+         /* Attempt to unroll loops with two terminators. */<br>
+         if (is_complex_loop(fn->shader, loop->info)) {<br>
+            bool first_terminator = true;<br>
+            list_for_each_entry(nir_loop_<wbr>terminator, terminator,<br>
+                                &loop->info->loop_terminator_<wbr>list,<br>
+                                loop_terminator_link) {<br>
+<br>
+               nir_cf_node *if_node = &terminator->nif->cf_node;<br></blockquote><div><br></div><div>Let's just store the nir_if here so that we don't have to convert it back later.  Also, caling it term_if would make things a bit easier to read.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+<br>
+               if (if_node == &loop->info->limiting_<wbr>terminator->nif->cf_node) {<br></blockquote><div><br></div><div>Why not just "if terminator == loop->info->limiting_terminator"?<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+                  first_terminator = false;<br></blockquote><div><br></div><div>Ok... This is very confusing and I think there's a bug.  You have this varibale here called "first_terminator" which is a 1-bit loop counter for this loop.  If the first terminator in the list is the limiting terminator, we set first_terminator to false and continue to the next terminator.  We then pass "!first_terminator" into complex_unroll as "limiting_term_second" and, if I'm doing my boolean logic correctly, it means the opposite of what the variable name implies.<br><br></div><div>If this is passing piglit then maybe we don't need limiting_term_second at all?  Either that or we need more tests. :-)<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+                  continue;<br>
+               }<br>
+<br>
+               /* If the first terminator has a trip count of zero just do a<br>
+                * simple unroll as the second terminator can never be reached.<br>
+                */<br>
+               if (loop->info->trip_count == 0 && first_terminator) {<br>
+                  simple_unroll(loop, b);<br>
+                  progress = true;<br>
+                  break;<br>
+               }<br>
+<br>
+               nir_if *if_stmt = nir_cf_node_as_if(if_node);<br>
+<br>
+               /* Determine which if-statement branch, if any, ends with a<br>
+                * break. Note that since predicted_num_loop_jumps == 1, it is<br>
+                * impossible for both branches to end with a break.<br>
+                */<br>
+               nir_block *last_then = nir_if_last_then_block(if_<wbr>stmt);<br>
+               if (ends_in_break(last_then)) {<br>
+                  complex_unroll(fn, loop, b, if_node, last_then, false,<br>
+                                 !first_terminator);<br>
+<br>
+                  progress = true;<br>
+                  break;<br>
+               } else {<br>
+                  nir_block *last_else = nir_if_last_else_block(if_<wbr>stmt);<br></blockquote><div><br></div><div>This would be more clear if you moved the cast up and made this an else if<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+                  if (ends_in_break(last_else)) {<br>
+                     complex_unroll(fn, loop, b, if_node, last_else, true,<br>
+                                    !first_terminator);<br>
+<br>
+                     progress = true;<br>
+                     break;<br>
+                  }<br>
+               }<br>
+            }<br>
+         }<br>
+      }<br>
+   }<br>
+<br>
+   return progress;<br>
+}<br>
+<br>
+static bool<br>
+nir_opt_loop_unroll_impl(nir_<wbr>function_impl *impl,<br>
+                         nir_variable_mode indirect_mask)<br>
+{<br>
+   bool progress = false;<br>
+   nir_metadata_require(impl, nir_metadata_loop_analysis, indirect_mask);<br>
+   nir_metadata_require(impl, nir_metadata_block_index);<br>
+<br>
+   nir_builder b;<br>
+   nir_builder_init(&b, impl);<br>
+<br>
+   foreach_list_typed_safe(nir_<wbr>cf_node, node, node, &impl->body) {<br>
+      bool innermost_loop = true;<br>
+      progress |= process_loops(node, &b, &innermost_loop);<br>
+   }<br>
+<br>
+   return progress;<br>
+}<br>
+<br>
+bool<br>
+nir_opt_loop_unroll(nir_<wbr>shader *shader, nir_variable_mode indirect_mask)<br>
+{<br>
+   bool progress = false;<br>
+<br>
+   nir_foreach_function(function, shader) {<br>
+      if (function->impl) {<br>
+         progress |= nir_opt_loop_unroll_impl(<wbr>function->impl, indirect_mask);<br>
+      }<br>
+   }<br>
+   return false;<br>
+}<br>
<span class="HOEnZb"><font color="#888888">--<br>
2.7.4<br>
<br>
______________________________<wbr>_________________<br>
mesa-dev mailing list<br>
<a href="mailto:mesa-dev@lists.freedesktop.org">mesa-dev@lists.freedesktop.org</a><br>
<a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.org/<wbr>mailman/listinfo/mesa-dev</a><br>
</font></span></blockquote></div><br></div></div>