<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Jan 4, 2015 at 1:01 PM, Connor Abbott <span dir="ltr"><<a href="mailto:cwabbott0@gmail.com" target="_blank">cwabbott0@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Ok, I'm going to try reviewing this again. I'm pasting the latest version of the file from review/nir-v1 and replying to that so that I won't get confused between all the various changes and reorganizing things. Here we go!</div><div><br></div><div>> /*</div><span class=""><div>> * Copyright © 2014 Intel Corporation</div></span><div>> *</div><span class=""><div>> * Permission is hereby granted, free of charge, to any person obtaining a</div></span><span class=""><div>> * copy of this software and associated documentation files (the "Software"),</div></span><span class=""><div>> * to deal in the Software without restriction, including without limitation</div></span><span class=""><div>> * the rights to use, copy, modify, merge, publish, distribute, sublicense,</div></span><span class=""><div>> * and/or sell copies of the Software, and to permit persons to whom the</div></span><span class=""><div>> * Software is furnished to do so, subject to the following conditions:</div></span><div>> *</div><span class=""><div>> * The above copyright notice and this permission notice (including the next</div></span><span class=""><div>> * paragraph) shall be included in all copies or substantial portions of the</div></span><div>> * Software.</div><div>> *</div><span class=""><div>> * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR</div></span><span class=""><div>> * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,</div></span><span class=""><div>> * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL</div></span><span class=""><div>> * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER</div></span><span class=""><div>> * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING</div></span><span class=""><div>> * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS</div></span><div>> * IN THE SOFTWARE.</div><div>> *</div><div>> * Authors:</div><div>> * Jason Ekstrand (<a href="mailto:jason@jlekstrand.net" target="_blank">jason@jlekstrand.net</a>)</div><div>> *</div><div>> */</div><div>> </div><div>> #include "nir.h"</div><div>> </div><div>> struct deref_node {</div><div>> struct deref_node *parent;</div><div>> const struct glsl_type *type;</div><div>> </div><div>> bool lower_to_ssa;</div><div>> </div><div>> struct set *loads;</div><div>> struct set *stores;</div><div>> struct set *copies;</div><div>> </div><div>> nir_ssa_def **def_stack;</div><div>> nir_ssa_def **def_stack_tail;</div><div>> </div><div>> struct deref_node *wildcard;</div><div>> struct deref_node *indirect;</div><div>> struct deref_node *children[0];</div><div>> };</div><div>> </div><div>> struct lower_variables_state {</div><div>> void *mem_ctx;</div><div>> void *dead_ctx;</div><div>> nir_function_impl *impl;</div><span class=""><div>> </div><div>> /* A hash table mapping variables to deref_node data */</div></span><div>> struct hash_table *deref_var_nodes;</div><div>> </div><div>> /* A hash table mapping dereference leaves to deref_node data. A deref</div><div>> * is considered a leaf if it is fully-qualified (no wildcards) and</div><div>> * direct. In short, these are the derefs we can actually consider</div><div>> * lowering to SSA values.</div><div>> */</div><div><br></div><div>I was mislead by the "leaf" name the first time I reviewed this; having a comment explaining what it does helps, but I still think that it's a pretty misleading name. "Leaf," at least to me, implies that it's a leaf of the dereference tree, which in this case isn't true unless I'm missing something here. I think "direct" is the right term here. You already say that dereference leaves are "fully qualified (no wild cards) and direct" -- I would just call something potentially with wildcards + direct references "not indirect," and then what you're calling "leaf" becomes just "direct."</div></div></blockquote><div><br></div><div>Yes, "leaf" is a crappy name, but I'm not sure I like the distinction between "direct" and "not indirect" either. A double-negative should be a positive. I'm all for a better name, I just haven't found one.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Also, it was only after reading through the entire thing and thinking about it that I realized why we have this hash table -- it's because we can only figure out if direct dereferences alias indirect dereferences, so we only consider lowering them. Of course, we may have to expand some wildcard copies since they're copying the direct dereference we're trying to lower to SSA, but otherwise we don't care about them. Idk if you want to explain this here.</div></div></blockquote><div><br></div><div>Yes, a comment about that would probably be a good plan.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><br></div><div>> struct hash_table *deref_leaves;</div><span class=""><div>> </div><div>> /* A hash table mapping phi nodes to deref_state data */</div></span><div>> struct hash_table *phi_table;</div><div>> };</div><span class=""><div>> </div><div>> /* The following two functions implement a hash and equality check for</div></span><span class=""><div>> * variable dreferences. When the hash or equality function encounters an</div></span><span class=""><div>> * array, all indirects are treated as equal and are never equal to a</div></span><span class=""><div>> * direct dereference or a wildcard.</div></span><div>> *</div><div>> * Some of the magic numbers here were taken from _mesa_hash_data and one</div><div>> * was just a big prime I found on the internet.</div><div>> */</div><div>> static uint32_t</div><div>> hash_deref(const void *void_deref)</div><div>> {</div><span class=""><div>> const nir_deref *deref = void_deref;</div><div>> </div></span><div>> uint32_t hash;</div><div>> if (deref->child) {</div><div>> hash = hash_deref(deref->child);</div><div>> } else {</div><div>> hash = 2166136261ul;</div><div>> }</div><div>> </div><div>> switch (deref->deref_type) {</div><div>> case nir_deref_type_var:</div><div>> hash ^= _mesa_hash_pointer(nir_deref_as_var(deref)->var);</div><div>> break;</div><div>> case nir_deref_type_array: {</div><div>> nir_deref_array *array = nir_deref_as_array(deref);</div><span class=""><div>> hash += 268435183 * array->deref_array_type;</div></span><span class=""><div>> if (array->deref_array_type == nir_deref_array_type_direct)</div></span><span class=""><div>> hash ^= array->base_offset; /* Some prime */</div></span><div>> break;</div><div>> }</div><div>> case nir_deref_type_struct:</div><div>> hash ^= nir_deref_as_struct(deref)->index;</div><div>> break;</div><div>> }</div><div>> </div><div>> return hash * 0x01000193;</div><div>> }</div><div><br></div><div>I know I complained about this already but... could we just use _mesa_hash_data directly here, or at least properly implement FNV-1a here? It should be the same or less code either way (FNV-1a is really, really easy), and we can use something with a (presumably) solid mathematical basis that's been proven in practice. To me, that sounds better than multiplying by random primes found on the Internet and praying that we get a good distribution.</div></div></blockquote><div><br></div><div>Meh... I'm not convinced it's hurting us, but I also don't care. I'll add the stuff to hash_table.h<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Another orthogonal thing which I may have also mentioned before, is that I think we should be using a for loop here. C doesn't have guaranteed optimized tail recursion, so when people (including me) see a function that uses recursion, we assume it's because it's doing something that couldn't be done using a simple loop, which isn't true here. I don't think recursion makes it more readable, since walking linked lists using a loop is already a pretty common thing for us to do, and that's exactly what we're doing here. So IMHO, a while loop is more readable here. In addition, a number of things that I wrote to traverse dereference lists are using a loop already, and so does get_deref_node() below.</div></div></blockquote><div><br></div><div>That should be easy enough too.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><br></div><div>> </div><div>> static bool</div><span class=""><div>> derefs_equal(const void *void_a, const void *void_b)</div></span><div>> {</div><span class=""><div>> const nir_deref *a = void_a;</div></span><span class=""><div>> const nir_deref *b = void_b;</div><div>> </div></span><span class=""><div>> if (a->deref_type != b->deref_type)</div></span><div>> return false;</div><div>> </div><div>> switch (a->deref_type) {</div><div>> case nir_deref_type_var:</div><span class=""><div>> if (nir_deref_as_var(a)->var != nir_deref_as_var(b)->var)</div></span><div>> return false;</div><div>> break;</div><div>> case nir_deref_type_array: {</div><div>> nir_deref_array *a_arr = nir_deref_as_array(a);</div><div>> nir_deref_array *b_arr = nir_deref_as_array(b);</div><span class=""><div>> </div><div>> if (a_arr->deref_array_type != b_arr->deref_array_type)</div></span><div>> return false;</div><span class=""><div>> </div><div>> if (a_arr->deref_array_type == nir_deref_array_type_direct &&</div></span><span class=""><div>> a_arr->base_offset != b_arr->base_offset)</div></span><div>> return false;</div><div>> break;</div><div>> }</div><div>> case nir_deref_type_struct:</div><span class=""><div>> if (nir_deref_as_struct(a)->index != nir_deref_as_struct(b)->index)</div></span><div>> return false;</div><div>> break;</div><div>> default:</div><span class=""><div>> unreachable("Invalid dreference type");</div><div>> }</div><div>> </div></span><span class=""><div>> assert((a->child == NULL) == (b->child == NULL));</div></span><div>> if (a->child)</div><span class=""><div>> return derefs_equal(a->child, b->child);</div></span><div>> else</div><div>> return true;</div><div>> }</div><div><br></div><div>Same comment about using a loop here.</div><div><br></div><div>> </div><div>> static int</div><div>> type_get_length(const struct glsl_type *type)</div><div>> {</div><div>> switch (glsl_get_base_type(type)) {</div><div>> case GLSL_TYPE_STRUCT:</div><div>> case GLSL_TYPE_ARRAY:</div><div>> return glsl_get_length(type);</div><div>> case GLSL_TYPE_FLOAT:</div><div>> case GLSL_TYPE_INT:</div><div>> case GLSL_TYPE_UINT:</div><div>> case GLSL_TYPE_BOOL:</div><div>> if (glsl_type_is_matrix(type))</div><div>> return glsl_get_matrix_columns(type);</div><div>> else</div><div>> return glsl_get_vector_elements(type);</div><div>> default:</div><span class=""><div>> unreachable("Invalid deref base type");</div><div>> }</div><div>> }</div><div>> </div></span><div>> static struct deref_node *</div><div>> deref_node_create(struct deref_node *parent,</div><span class=""><div>> const struct glsl_type *type, void *mem_ctx)</div></span><div>> {</div><span class=""><div>> size_t size = sizeof(struct deref_node) +</div></span><span class=""><div>> type_get_length(type) * sizeof(struct deref_node *);</div><div>> </div></span><span class=""><div>> struct deref_node *node = rzalloc_size(mem_ctx, size);</div></span><div>> node->type = type;</div><div>> node->parent = parent;</div><div>> </div><div>> return node;</div><div>> }</div><div>> </div><div>> /* Gets the deref_node for the given deref chain and creates it if it</div><div>> * doesn't yet exist. If the deref is a leaf (fully-qualified and direct)</div><div><br></div><div>(a leaf -> direct)</div><div><br></div><div>> * and add_to_leaves is true, it will be added to the hash table of leaves.</div><div>> */</div><div>> static struct deref_node *</div><div>> get_deref_node(nir_deref_var *deref, bool add_to_leaves,</div><div>> struct lower_variables_state *state)</div><div>> {</div><div>> bool is_leaf = true;</div><span class=""><div>> struct deref_node *parent = NULL;</div><div><br></div></span><div>If it were me, I'd call this node_parent to emphasize that it's a node and not a dereference. I got confused and had to look at the definition again to realize that.</div><span class=""><div><br></div><div>> nir_deref *tail = &deref->deref;</div></span><div>> while (tail) {</div><div><br></div><div>Maybe use a for loop instead? Something like</div><div><br></div><div>for (nir_deref *tail = &deref->deref; tail; tail = tail->child)</div><div><br></div><div>to be more up-front that we're doing.</div><div><br></div><div>> struct deref_node *node;</div><div>> </div><div>> switch (tail->deref_type) {</div><div>> case nir_deref_type_var: {</div><div>> assert(tail == &deref->deref);</div><div>> assert(parent == NULL);</div><span class=""><div>> </div><div>> uint32_t var_hash = _mesa_hash_pointer(deref->var);</div></span><div>> struct hash_entry *entry =</div><div>> _mesa_hash_table_search(state->deref_var_nodes,</div><div>> var_hash, deref->var);</div><div>> if (entry) {</div><div>> node = entry->data;</div><div>> } else {</div><span class=""><div>> node = deref_node_create(NULL, tail->type, state->dead_ctx);</div></span><div>> _mesa_hash_table_insert(state->deref_var_nodes,</div><div>> var_hash, deref->var, node);</div><div>> }</div><div>> break;</div><div>> }</div><div><br></div><div>Since this case is guaranteed to happen only once at the beginning of the chain, it might make more sense to move it above the loop to make that more explicit rather than having the asserts. This will also make the other parent != NULL asserts unnecessary since parent will never be set to NULL.</div></div></blockquote><div><br></div><div>Yeah, I'll see what I can do about reworking this. I kind of like my recursion, but loops are ok too.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><br></div><div>> </div><div>> case nir_deref_type_struct: {</div><div>> assert(parent != NULL);</div><div>> </div><div>> nir_deref_struct *deref_struct = nir_deref_as_struct(tail);</div><div>> assert(deref_struct->index < type_get_length(parent->type));</div><span class=""><div>> if (parent->children[deref_struct->index]) {</div></span><span class=""><div>> node = parent->children[deref_struct->index];</div></span><div>> } else {</div><span class=""><div>> node = deref_node_create(parent, tail->type, state->dead_ctx);</div></span><span class=""><div>> parent->children[deref_struct->index] = node;</div><div>> }</div></span><div>> break;</div><div>> }</div><div>> </div><div>> case nir_deref_type_array: {</div><div>> assert(parent != NULL);</div><div>> </div><div>> nir_deref_array *arr = nir_deref_as_array(tail);</div><div>> switch (arr->deref_array_type) {</div><div>> case nir_deref_array_type_direct:</div><span class=""><div>> if (arr->base_offset >= type_get_length(parent->type)) {</div></span><span class=""><div>> /* This is possible if a loop unrolls and generates an</div></span><span class=""><div>> * out-of-bounds offset. We need to handle this at least</div></span><div>> * somewhat gracefully.</div><div>> */</div><div>> return NULL;</div><span class=""><div>> } else if (parent->children[arr->base_offset]) {</div></span><span class=""><div>> node = parent->children[arr->base_offset];</div></span><div>> } else {</div><span class=""><div>> node = deref_node_create(parent, tail->type, state->dead_ctx);</div></span><span class=""><div>> parent->children[arr->base_offset] = node;</div><div>> }</div></span><div>> break;</div><div>> case nir_deref_array_type_indirect:</div><div>> if (parent->indirect) {</div><div>> node = parent->indirect;</div><div>> } else {</div><span class=""><div>> node = deref_node_create(parent, tail->type, state->dead_ctx);</div></span><div>> parent->indirect = node;</div><div>> }</div><div>> is_leaf = false;</div><div>> break;</div><div>> case nir_deref_array_type_wildcard:</div><div>> if (parent->wildcard) {</div><div>> node = parent->wildcard;</div><div>> } else {</div><span class=""><div>> node = deref_node_create(parent, tail->type, state->dead_ctx);</div></span><div>> parent->wildcard = node;</div><div>> }</div><div>> is_leaf = false;</div><div>> break;</div><div>> default:</div><span class=""><div>> unreachable("Invalid array deref type");</div><div>> }</div></span><div>> break;</div><div>> }</div><div>> default:</div><span class=""><div>> unreachable("Invalid deref type");</div><div>> }</div><div>> </div></span><div>> parent = node;</div><div>> tail = tail->child;</div><div>> }</div><div>> </div><div>> assert(parent);</div><div>> </div><div>> if (is_leaf && add_to_leaves)</div><div>> _mesa_hash_table_insert(state->deref_leaves,</div><div>> hash_deref(deref), deref, parent);</div><div>> </div><div>> return parent;</div><div>> }</div><div>> </div><div>> /* \sa foreach_deref_node_match */</div><div>> static bool</div><div>> foreach_deref_node_worker(struct deref_node *node, nir_deref *deref,</div><span class=""><div>> bool (* cb)(struct deref_node *node,</div></span><div>> struct lower_variables_state *state),</div><div>> struct lower_variables_state *state)</div><div>> {</div><span class=""><div>> if (deref->child == NULL) {</div></span><div>> return cb(node, state);</div><div>> } else {</div><div>> switch (deref->child->deref_type) {</div><div>> case nir_deref_type_array: {</div><span class=""><div>> nir_deref_array *arr = nir_deref_as_array(deref->child);</div></span><div>> assert(arr->deref_array_type == nir_deref_array_type_direct);</div><span class=""><div>> if (node->children[arr->base_offset] &&</div></span><span class=""><div>> !foreach_deref_node_worker(node->children[arr->base_offset],</div></span><div>> deref->child, cb, state))</div><div>> return false;</div><div>> </div><div>> if (node->wildcard &&</div><div>> !foreach_deref_node_worker(node->wildcard,</div><div>> deref->child, cb, state))</div><div>> return false;</div><div>> </div><div>> return true;</div><div>> }</div><div>> </div><div>> case nir_deref_type_struct: {</div><span class=""><div>> nir_deref_struct *str = nir_deref_as_struct(deref->child);</div></span><span class=""><div>> return foreach_deref_node_worker(node->children[str->index],</div></span><div>> deref->child, cb, state);</div><div>> }</div><div>> </div><div>> default:</div><span class=""><div>> unreachable("Invalid deref child type");</div><div>> }</div><div>> }</div><div>> }</div><div>> </div></span><div>> /* Walks over every "matching" deref_node and calls the callback. A node</div><div>> * is considered to "match" if either refers to that deref or matches up t</div><div>> * a wildcard. In other words, the following would match a[6].foo[3].bar:</div><div>> *</div><div>> * a[6].foo[3].bar</div><div>> * a[*].foo[3].bar</div><div>> * a[6].foo[*].bar</div><div>> * a[*].foo[*].bar</div><div>> *</div><div>> * The given deref must be a full-length and fully qualified (no wildcards</div><div>> * or indirexcts) deref chain.</div><div><br></div><div>indirects</div><div><br></div><div>> */</div><div>> static bool</div><div>> foreach_deref_node_match(nir_deref_var *deref,</div><span class=""><div>> bool (* cb)(struct deref_node *node,</div></span><div>> struct lower_variables_state *state),</div><div>> struct lower_variables_state *state)</div><div>> {</div><div>> nir_deref_var var_deref = *deref;</div><div>> var_deref.deref.child = NULL;</div><span class=""><div>> struct deref_node *node = get_deref_node(&var_deref, false, state);</div><div>> </div></span><div>> if (node == NULL)</div><div>> return false;</div><span class=""><div>> </div><div>> return foreach_deref_node_worker(node, &deref->deref, cb, state);</div><div>> }</div><div>> </div></span><div>> /* \sa deref_may_be_aliased */</div><div>> static bool</div><div>> deref_may_be_aliased_node(struct deref_node *node, nir_deref *deref,</div><div>> struct lower_variables_state *state)</div><div>> {</div><span class=""><div>> if (deref->child == NULL) {</div></span><div>> return false;</div><div>> } else {</div><div>> switch (deref->child->deref_type) {</div><div>> case nir_deref_type_array: {</div><span class=""><div>> nir_deref_array *arr = nir_deref_as_array(deref->child);</div></span><span class=""><div>> if (arr->deref_array_type == nir_deref_array_type_indirect)</div></span><div>> return true;</div><div>> </div><div>> assert(arr->deref_array_type == nir_deref_array_type_direct);</div><span class=""><div>> </div><div>> if (node->children[arr->base_offset] &&</div></span><span class=""><div>> deref_may_be_aliased_node(node->children[arr->base_offset],</div></span><div>> deref->child, state))</div><div>> return true;</div><div>> </div><div>> if (node->wildcard &&</div><span class=""><div>> deref_may_be_aliased_node(node->wildcard, deref->child, state))</div></span><div>> return true;</div><div>> </div><div>> return false;</div><div>> }</div><div>> </div><div>> case nir_deref_type_struct: {</div><span class=""><div>> nir_deref_struct *str = nir_deref_as_struct(deref->child);</div></span><span class=""><div>> if (node->children[str->index]) {</div></span><span class=""><div>> return deref_may_be_aliased_node(node->children[str->index],</div></span><div>> deref->child, state);</div><div>> } else {</div><div>> return false;</div><div>> }</div><div>> }</div><div>> </div><div>> default:</div><span class=""><div>> unreachable("Invalid nir_deref child type");</div><div>> }</div><div>> }</div><div>> }</div><div>> </div></span><div>> /* Returns true if there are no indirects that can ever touch this deref.</div><div>> * This question can only be asked about fully-qualified derefs.</div><div>> * Obviously, it's pointless to ask this about indirects, but we also</div><div>> * rule-out wildcards. For example, if the given deref is a[6].foo, then</div><div><br></div><div>I think I'd like to see this spelled out a little more. Maybe something like:</div><div><br></div><div>We don't handle wildcard dereferences, since that would involve checking each array index to make sure that there aren't any indirect references.</div><div><br></div><div>I couldn't think of anything that would allow us to figure out if dereferences with wildcards aliased any indirect dereferences without having to check every array index (which seems like a bad idea). I guess it's not clear if we should do that anyways, since it might help cases like these:</div><div><br></div><div>temp[*].thing = foo[*].thing;</div><div>bar[*].thing = temp[*].thing;</div><div><br></div><div>since after lowering temp[*].thing to SSA we can recognize the copy and turn it into:</div><div>bar[*].thing = foo[*].thing</div><div><br></div><div>but for more complicated cases like:</div><div>if (...) {</div><div> temp[*].thing = foo[*].thing;</div><div>} else {</div><div> temp[*].thing = bar[*].thing;</div><div>}</div><div>baz[*].thing = temp[*].thing;</div><div><br></div><div>it'll just kill our register pressure if temp is large enough... and this is all very theoretical anyways.</div><div><br></div><div><br></div><div>> * any uses of a[i].foo would case this to return false, but a[i].bar would</div><div>> * not affect it because it's a different structure member. A var_copy</div><div>> * involving of a[*].bar also doesn't affect it because that can be lowered</div><div>> * to entirely direct load/stores.</div><div>> */</div><div>> static bool</div><div>> deref_may_be_aliased(nir_deref_var *deref,</div><div>> struct lower_variables_state *state)</div><div>> {</div><div>> nir_deref_var var_deref = *deref;</div><div>> var_deref.deref.child = NULL;</div><span class=""><div>> struct deref_node *node = get_deref_node(&var_deref, false, state);</div><div><br></div></span><div>We're only looking for the root node corresponding to the variable, and we've already populated the deref_var_nodes table, so we can just look it up here.</div><span class=""><div><br></div><div>> </div><div>> /* An invalid dereference can't be aliased. */</div></span><div>> if (node == NULL)</div><div>> return false;</div><span class=""><div>> </div><div>> return deref_may_be_aliased_node(node, &deref->deref, state);</div><div>> }</div><div>> </div></span><div>> static void</div><div>> register_load_instr(nir_intrinsic_instr *load_instr, bool create_node,</div><div>> struct lower_variables_state *state)</div><div>> {</div><span class=""><div>> struct deref_node *node = get_deref_node(load_instr->variables[0],</div></span><div>> create_node, state);</div><div>> if (node == NULL)</div><div>> return;</div><span class=""><div>> </div><div>> if (node->loads == NULL)</div></span><span class=""><div>> node->loads = _mesa_set_create(state->dead_ctx,</div></span><div>> _mesa_key_pointer_equal);</div><div>> </div><div>> _mesa_set_add(node->loads, _mesa_hash_pointer(load_instr), load_instr);</div><div>> }</div><div>> </div><div>> static void</div><div>> register_store_instr(nir_intrinsic_instr *store_instr, bool create_node,</div><div>> struct lower_variables_state *state)</div><div>> {</div><span class=""><div>> struct deref_node *node = get_deref_node(store_instr->variables[0],</div></span><div>> create_node, state);</div><div>> if (node == NULL)</div><div>> return;</div><span class=""><div>> </div><div>> if (node->stores == NULL)</div></span><span class=""><div>> node->stores = _mesa_set_create(state->dead_ctx,</div></span><div>> _mesa_key_pointer_equal);</div><div>> </div><div>> _mesa_set_add(node->stores, _mesa_hash_pointer(store_instr), store_instr);</div><div>> }</div><div><br></div><div>I would call these "add_load_instr" and "add_store_instr" since "register" already has another meaning. I don't have a strong preference though.</div><div><br></div><div>> </div><div>> static void</div><div>> register_copy_instr(nir_intrinsic_instr *copy_instr, bool create_node,</div><div>> struct lower_variables_state *state)</div><div><br></div><div>Same here...</div><div><br></div><div>> {</div><span class=""><div>> for (unsigned idx = 0; idx < 2; idx++) {</div></span><span class=""><div>> struct deref_node *node = get_deref_node(copy_instr->variables[idx],</div></span><div>> create_node, state);</div><div>> if (node == NULL)</div><div>> continue;</div><span class=""><div>> </div><div>> if (node->copies == NULL)</div></span><span class=""><div>> node->copies = _mesa_set_create(state->dead_ctx,</div></span><div>> _mesa_key_pointer_equal);</div><div>> </div><div>> _mesa_set_add(node->copies, _mesa_hash_pointer(copy_instr), copy_instr);</div><div>> }</div><div>> }</div><div>> </div><div>> /* Registers all variable uses in the given block. */</div><div>> static bool</div><div>> register_variable_uses_block(nir_block *block, void *void_state)</div><div><br></div><div>and here.</div><div><br></div><div>> {</div><span class=""><div>> struct lower_variables_state *state = void_state;</div><div>> </div></span><div>> nir_foreach_instr_safe(block, instr) {</div><span class=""><div>> if (instr->type != nir_instr_type_intrinsic)</div></span><div>> continue;</div><div>> </div><div>> nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);</div><div>> </div><div>> switch (intrin->intrinsic) {</div><div>> case nir_intrinsic_load_var:</div><div>> register_load_instr(intrin, true, state);</div><div>> break;</div><div>> </div><div>> case nir_intrinsic_store_var:</div><div>> register_store_instr(intrin, true, state);</div><div>> break;</div><div>> </div><div>> case nir_intrinsic_copy_var:</div><div>> register_copy_instr(intrin, true, state);</div><div>> break;</div><div>> </div><div>> default:</div><div>> continue;</div><div>> }</div><div>> }</div><div>> </div><div>> return true;</div><div>> }</div><div>> </div><div>> /* Walks down the deref chain and returns the next deref in the chain whose</div><div>> * child is a wildcard. In other words, given the chain a[1].foo[*].bar,</div><div>> * this function will return the deref to foo. Calling it a second time</div><div>> * with the [*].bar, it will return NULL.</div><div>> */</div><div>> static nir_deref *</div><div>> deref_next_wildcard_parent(nir_deref *deref)</div><div>> {</div><span class=""><div>> for (nir_deref *tail = deref; tail->child; tail = tail->child) {</div></span><span class=""><div>> if (tail->child->deref_type != nir_deref_type_array)</div></span><div>> continue;</div><span class=""><div>> </div><div>> nir_deref_array *arr = nir_deref_as_array(tail->child);</div><div>> </div></span><span class=""><div>> if (arr->deref_array_type == nir_deref_array_type_wildcard)</div></span><div>> return tail;</div><div>> }</div><div>> </div><div>> return NULL;</div><div>> }</div><div>> </div><div>> /* Returns the last deref in the chain.</div><div>> */</div><div>> static nir_deref *</div><div>> get_deref_tail(nir_deref *deref)</div><div>> {</div><div>> while (deref->child)</div><div>> deref = deref->child;</div><div>> </div><div>> return deref;</div><div>> }</div><div>> </div><div>> /* This function recursively walks the given deref chain and replaces the</div><div>> * given copy instruction with an equivalent sequence load/store</div><div>> * operations.</div><div>> *</div><div>> * @copy_instr The copy instruction to replace; new instructions will be</div><div>> * inserted before this one</div><div>> *</div><div>> * @dest_head The head of the destination variable deref chain</div><div>> *</div><div>> * @src_head The head of the source variable deref chain</div><div>> *</div><div>> * @dest_tail The current tail of the destination variable deref chain;</div><div>> * this is used for recursion and external callers of this</div><div>> * function should call it with tail == head</div><div>> *</div><div>> * @src_tail The current tail of the source variable deref chain;</div><div>> * this is used for recursion and external callers of this</div><div>> * function should call it with tail == head</div><div>> *</div><div>> * @state The current variable lowering state</div><div>> */</div><div>> static void</div><div>> emit_copy_load_store(nir_intrinsic_instr *copy_instr,</div><div>> nir_deref_var *dest_head, nir_deref_var *src_head,</div><div>> nir_deref *dest_tail, nir_deref *src_tail,</div><div>> struct lower_variables_state *state)</div><div>> {</div><div>> /* Find the next pair of wildcards */</div><div>> nir_deref *src_arr_parent = deref_next_wildcard_parent(src_tail);</div><div>> nir_deref *dest_arr_parent = deref_next_wildcard_parent(dest_tail);</div><div>> </div><div>> if (src_arr_parent || dest_arr_parent) {</div><div>> /* Wildcards had better come in matched pairs */</div><div>> assert(dest_arr_parent && dest_arr_parent);</div><span class=""><div>> </div><div>> nir_deref_array *src_arr = nir_deref_as_array(src_arr_parent->child);</div></span><span class=""><div>> nir_deref_array *dest_arr = nir_deref_as_array(dest_arr_parent->child);</div><div>> </div></span><span class=""><div>> unsigned length = type_get_length(src_arr_parent->type);</div></span><div>> /* The wildcards should represent the same number of elements */</div><div>> assert(length == type_get_length(dest_arr_parent->type));</div><div>> assert(length > 0);</div><div>> </div><div>> /* Walk over all of the elements that this wildcard refers to and</div><div>> * call emit_copy_load_store on each one of them */</div><div>> src_arr->deref_array_type = nir_deref_array_type_direct;</div><div>> dest_arr->deref_array_type = nir_deref_array_type_direct;</div><span class=""><div>> for (unsigned i = 0; i < length; i++) {</div></span><div>> src_arr->base_offset = i;</div><div>> dest_arr->base_offset = i;</div><div>> emit_copy_load_store(copy_instr, dest_head, src_head,</div><span class=""><div>> &dest_arr->deref, &src_arr->deref, state);</div><div>> }</div></span><div>> src_arr->deref_array_type = nir_deref_array_type_wildcard;</div><div>> dest_arr->deref_array_type = nir_deref_array_type_wildcard;</div><div>> } else {</div><div>> /* In this case, we have no wildcards anymore, so all we have to do</div><div>> * is just emit the load and store operations. */</div><div>> src_tail = get_deref_tail(src_tail);</div><div>> dest_tail = get_deref_tail(dest_tail);</div><span class=""><div>> </div><div>> assert(src_tail->type == dest_tail->type);</div><div>> </div></span><span class=""><div>> unsigned num_components = glsl_get_vector_elements(src_tail->type);</div><div>> </div></span><span class=""><div>> nir_deref *src_deref = nir_copy_deref(state->mem_ctx, &src_head->deref);</div></span><span class=""><div>> nir_deref *dest_deref = nir_copy_deref(state->mem_ctx, &dest_head->deref);</div><div>> </div></span><div>> nir_intrinsic_instr *load =</div><div>> nir_intrinsic_instr_create(state->mem_ctx, nir_intrinsic_load_var);</div><div>> load->num_components = num_components;</div><span class=""><div>> load->variables[0] = nir_deref_as_var(src_deref);</div></span><div>> load->dest.is_ssa = true;</div><span class=""><div>> nir_ssa_def_init(&load->instr, &load->dest.ssa, num_components, NULL);</div><div>> </div></span><div>> nir_instr_insert_before(©_instr->instr, &load->instr);</div><div>> register_load_instr(load, false, state);</div><div>> </div><div>> nir_intrinsic_instr *store =</div><div>> nir_intrinsic_instr_create(state->mem_ctx, nir_intrinsic_store_var);</div><div>> store->num_components = num_components;</div><span class=""><div>> store->variables[0] = nir_deref_as_var(dest_deref);</div></span><span class=""><div>> store->src[0].is_ssa = true;</div></span><span class=""><div>> store->src[0].ssa = &load->dest.ssa;</div><div>> </div></span><div>> nir_instr_insert_before(©_instr->instr, &store->instr);</div><div>> register_store_instr(store, false, state);</div><div>> }</div><div>> }</div><div>> </div><div>> /* Walks over all of the copy instructions to or from the given deref_node</div><div>> * and lowers them to load/store intrinsics.</div><div>> */</div><div>> static bool</div><div>> lower_copies_to_load_store(struct deref_node *node,</div><div>> struct lower_variables_state *state)</div><div>> {</div><div>> if (!node->copies)</div><div>> return true;</div><div>> </div><div>> struct set_entry *copy_entry;</div><div>> set_foreach(node->copies, copy_entry) {</div><span class=""><div>> nir_intrinsic_instr *copy = (void *)copy_entry->key;</div><div>> </div></span><span class=""><div>> emit_copy_load_store(copy, copy->variables[0], copy->variables[1],</div></span><span class=""><div>> ©->variables[0]->deref,</div></span><span class=""><div>> ©->variables[1]->deref,</div></span><div>> state);</div><span class=""><div>> </div><div>> for (unsigned i = 0; i < 2; ++i) {</div></span><span class=""><div>> struct deref_node *arg_node = get_deref_node(copy->variables[i],</div></span><div>> false, state);</div><div>> if (arg_node == NULL)</div><div>> continue;</div><span class=""><div>> </div><div>> struct set_entry *arg_entry = _mesa_set_search(arg_node->copies,</div></span><div>> copy_entry->hash,</div><div>> copy);</div><div>> assert(arg_entry);</div><div>> _mesa_set_remove(node->copies, arg_entry);</div><div>> }</div><div>> </div><div>> nir_instr_remove(©->instr);</div><div>> }</div><div>> </div><div>> return true;</div><div>> }</div><div>> </div><div>> /* Returns a load_const instruction that represents the constant</div><div>> * initializer for the given deref chain. The caller is responsible for</div><div>> * ensuring that there actually is a constant initializer.</div><div>> */</div><div>> static nir_load_const_instr *</div><div>> get_const_initializer_load(const nir_deref_var *deref,</div><div>> struct lower_variables_state *state)</div><div>> {</div><span class=""><div>> nir_constant *constant = deref->var->constant_initializer;</div></span><span class=""><div>> const nir_deref *tail = &deref->deref;</div></span><div>> unsigned matrix_offset = 0;</div><div>> while (tail->child) {</div><div>> switch (tail->child->deref_type) {</div><div>> case nir_deref_type_array: {</div><span class=""><div>> nir_deref_array *arr = nir_deref_as_array(tail->child);</div></span><div>> assert(arr->deref_array_type == nir_deref_array_type_direct);</div><div>> if (glsl_type_is_matrix(tail->type)) {</div><div>> assert(arr->deref.child == NULL);</div><div>> matrix_offset = arr->base_offset;</div><div>> } else {</div><span class=""><div>> constant = constant->elements[arr->base_offset];</div><div>> }</div></span><div>> break;</div><div>> }</div><div>> </div><div>> case nir_deref_type_struct: {</div><span class=""><div>> constant = constant->elements[nir_deref_as_struct(tail->child)->index];</div></span><div>> break;</div><div>> }</div><div>> </div><div>> default:</div><span class=""><div>> unreachable("Invalid deref child type");</div><div>> }</div><div>> </div></span><div>> tail = tail->child;</div><div>> }</div><div>> </div><div>> nir_load_const_instr *load =</div><div>> nir_load_const_instr_create(state->mem_ctx,</div><div>> glsl_get_vector_elements(tail->type));</div><div>> </div><div>> matrix_offset *= load->def.num_components;</div><div>> for (unsigned i = 0; i < load->def.num_components; i++) {</div><div>> switch (glsl_get_base_type(tail->type)) {</div><div>> case GLSL_TYPE_FLOAT:</div><div>> case GLSL_TYPE_INT:</div><div>> case GLSL_TYPE_UINT:</div><span class=""><div>> load->value.u[i] = constant->value.u[matrix_offset + i];</div></span><div>> break;</div><div>> case GLSL_TYPE_BOOL:</div><div>> load->value.u[i] = constant->value.b[matrix_offset + i] ?</div><div>> NIR_TRUE : NIR_FALSE;</div><div><br></div><div>I see you sneakily squashed in my fix :P</div><div><br></div><div>> break;</div><div>> default:</div><span class=""><div>> unreachable("Invalid immediate type");</div><div>> }</div><div>> }</div><div>> </div></span><div>> return load;</div><div>> }</div><div>> </div><div>> /** Pushes an SSA def onto the def stack for the given node</div><div>> *</div><div>> * Each node is potentially associated with a stack of SSA definitions.</div><div>> * This stack is used for determining what SSA definition reaches a given</div><div>> * point in the program for variable renaming. The stack is always kept in</div><div>> * dominance-order with at most one SSA def per block. If the SSA</div><div>> * definition on the top of the stack is in the same block as the one being</div><div>> * pushed, the top element is replaced.</div><div>> */</div><div>> static void</div><div>> def_stack_push(struct deref_node *node, nir_ssa_def *def,</div><div>> struct lower_variables_state *state)</div><div>> {</div><span class=""><div>> if (node->def_stack == NULL) {</div></span><span class=""><div>> node->def_stack = ralloc_array(state->dead_ctx, nir_ssa_def *,</div></span><div>> state->impl->num_blocks);</div><span class=""><div>> node->def_stack_tail = node->def_stack - 1;</div><div>> }</div><div>> </div></span><span class=""><div>> if (node->def_stack_tail >= node->def_stack) {</div></span><span class=""><div>> nir_ssa_def *top_def = *node->def_stack_tail;</div><div>> </div></span><span class=""><div>> if (def->parent_instr->block == top_def->parent_instr->block) {</div></span><span class=""><div>> /* They're in the same block, just replace the top */</div></span><div>> *node->def_stack_tail = def;</div><div>> return;</div><div>> }</div><div>> }</div><div>> </div><div>> *(++node->def_stack_tail) = def;</div><div>> }</div><div>> </div><div>> /* Pop the top of the def stack if it's in the given block */</div><div>> static void</div><div>> def_stack_pop_if_in_block(struct deref_node *node, nir_block *block)</div><div>> {</div><div>> /* If we're popping, then we have presumably pushed at some time in the</div><div>> * past so this should exist.</div><div>> */</div><div>> assert(node->def_stack != NULL);</div><div>> </div><div>> /* The stack is already empty. Do nothing. */</div><div>> if (node->def_stack_tail < node->def_stack)</div><div>> return;</div><span class=""><div>> </div><div>> nir_ssa_def *def = *node->def_stack_tail;</div></span><div>> if (def->parent_instr->block == block)</div><div>> node->def_stack_tail--;</div><div>> }</div><div>> </div><div>> /** Retrieves the SSA definition on the top of the stack for the given</div><div>> * node, if one exists. If the stack is empty, then we return the constant</div><div>> * initializer (if it exists) or an SSA undef.</div><div>> */</div><div>> static nir_ssa_def *</div><div>> get_ssa_def_for_block(struct deref_node *node, nir_block *block,</div><div>> struct lower_variables_state *state)</div><div>> {</div><div>> /* If we have something on the stack, go ahead and return it. We're</div><div>> * assuming that the top of the stack dominates the given block.</div><div>> */</div><div>> if (node->def_stack && node->def_stack_tail >= node->def_stack)</div><div>> return *node->def_stack_tail;</div><span class=""><div>> </div><div>> /* If we got here then we don't have a definition that dominates the</div></span><span class=""><div>> * given block. This means that we need to add an undef and use that.</div></span><div>> */</div><div>> nir_ssa_undef_instr *undef =</div><div>> nir_ssa_undef_instr_create(state->mem_ctx,</div><div>> glsl_get_vector_elements(node->type));</div><span class=""><div>> nir_instr_insert_before_cf_list(&state->impl->body, &undef->instr);</div></span><div>> def_stack_push(node, &undef->def, state);</div><div>> return &undef->def;</div><div>> }</div><div>> </div><div>> /* Given a block and one of its predecessors, this function fills in the</div><div>> * souces of the phi nodes to take SSA defs from the given predecessor.</div><div><br></div><div>sources</div><div><br></div><div>> * This function must be called exactly once per block/predecessor pair.</div><div>> */</div><div>> static void</div><div>> add_phi_sources(nir_block *block, nir_block *pred,</div><div>> struct lower_variables_state *state)</div><div>> {</div><div>> nir_foreach_instr(block, instr) {</div><span class=""><div>> if (instr->type != nir_instr_type_phi)</div></span><div>> break;</div><div>> </div><div>> nir_phi_instr *phi = nir_instr_as_phi(instr);</div><div>> </div><div>> struct hash_entry *entry =</div><div>> _mesa_hash_table_search(state->phi_table,</div><div>> _mesa_hash_pointer(phi), phi);</div><div>> if (!entry)</div><div>> continue;</div><span class=""><div>> </div><div>> struct deref_node *node = entry->data;</div><div>> </div></span><span class=""><div>> nir_phi_src *src = ralloc(state->mem_ctx, nir_phi_src);</div></span><div>> src->pred = pred;</div><div>> src->src.is_ssa = true;</div><span class=""><div>> src->src.ssa = get_ssa_def_for_block(node, pred, state);</div><div>> </div></span><span class=""><div>> _mesa_set_add(src->src.ssa->uses, _mesa_hash_pointer(instr), instr);</div><div>> </div></span><div>> exec_list_push_tail(&phi->srcs, &src->node);</div><div>> }</div><div>> }</div><div>> </div><div>> /* Performs variable renaming by doing a DFS of the dominance tree</div><div>> *</div><div>> * This algorithm is very similar to the one outlined in "Efficiently</div><div>> * Computing Static Single Assignment Form and the Control Dependence</div><div>> * Graph" by Cytron et. al. The primary difference is that we only put one</div><div>> * SSA def on the stack per block.</div><div>> */</div><div>> static bool</div><div>> rename_variables_block(nir_block *block, struct lower_variables_state *state)</div><div>> {</div><div>> nir_foreach_instr_safe(block, instr) {</div><span class=""><div>> if (instr->type == nir_instr_type_phi) {</div></span><div>> nir_phi_instr *phi = nir_instr_as_phi(instr);</div><div>> </div><div>> struct hash_entry *entry =</div><div>> _mesa_hash_table_search(state->phi_table,</div><div>> _mesa_hash_pointer(phi), phi);</div><span class=""><div>> </div><div>> /* This can happen if we already have phi nodes in the program</div></span><span class=""><div>> * that were not created in this pass.</div></span><div>> */</div><div>> if (!entry)</div><div>> continue;</div><span class=""><div>> </div><div>> struct deref_node *node = entry->data;</div><div>> </div></span><div>> def_stack_push(node, &phi->dest.ssa, state);</div><span class=""><div>> } else if (instr->type == nir_instr_type_intrinsic) {</div></span><div>> nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);</div><div>> </div><div>> switch (intrin->intrinsic) {</div><div>> case nir_intrinsic_load_var: {</div><span class=""><div>> struct deref_node *node = get_deref_node(intrin->variables[0],</div></span><div>> false, state);</div><div>> </div><div>> if (node == NULL) {</div><span class=""><div>> /* If we hit this path then we are referencing an invalid</div></span><span class=""><div>> * value. Most likely, we unrolled something and are</div></span><span class=""><div>> * reading past the end of some array. In any case, this</div></span><span class=""><div>> * should result in an undefined value.</div></span><div>> */</div><div>> nir_ssa_undef_instr *undef =</div><div>> nir_ssa_undef_instr_create(state->mem_ctx,</div><div>> intrin->num_components);</div><div>> </div><div>> nir_instr_insert_before(&intrin->instr, &undef->instr);</div><div>> nir_instr_remove(&intrin->instr);</div><div>> </div><div>> nir_src new_src = {</div><div>> .is_ssa = true,</div><div>> .ssa = &undef->def,</div><div>> };</div><div>> </div><div>> nir_ssa_def_rewrite_uses(&intrin->dest.ssa, new_src,</div><div>> state->mem_ctx);</div><div>> continue;</div><div>> }</div><div>> </div><div>> if (!node->lower_to_ssa)</div><div>> continue;</div><span class=""><div>> </div><div>> nir_alu_instr *mov = nir_alu_instr_create(state->mem_ctx,</div></span><div>> nir_op_imov);</div><span class=""><div>> mov->src[0].src.is_ssa = true;</div></span><span class=""><div>> mov->src[0].src.ssa = get_ssa_def_for_block(node, block, state);</div></span><div>> for (unsigned i = intrin->num_components; i < 4; i++)</div><span class=""><div>> mov->src[0].swizzle[i] = 0;</div><div>> </div></span><div>> assert(intrin->dest.is_ssa);</div><div>> </div><div>> mov->dest.write_mask = (1 << intrin->num_components) - 1;</div><div>> mov->dest.dest.is_ssa = true;</div><div>> nir_ssa_def_init(&mov->instr, &mov->dest.dest.ssa,</div><div>> intrin->num_components, NULL);</div><div>> </div><div>> nir_instr_insert_before(&intrin->instr, &mov->instr);</div><div>> nir_instr_remove(&intrin->instr);</div><div>> </div><div>> nir_src new_src = {</div><div>> .is_ssa = true,</div><div>> .ssa = &mov->dest.dest.ssa,</div><div>> };</div><div>> </div><div>> nir_ssa_def_rewrite_uses(&intrin->dest.ssa, new_src,</div><div>> state->mem_ctx);</div><div>> break;</div><div>> }</div><div>> </div><div>> case nir_intrinsic_store_var: {</div><span class=""><div>> struct deref_node *node = get_deref_node(intrin->variables[0],</div></span><div>> false, state);</div><div>> </div><div>> if (node == NULL) {</div><span class=""><div>> /* Probably an out-of-bounds array store. That should be a</div></span><div>> * no-op. */</div><div>> nir_instr_remove(&intrin->instr);</div><div>> continue;</div><div>> }</div><div>> </div><div>> if (!node->lower_to_ssa)</div><div>> continue;</div><div>> </div><div>> assert(intrin->num_components ==</div><div>> glsl_get_vector_elements(node->type));</div><div>> </div><div>> assert(intrin->src[0].is_ssa);</div><span class=""><div>> </div><div>> nir_alu_instr *mov = nir_alu_instr_create(state->mem_ctx,</div></span><div>> nir_op_imov);</div><span class=""><div>> mov->src[0].src.is_ssa = true;</div></span><span class=""><div>> mov->src[0].src.ssa = intrin->src[0].ssa;</div></span><div>> for (unsigned i = intrin->num_components; i < 4; i++)</div><span class=""><div>> mov->src[0].swizzle[i] = 0;</div><div>> </div></span><div>> mov->dest.write_mask = (1 << intrin->num_components) - 1;</div><div>> mov->dest.dest.is_ssa = true;</div><div>> nir_ssa_def_init(&mov->instr, &mov->dest.dest.ssa,</div><div>> intrin->num_components, NULL);</div><div>> </div><div>> nir_instr_insert_before(&intrin->instr, &mov->instr);</div><div>> </div><div>> def_stack_push(node, &mov->dest.dest.ssa, state);</div><div>> </div><div>> /* We'll wait to remove the unstruction until the next pass</div><div><br></div><div>instruction (although I'm kind of curious about what an unstruction would be...)</div></div></blockquote><div><br></div><div>lol<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><br></div><div>> * where we pop the node we just pushed back off the stack.</div><div>> */</div><div>> break;</div><div>> }</div><div>> </div><div>> default:</div><div>> break;</div><span class=""><div>> }</div><div>> }</div><div>> }</div><div>> </div><div>> if (block->successors[0])</div></span><span class=""><div>> add_phi_sources(block->successors[0], block, state);</div></span><span class=""><div>> if (block->successors[1])</div></span><span class=""><div>> add_phi_sources(block->successors[1], block, state);</div><div>> </div></span><div>> for (unsigned i = 0; i < block->num_dom_children; ++i)</div><div>> rename_variables_block(block->dom_children[i], state);</div><div>> </div><div>> /* Now we iterate over the instructions and pop off any SSA defs that we</div><div>> * pushed in the first loop.</div><div>> */</div><div>> nir_foreach_instr_safe(block, instr) {</div><span class=""><div>> if (instr->type == nir_instr_type_phi) {</div></span><div>> nir_phi_instr *phi = nir_instr_as_phi(instr);</div><div>> </div><div>> struct hash_entry *entry =</div><div>> _mesa_hash_table_search(state->phi_table,</div><div>> _mesa_hash_pointer(phi), phi);</div><span class=""><div>> </div><div>> /* This can happen if we already have phi nodes in the program</div></span><span class=""><div>> * that were not created in this pass.</div></span><div>> */</div><div>> if (!entry)</div><div>> continue;</div><span class=""><div>> </div><div>> struct deref_node *node = entry->data;</div><div>> </div></span><div>> def_stack_pop_if_in_block(node, block);</div><span class=""><div>> } else if (instr->type == nir_instr_type_intrinsic) {</div></span><div>> nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);</div><div>> </div><div>> if (intrin->intrinsic != nir_intrinsic_store_var)</div><div>> continue;</div><span class=""><div>> </div><div>> struct deref_node *node = get_deref_node(intrin->variables[0],</div></span><div>> false, state);</div><div>> if (!node)</div><div>> continue;</div><div>> </div><div>> if (!node->lower_to_ssa)</div><div>> continue;</div><div>> </div><div>> def_stack_pop_if_in_block(node, block);</div><div>> nir_instr_remove(&intrin->instr);</div><div>> }</div><div>> }</div><div>> </div><div>> return true;</div><div>> }</div><div>> </div><div>> /* Inserts phi nodes for all variables marked lower_to_ssa</div><div>> *</div><div>> * This is the same algorithm as presented in "Efficiently Computing Static</div><div>> * Single Assignment Form and the Control Dependence Graph" by Cytron et.</div><div>> * al.</div><div>> */</div><div>> static void</div><div>> insert_phi_nodes(struct lower_variables_state *state)</div><div>> {</div><span class=""><div>> unsigned work[state->impl->num_blocks];</div></span><span class=""><div>> unsigned has_already[state->impl->num_blocks];</div></span><span class=""><div>> nir_block *W[state->impl->num_blocks];</div><div><br></div></span><div>In the original code I had a comment that explained why W can be an array here (since it was a set in the original paper). Can we add it back?</div></div></blockquote><div><br></div><div>Yeah, we should do that.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><br></div><div>> </div><div>> memset(work, 0, sizeof work);</div><div>> memset(has_already, 0, sizeof has_already);</div><div>> </div><div>> unsigned w_start, w_end;</div><div>> unsigned iter_count = 0;</div><div>> </div><div>> struct hash_entry *deref_entry;</div><div>> hash_table_foreach(state->deref_leaves, deref_entry) {</div><span class=""><div>> struct deref_node *node = deref_entry->data;</div><div>> </div></span><span class=""><div>> if (node->stores == NULL)</div></span><div>> continue;</div><div>> </div><div>> if (!node->lower_to_ssa)</div><div>> continue;</div><span class=""><div>> </div><div>> w_start = w_end = 0;</div></span><div>> iter_count++;</div><div>> </div><div>> struct set_entry *store_entry;</div><div>> set_foreach(node->stores, store_entry) {</div><span class=""><div>> nir_intrinsic_instr *store = (nir_intrinsic_instr *)store_entry->key;</div></span><span class=""><div>> if (work[store->instr.block->index] < iter_count)</div></span><span class=""><div>> W[w_end++] = store->instr.block;</div></span><span class=""><div>> work[store->instr.block->index] = iter_count;</div><div>> }</div><div>> </div></span><div>> while (w_start != w_end) {</div><span class=""><div>> nir_block *cur = W[w_start++];</div></span><div>> struct set_entry *dom_entry;</div><div>> set_foreach(cur->dom_frontier, dom_entry) {</div><span class=""><div>> nir_block *next = (nir_block *) dom_entry->key;</div><div>> </div></span><div>> /*</div><span class=""><div>> * If there's more than one return statement, then the end block</div></span><span class=""><div>> * can be a join point for some definitions. However, there are</div></span><span class=""><div>> * no instructions in the end block, so nothing would use those</div></span><span class=""><div>> * phi nodes. Of course, we couldn't place those phi nodes</div></span><span class=""><div>> * anyways due to the restriction of having no instructions in the</div></span><div>> * end block...</div><div>> */</div><span class=""><div>> if (next == state->impl->end_block)</div></span><div>> continue;</div><span class=""><div>> </div><div>> if (has_already[next->index] < iter_count) {</div></span><span class=""><div>> nir_phi_instr *phi = nir_phi_instr_create(state->mem_ctx);</div></span><div>> phi->dest.is_ssa = true;</div><div>> nir_ssa_def_init(&phi->instr, &phi->dest.ssa,</div><div>> glsl_get_vector_elements(node->type), NULL);</div><div>> nir_instr_insert_before_block(next, &phi->instr);</div><div>> </div><div>> _mesa_hash_table_insert(state->phi_table,</div><div>> _mesa_hash_pointer(phi), phi, node);</div><span class=""><div>> </div><div>> has_already[next->index] = iter_count;</div></span><span class=""><div>> if (work[next->index] < iter_count) {</div></span><span class=""><div>> work[next->index] = iter_count;</div></span><div>> W[w_end++] = next;</div><div>> }</div><div>> }</div><div>> }</div><div>> }</div><div>> }</div><div>> }</div><div>> </div><div>> static bool</div><div>> nir_lower_variables_impl(nir_function_impl *impl)</div><div>> {</div><div>> struct lower_variables_state state;</div><div>> </div><div>> state.mem_ctx = ralloc_parent(impl);</div><div>> state.dead_ctx = ralloc_context(state.mem_ctx);</div><div>> state.impl = impl;</div><div>> </div><div>> state.deref_var_nodes = _mesa_hash_table_create(state.dead_ctx,</div><div>> _mesa_key_pointer_equal);</div><div>> state.deref_leaves = _mesa_hash_table_create(state.dead_ctx,</div><div>> derefs_equal);</div><div>> state.phi_table = _mesa_hash_table_create(state.dead_ctx,</div><div>> _mesa_key_pointer_equal);</div><div>> </div><div>> nir_foreach_block(impl, register_variable_uses_block, &state);</div><span class=""><div>> </div><div>> struct set *outputs = _mesa_set_create(state.dead_ctx,</div></span><div>> _mesa_key_pointer_equal);</div><div>> </div><div>> bool progress = false;</div><div>> </div><div>> nir_metadata_require(impl, nir_metadata_block_index);</div><div>> </div><div>> struct hash_entry *entry;</div><div>> hash_table_foreach(state.deref_leaves, entry) {</div><span class=""><div>> nir_deref_var *deref = (void *)entry->key;</div></span><span class=""><div>> struct deref_node *node = entry->data;</div><div>> </div></span><span class=""><div>> if (deref->var->data.mode != nir_var_local) {</div></span><div>> _mesa_hash_table_remove(state.deref_leaves, entry);</div><div>> continue;</div><div>> }</div><div>> </div><div>> if (deref_may_be_aliased(deref, &state)) {</div><div>> _mesa_hash_table_remove(state.deref_leaves, entry);</div><div>> continue;</div><div>> }</div><div>> </div><div>> node->lower_to_ssa = true;</div><div>> progress = true;</div><div>> </div><div>> if (deref->var->constant_initializer) {</div><span class=""><div>> nir_load_const_instr *load = get_const_initializer_load(deref, &state);</div></span><div>> nir_ssa_def_init(&load->instr, &load->def,</div><div>> glsl_get_vector_elements(node->type), NULL);</div><div>> nir_instr_insert_before_cf_list(&impl->body, &load->instr);</div><div>> def_stack_push(node, &load->def, &state);</div><span class=""><div>> }</div><div>> </div><div>> if (deref->var->data.mode == nir_var_shader_out)</div></span><div>> _mesa_set_add(outputs, _mesa_hash_pointer(node), node);</div><div>> </div><div>> foreach_deref_node_match(deref, lower_copies_to_load_store, &state);</div><div>> }</div><div>> </div><div>> if (!progress)</div><div>> return false;</div><div>> </div><div>> nir_metadata_require(impl, nir_metadata_dominance);</div><div>> </div><div>> insert_phi_nodes(&state);</div><div>> rename_variables_block(impl->start_block, &state);</div><div>> </div><div>> nir_metadata_preserve(impl, nir_metadata_block_index |</div><div>> nir_metadata_dominance);</div><div>> </div><div>> ralloc_free(state.dead_ctx);</div><div>> </div><div>> return progress;</div><div>> }</div><div>> </div><div>> void</div><div>> nir_lower_variables(nir_shader *shader)</div><div>> {</div><div>> nir_foreach_overload(shader, overload) {</div><div>> if (overload->impl)</div><div>> nir_lower_variables_impl(overload->impl);</div><div>> }</div><div>> }</div></div>
</blockquote></div><br></div></div>