<div dir="ltr">Glenn,<br>Thanks for taking the time to look over things.<br><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Dec 19, 2014 at 1:24 PM, Glenn Kennard <span dir="ltr"><<a href="mailto:glenn.kennard@gmail.com" target="_blank">glenn.kennard@gmail.com</a>></span> wrote:<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div>On Tue, 16 Dec 2014 07:04:14 +0100, Jason Ekstrand <<a href="mailto:jason@jlekstrand.net" target="_blank">jason@jlekstrand.net</a>> wrote:<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
From: Connor Abbott <<a href="mailto:connor.abbott@intel.com" target="_blank">connor.abbott@intel.com</a>><br>
<br>
This includes all the instructions, ifs, loops, functions, etc. This is<br>
similar to the information in ir.h.<br>
<br>
v2: Jason Ekstrand <<a href="mailto:jason.ekstrand@intel.com" target="_blank">jason.ekstrand@intel.com</a>>:<br>
Include ralloc and hash_table from the util directory<br>
---<br>
src/glsl/Makefile.sources | 2 +<br>
src/glsl/nir/nir.h | 1150 ++++++++++++++++++++++++++++++<u></u>+++++++++++<br>
src/glsl/nir/nir_intrinsics.c | 49 ++<br>
src/glsl/nir/nir_intrinsics.h | 158 ++++++<br>
src/glsl/nir/nir_opcodes.c | 46 ++<br>
src/glsl/nir/nir_opcodes.h | 346 +++++++++++++<br>
6 files changed, 1751 insertions(+)<br>
create mode 100644 src/glsl/nir/nir.h<br>
create mode 100644 src/glsl/nir/nir_intrinsics.c<br>
create mode 100644 src/glsl/nir/nir_intrinsics.h<br>
create mode 100644 src/glsl/nir/nir_opcodes.c<br>
create mode 100644 src/glsl/nir/nir_opcodes.h<br>
<br>
diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources<br>
index c3a90f7..e8eedd1 100644<br>
--- a/src/glsl/Makefile.sources<br>
+++ b/src/glsl/Makefile.sources<br>
@@ -14,6 +14,8 @@ LIBGLCPP_GENERATED_FILES = \<br>
$(GLSL_BUILDDIR)/glcpp/glcpp-<u></u>parse.c<br>
NIR_FILES = \<br>
+ $(GLSL_SRCDIR)/nir/nir_<u></u>intrinsics.c \<br>
+ $(GLSL_SRCDIR)/nir/nir_<u></u>opcodes.c \<br>
$(GLSL_SRCDIR)/nir/nir_types.<u></u>cpp<br>
# libglsl<br>
diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h<br>
new file mode 100644<br>
index 0000000..ef486da<br>
--- /dev/null<br>
+++ b/src/glsl/nir/nir.h<br>
@@ -0,0 +1,1150 @@<br>
+/*<br>
+ * Copyright © 2014 Connor Abbott<br>
+ *<br>
+ * Permission is hereby granted, free of charge, to any person obtaining a<br>
+ * copy of this software and associated documentation files (the "Software"),<br>
+ * to deal in the Software without restriction, including without limitation<br>
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,<br>
+ * and/or sell copies of the Software, and to permit persons to whom the<br>
+ * Software is furnished to do so, subject to the following conditions:<br>
+ *<br>
+ * The above copyright notice and this permission notice (including the next<br>
+ * paragraph) shall be included in all copies or substantial portions of the<br>
+ * Software.<br>
+ *<br>
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR<br>
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,<br>
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL<br>
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER<br>
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING<br>
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS<br>
+ * IN THE SOFTWARE.<br>
+ *<br>
+ * Authors:<br>
+ * Connor Abbott (<a href="mailto:cwabbott0@gmail.com" target="_blank">cwabbott0@gmail.com</a>)<br>
+ *<br>
+ */<br>
+<br>
+#pragma once<br>
+<br>
+#include "util/hash_table.h"<br>
+#include "main/set.h"<br>
+#include "../list.h"<br>
+#include "GL/gl.h" /* GLenum */<br>
+#include "util/ralloc.h"<br>
+#include "nir_types.h"<br>
+#include <stdio.h><br>
+<br>
+#ifdef __cplusplus<br>
+extern "C" {<br>
+#endif<br>
+<br>
+struct nir_function_overload;<br>
+struct nir_function;<br>
+<br>
+<br>
+/**<br>
+ * Description of built-in state associated with a uniform<br>
+ *<br>
+ * \sa nir_variable::state_slots<br>
+ */<br>
+typedef struct {<br>
+ int tokens[5];<br>
+ int swizzle;<br>
+} nir_state_slot;<br>
+<br>
+typedef enum {<br>
+ nir_var_shader_in,<br>
+ nir_var_shader_out,<br>
+ nir_var_global,<br>
+ nir_var_local,<br>
+ nir_var_uniform,<br>
+ nir_var_system_value<br>
+} nir_variable_mode;<br>
+<br>
+/**<br>
+ * Data stored in an nir_constant<br>
+ */<br>
+union nir_constant_data {<br>
+ unsigned u[16];<br>
+ int i[16];<br>
+ float f[16];<br>
+ bool b[16];<br>
+};<br>
+<br>
+typedef struct nir_constant {<br>
+ /**<br>
+ * Value of the constant.<br>
+ *<br>
+ * The field used to back the values supplied by the constant is determined<br>
+ * by the type associated with the \c ir_instruction. Constants may be<br>
+ * scalars, vectors, or matrices.<br>
+ */<br>
+ union nir_constant_data value;<br>
+<br>
+ /* Array elements / Structure Fields */<br>
+ struct nir_constant **elements;<br>
+} nir_constant;<br>
+<br>
+/**<br>
+ * \brief Layout qualifiers for gl_FragDepth.<br>
+ *<br>
+ * The AMD/ARB_conservative_depth extensions allow gl_FragDepth to be redeclared<br>
+ * with a layout qualifier.<br>
+ */<br>
+typedef enum {<br>
+ nir_depth_layout_none, /**< No depth layout is specified. */<br>
+ nir_depth_layout_any,<br>
+ nir_depth_layout_greater,<br>
+ nir_depth_layout_less,<br>
+ nir_depth_layout_unchanged<br>
+} nir_depth_layout;<br>
+<br>
+/**<br>
+ * Either a uniform, global variable, shader input, or shader output. Based on<br>
+ * ir_variable - it should be easy to translate between the two.<br>
+ */<br>
+<br>
+typedef struct {<br>
+ struct exec_node node;<br>
+<br>
+ /**<br>
+ * Declared type of the variable<br>
+ */<br>
+ const struct glsl_type *type;<br>
+<br>
+ /**<br>
+ * Declared name of the variable<br>
+ */<br>
+ char *name;<br>
+<br>
+ /**<br>
+ * For variables which satisfy the is_interface_instance() predicate, this<br>
+ * points to an array of integers such that if the ith member of the<br>
+ * interface block is an array, max_ifc_array_access[i] is the maximum<br>
+ * array element of that member that has been accessed. If the ith member<br>
+ * of the interface block is not an array, max_ifc_array_access[i] is<br>
+ * unused.<br>
+ *<br>
+ * For variables whose type is not an interface block, this pointer is<br>
+ * NULL.<br>
+ */<br>
+ unsigned *max_ifc_array_access;<br>
+<br>
+ struct nir_variable_data {<br>
+<br>
+ /**<br>
+ * Is the variable read-only?<br>
+ *<br>
+ * This is set for variables declared as \c const, shader inputs,<br>
+ * and uniforms.<br>
+ */<br>
+ unsigned read_only:1;<br>
+ unsigned centroid:1;<br>
+ unsigned sample:1;<br>
+ unsigned invariant:1;<br>
+<br>
+ /**<br>
+ * Storage class of the variable.<br>
+ *<br>
+ * \sa nir_variable_mode<br>
+ */<br>
+ unsigned mode:4;<br>
+<br>
+ /**<br>
+ * Interpolation mode for shader inputs / outputs<br>
+ *<br>
+ * \sa ir_variable_interpolation<br>
+ */<br>
+ unsigned interpolation:2;<br>
+<br>
+ /**<br>
+ * \name ARB_fragment_coord_conventions<br>
+ * @{<br>
+ */<br>
+ unsigned origin_upper_left:1;<br>
+ unsigned pixel_center_integer:1;<br>
+ /*@}*/<br>
+<br>
+ /**<br>
+ * Was the location explicitly set in the shader?<br>
+ *<br>
+ * If the location is explicitly set in the shader, it \b cannot be changed<br>
+ * by the linker or by the API (e.g., calls to \c glBindAttribLocation have<br>
+ * no effect).<br>
+ */<br>
+ unsigned explicit_location:1;<br>
+ unsigned explicit_index:1;<br>
+<br>
+ /**<br>
+ * Was an initial binding explicitly set in the shader?<br>
+ *<br>
+ * If so, constant_value contains an integer ir_constant representing the<br>
+ * initial binding point.<br>
+ */<br>
+ unsigned explicit_binding:1;<br>
+<br>
+ /**<br>
+ * Does this variable have an initializer?<br>
+ *<br>
+ * This is used by the linker to cross-validiate initializers of global<br>
+ * variables.<br>
+ */<br>
+ unsigned has_initializer:1;<br>
+<br>
+ /**<br>
+ * Is this variable a generic output or input that has not yet been matched<br>
+ * up to a variable in another stage of the pipeline?<br>
+ *<br>
+ * This is used by the linker as scratch storage while assigning locations<br>
+ * to generic inputs and outputs.<br>
+ */<br>
+ unsigned is_unmatched_generic_inout:1;<br>
+<br>
+ /**<br>
+ * If non-zero, then this variable may be packed along with other variables<br>
+ * into a single varying slot, so this offset should be applied when<br>
+ * accessing components. For example, an offset of 1 means that the x<br>
+ * component of this variable is actually stored in component y of the<br>
+ * location specified by \c location.<br>
+ */<br>
+ unsigned location_frac:2;<br>
+<br>
+ /**<br>
+ * Non-zero if this variable was created by lowering a named interface<br>
+ * block which was not an array.<br>
+ *<br>
+ * Note that this variable and \c from_named_ifc_block_array will never<br>
+ * both be non-zero.<br>
+ */<br>
+ unsigned from_named_ifc_block_nonarray:<u></u>1;<br>
+<br>
+ /**<br>
+ * Non-zero if this variable was created by lowering a named interface<br>
+ * block which was an array.<br>
+ *<br>
+ * Note that this variable and \c from_named_ifc_block_nonarray will never<br>
+ * both be non-zero.<br>
+ */<br>
+ unsigned from_named_ifc_block_array:1;<br>
+<br>
+ /**<br>
+ * \brief Layout qualifier for gl_FragDepth.<br>
+ *<br>
+ * This is not equal to \c ir_depth_layout_none if and only if this<br>
+ * variable is \c gl_FragDepth and a layout qualifier is specified.<br>
+ */<br>
+ nir_depth_layout depth_layout;<br>
+<br>
+ /**<br>
+ * Storage location of the base of this variable<br>
+ *<br>
+ * The precise meaning of this field depends on the nature of the variable.<br>
+ *<br>
+ * - Vertex shader input: one of the values from \c gl_vert_attrib.<br>
+ * - Vertex shader output: one of the values from \c gl_varying_slot.<br>
+ * - Geometry shader input: one of the values from \c gl_varying_slot.<br>
+ * - Geometry shader output: one of the values from \c gl_varying_slot.<br>
+ * - Fragment shader input: one of the values from \c gl_varying_slot.<br>
+ * - Fragment shader output: one of the values from \c gl_frag_result.<br>
+ * - Uniforms: Per-stage uniform slot number for default uniform block.<br>
+ * - Uniforms: Index within the uniform block definition for UBO members.<br>
+ * - Other: This field is not currently used.<br>
+ *<br>
+ * If the variable is a uniform, shader input, or shader output, and the<br>
+ * slot has not been assigned, the value will be -1.<br>
+ */<br>
+ int location;<br>
+<br>
+ /**<br>
+ * The actual location of the variable in the IR. Only valid for inputs<br>
+ * and outputs.<br>
+ */<br>
+ unsigned int driver_location;<br>
+<br>
+ /**<br>
+ * output index for dual source blending.<br>
+ */<br>
+ int index;<br>
</blockquote>
<br></div></div>
src index is only ever 0 or 1. I'll note its already a bitfield in src/glsl/ir.h from where this is copied, along with some other space optimizations, might want to copy those over as well.<br><div><div>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+<br>
+ /**<br>
+ * Initial binding point for a sampler or UBO.<br>
+ *<br>
+ * For array types, this represents the binding point for the first element.<br>
+ */<br>
+ int binding;<br>
+<br>
+ /**<br>
+ * Location an atomic counter is stored at.<br>
+ */<br>
+ struct {<br>
+ unsigned buffer_index;<br>
+ unsigned offset;<br>
+ } atomic;<br>
+<br>
+ /**<br>
+ * ARB_shader_image_load_store qualifiers.<br>
+ */<br>
+ struct {<br>
+ bool read_only; /**< "readonly" qualifier. */<br>
+ bool write_only; /**< "writeonly" qualifier. */<br>
+ bool coherent;<br>
+ bool _volatile;<br>
+ bool restrict_flag;<br>
+<br>
+ /** Image internal format if specified explicitly, otherwise GL_NONE. */<br>
+ GLenum format;<br>
+ } image;<br>
</blockquote>
<br></div></div>
Similar to dual source index, image struct has diverged a bit from src/glsl/ir.h.<br><div><div>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+<br>
+ /**<br>
+ * Highest element accessed with a constant expression array index<br>
+ *<br>
+ * Not used for non-array variables.<br>
+ */<br>
+ unsigned max_array_access;<br>
+<br>
+ } data;<br>
+<br>
+ /**<br>
+ * Built-in state that backs this uniform<br>
+ *<br>
+ * Once set at variable creation, \c state_slots must remain invariant.<br>
+ * This is because, ideally, this array would be shared by all clones of<br>
+ * this variable in the IR tree. In other words, we'd really like for it<br>
+ * to be a fly-weight.<br>
+ *<br>
+ * If the variable is not a uniform, \c num_state_slots will be zero and<br>
+ * \c state_slots will be \c NULL.<br>
+ */<br>
+ /*@{*/<br>
+ unsigned num_state_slots; /**< Number of state slots used */<br>
+ nir_state_slot *state_slots; /**< State descriptors. */<br>
+ /*@}*/<br>
+<br>
+ /**<br>
+ * Value assigned in the initializer of a variable declared "const"<br>
+ */<br>
+ nir_constant *constant_value;<br>
+<br>
+ /**<br>
+ * Constant expression assigned in the initializer of the variable<br>
+ *<br>
+ * \warning<br>
+ * This field and \c ::constant_value are distinct. Even if the two fields<br>
+ * refer to constants with the same value, they must point to separate<br>
+ * objects.<br>
+ */<br>
+ nir_constant *constant_initializer;<br>
+<br>
+ /**<br>
+ * For variables that are in an interface block or are an instance of an<br>
+ * interface block, this is the \c GLSL_TYPE_INTERFACE type for that block.<br>
+ *<br>
+ * \sa ir_variable::location<br>
+ */<br>
+ const struct glsl_type *interface_type;<br>
+} nir_variable;<br>
+<br>
+typedef struct {<br>
+ struct exec_node node;<br>
+<br>
+ unsigned num_components; /** < number of vector components */<br>
+ unsigned num_array_elems; /** < size of array (0 for no array) */<br>
+<br>
+ /** for liveness analysis, the index in the bit-array of live variables */<br>
+ unsigned index;<br>
+<br>
+ /** only for debug purposes, can be NULL */<br>
+ const char *name;<br>
+<br>
+ /** whether this register is local (per-function) or global (per-shader) */<br>
+ bool is_global;<br>
+<br>
+ /**<br>
+ * If this flag is set to true, then accessing channels >= num_components<br>
+ * is well-defined, and simply spills over to the next array element. This<br>
+ * is useful for backends that can do per-component accessing, in<br>
+ * particular scalar backends. By setting this flag and making<br>
+ * num_components equal to 1, structures can be packed tightly into<br>
+ * registers and then registers can be accessed per-component to get to<br>
+ * each structure member, even if it crosses vec4 boundaries.<br>
+ */<br>
+ bool is_packed;<br>
+<br>
+ /** set of nir_instr's where this register is used (read from) */<br>
+ struct set *uses;<br>
+<br>
+ /** set of nir_instr's where this register is defined (written to) */<br>
+ struct set *defs;<br>
+<br>
+ /** set of ifs where this register is used as a condition */<br>
+ struct set *if_uses;<br>
+} nir_register;<br>
+<br>
+typedef enum {<br>
+ nir_instr_type_alu,<br>
+ nir_instr_type_call,<br>
+ nir_instr_type_texture,<br>
+ nir_instr_type_intrinsic,<br>
+ nir_instr_type_load_const,<br>
+ nir_instr_type_jump,<br>
+ nir_instr_type_ssa_undef,<br>
+ nir_instr_type_phi,<br>
+} nir_instr_type;<br>
+<br>
+typedef struct {<br>
+ struct exec_node node;<br>
+ nir_instr_type type;<br>
+ struct nir_block *block;<br>
+} nir_instr;<br>
+<br>
+#define nir_instr_next(instr) \<br>
+ exec_node_data(nir_instr, (instr)->node.next, node)<br>
+<br>
+#define nir_instr_prev(instr) \<br>
+ exec_node_data(nir_instr, (instr)->node.prev, node)<br>
</blockquote>
<br></div></div>
Static inlines similar what is done for nir_deref_as_var might make walking the structures in gdb a little easier.<span><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+<br>
+typedef struct {<br>
+ /** for debugging only, can be NULL */<br>
+ const char* name;<br>
+<br>
+ /** index into the bit-array for liveness analysis */<br>
+ unsigned index;<br>
+<br>
+ nir_instr *parent_instr;<br>
+<br>
+ struct set *uses;<br>
+ struct set *if_uses;<br>
</blockquote>
<br></span>
Comment what type the sets are, similar to whats already done for nir_register.uses.<br></blockquote><div><br></div><div>Sure, that's a good plan.<br></div><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I think using sets for these is probably overkill, a simple array or linked list is likely better performing for most cases. A number or two to motivate the extra complexity perhaps?<span><br></span></blockquote><div><br></div><div>Being able to quickly add/remove elements is extremely important. Just ask the noveau people about this. Why not a linked list? Part of the problem is that we have a many-to-many mapping. Multiple instructions can use the same source and a single instruction can use multiple sources. I did think about changing to a linked list with the link embedded in the source. However, this would require adding 3 pointers to every source (prev, next, and a pointer back to the instruction) in order be able to do it efficiently. Also, if you want to iterate over all of the instructions that use a source, you have to keep track of what instructions you've seen because the same instruction can use a source multiple times. In other words, It was deemed not practical.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+<br>
+ uint8_t num_components;<br>
+} nir_ssa_def;<br>
+<br>
+struct nir_src;<br>
+<br>
+typedef struct {<br>
+ nir_register *reg;<br>
+ struct nir_src *indirect; /** < NULL for no indirect offset */<br>
+ unsigned base_offset;<br>
+<br>
+ /* TODO use-def chain goes here */<br>
</blockquote>
<br></span>
Stale comment? Same for nir_reg_dest<div><div><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+} nir_reg_src;<br>
+<br>
+typedef struct {<br>
+ nir_register *reg;<br>
+ struct nir_src *indirect; /** < NULL for no indirect offset */<br>
+ unsigned base_offset;<br>
+<br>
+ /* TODO def-use chain goes here */<br>
+} nir_reg_dest;<br>
+<br>
+typedef struct nir_src {<br>
+ union {<br>
+ nir_reg_src reg;<br>
+ nir_ssa_def *ssa;<br>
+ };<br>
+<br>
+ bool is_ssa;<br>
+} nir_src;<br>
+<br>
+typedef struct {<br>
+ union {<br>
+ nir_reg_dest reg;<br>
+ nir_ssa_def ssa;<br>
+ };<br>
+<br>
+ bool is_ssa;<br>
+} nir_dest;<br>
+<br>
+nir_src nir_src_copy(nir_src src, void *mem_ctx);<br>
+nir_dest nir_dest_copy(nir_dest dest, void *mem_ctx);<br>
+<br>
+typedef struct {<br>
+ nir_src src;<br>
+<br>
+ /**<br>
+ * \name input modifiers<br>
+ */<br>
+ /*@{*/<br>
+ /**<br>
+ * For inputs interpreted as a floating point, flips the sign bit. For inputs<br>
+ * interpreted as an integer, performs the two's complement negation.<br>
+ */<br>
+ bool negate;<br>
+<br>
+ /**<br>
+ * Clears the sign bit for floating point values, and computes the integer<br>
+ * absolute value for integers. Note that the negate modifier acts after<br>
+ * the absolute value modifier, therefore if both are set then all inputs<br>
+ * will become negative.<br>
+ */<br>
+ bool abs;<br>
+ /*@}*/<br>
+<br>
+ /**<br>
+ * For each input component, says which component of the register it is<br>
+ * chosen from. Note that which elements of the swizzle are used and which<br>
+ * are ignored are based on the write mask for most opcodes - for example,<br>
+ * a statement like "foo.xzw = bar.zyx" would have a writemask of 1101b and<br>
+ * a swizzle of {2, x, 1, 0} where x means "don't care."<br>
+ */<br>
+ uint8_t swizzle[4];<br>
+} nir_alu_src;<br>
+<br>
+typedef struct {<br>
+ nir_dest dest;<br>
+<br>
+ /**<br>
+ * \name saturate output modifier<br>
+ *<br>
+ * Only valid for opcodes that output floating-point numbers. Clamps the<br>
+ * output to between 0.0 and 1.0 inclusive.<br>
+ */<br>
+<br>
+ bool saturate;<br>
+<br>
+ unsigned write_mask : 4; /* ignored if dest.is_ssa is true */<br>
+} nir_alu_dest;<br>
+<br>
+#define OPCODE(name, num_inputs, per_component, output_size, output_type, \<br>
+ input_sizes, input_types) \<br>
+ nir_op_##name,<br>
+<br>
+#define LAST_OPCODE(name) nir_last_opcode = nir_op_##name,<br>
+<br>
+typedef enum {<br>
+#include "nir_opcodes.h"<br>
+ nir_num_opcodes = nir_last_opcode + 1<br>
+} nir_op;<br>
+<br>
+#undef OPCODE<br>
+#undef LAST_OPCODE<br>
+<br>
+typedef enum {<br>
+ nir_type_float,<br>
+ nir_type_int,<br>
+ nir_type_unsigned,<br>
+ nir_type_bool<br>
+} nir_alu_type;<br>
+<br>
+typedef struct {<br>
+ const char *name;<br>
+<br>
+ unsigned num_inputs;<br>
+<br>
+ /**<br>
+ * If true, the opcode acts in the standard, per-component manner; the<br>
+ * operation is performed on each component (except the ones that are masked<br>
+ * out) with the input being taken from the input swizzle for that component.<br>
+ *<br>
+ * If false, the size of the output and inputs are explicitly given; swizzle<br>
+ * and writemask are still in effect, but if the output component is masked<br>
+ * out, then the input component may still be in use.<br>
+ *<br>
+ * The size of some of the inputs may be given (i.e. non-zero) even though<br>
+ * per_component is false; in that case, each component of the input acts<br>
+ * per-component, while the rest of the inputs and the output are normal.<br>
+ * For example, for conditional select the condition is per-component but<br>
+ * everything else is normal.<br>
+ */<br>
+ bool per_component;<br>
+<br>
+ /**<br>
+ * If per_component is false, the number of components in the output.<br>
+ */<br>
+ unsigned output_size;<br>
+<br>
+ /**<br>
+ * The type of vector that the instruction outputs. Note that this<br>
+ * determines whether the saturate modifier is allowed.<br>
+ */<br>
+<br>
+ nir_alu_type output_type;<br>
+<br>
+ /**<br>
+ * If per_component is false, the number of components in each input.<br>
+ */<br>
+ unsigned input_sizes[4];<br>
+<br>
+ /**<br>
+ * The type of vector that each input takes. Note that negate is only<br>
+ * allowed on inputs with int or float type, and behaves differently on the<br>
+ * two, and absolute value is only allowed on float type inputs.<br>
+ */<br>
+ nir_alu_type input_types[4];<br>
+} nir_op_info;<br>
+<br>
+extern const nir_op_info nir_op_infos[nir_num_opcodes];<br>
+<br>
+typedef struct nir_alu_instr {<br>
+ nir_instr instr;<br>
+ nir_op op;<br>
+ bool has_predicate;<br>
+ nir_src predicate;<br>
+ nir_alu_dest dest;<br>
+ nir_alu_src src[];<br>
+} nir_alu_instr;<br>
+<br>
+/* is this source channel used? */<br>
+static inline bool<br>
+nir_alu_instr_channel_used(<u></u>nir_alu_instr *instr, unsigned src, unsigned channel)<br>
+{<br>
+ if (nir_op_infos[instr->op].<u></u>input_sizes[src] > 0)<br>
+ return channel < nir_op_infos[instr->op].input_<u></u>sizes[src];<br>
+<br>
+ return (instr->dest.write_mask >> channel) & 1;<br>
</blockquote>
<br></div></div>
If per_component is false this should probably return true if any input channel is used?</blockquote><div><br></div><div>per_component is dead as of 136/133, we just use the input/output sizes so this is correct. <br><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+}<br>
+<br>
+typedef enum {<br>
+ nir_deref_type_var,<br>
+ nir_deref_type_array,<br>
+ nir_deref_type_struct<br>
+} nir_deref_type;<br>
+<br>
+typedef struct nir_deref {<br>
+ nir_deref_type deref_type;<br>
+ struct nir_deref *child;<br>
+ const struct glsl_type *type;<br>
+} nir_deref;<br>
+<br>
+typedef struct {<br>
+ nir_deref deref;<br>
+<br>
+ nir_variable *var;<br>
+} nir_deref_var;<br>
+<br>
+typedef struct {<br>
+ nir_deref deref;<br>
+<br>
+ unsigned base_offset;<br>
+ bool has_indirect;<br>
+ nir_src indirect;<br>
+} nir_deref_array;<br>
+<br>
+typedef struct {<br>
+ nir_deref deref;<br>
+<br>
+ const char *elem;<br>
+} nir_deref_struct;<br>
+<br>
+#define nir_deref_as_var(_deref) exec_node_data(nir_deref_var, _deref, deref)<br>
+#define nir_deref_as_array(_deref) \<br>
+ exec_node_data(nir_deref_<u></u>array, _deref, deref)<br>
+#define nir_deref_as_struct(_deref) \<br>
+ exec_node_data(nir_deref_<u></u>struct, _deref, deref)<br>
+<br>
+typedef struct {<br>
+ nir_instr instr;<br>
+<br>
+ unsigned num_params;<br>
+ nir_deref_var **params;<br>
+ nir_deref_var *return_deref;<br>
+<br>
+ bool has_predicate;<br>
+ nir_src predicate;<br>
+<br>
+ struct nir_function_overload *callee;<br>
+} nir_call_instr;<br>
+<br>
+#define INTRINSIC(name, num_srcs, src_components, has_dest, dest_components, \<br>
+ num_variables, num_indices, flags) \<br>
+ nir_intrinsic_##name,<br>
+<br>
+#define LAST_INTRINSIC(name) nir_last_intrinsic = nir_intrinsic_##name,<br>
+<br>
+typedef enum {<br>
+#include "nir_intrinsics.h"<br>
+ nir_num_intrinsics = nir_last_intrinsic + 1<br>
+} nir_intrinsic_op;<br>
+<br>
+#undef INTRINSIC<br>
+#undef LAST_INTRINSIC<br>
+<br>
+typedef struct {<br>
+ nir_instr instr;<br>
+<br>
+ nir_intrinsic_op intrinsic;<br>
+<br>
+ nir_dest dest;<br>
+<br>
+ int const_index[3];<br>
+<br>
+ nir_deref_var *variables[2];<br>
+<br>
+ bool has_predicate;<br>
+ nir_src predicate;<br>
+<br>
+ nir_src src[];<br>
+} nir_intrinsic_instr;<br>
+<br>
+/**<br>
+ * \name NIR intrinsics semantic flags<br>
+ *<br>
+ * information about what the compiler can do with the intrinsics.<br>
+ *<br>
+ * \sa nir_intrinsic_info::flags<br>
+ */<br>
+/*@{*/<br>
+/**<br>
+ * whether the intrinsic can be safely eliminated if none of its register<br>
+ * outputs are being used.<br>
+ */<br>
+#define NIR_INTRINSIC_CAN_ELIMINATE (1 << 0)<br>
+<br>
</blockquote>
<br></div></div>
Make these enums, easier for debugging.<span><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+/**<br>
+ * Whether the intrinsic can be reordered with respect to any other intrinsic,<br>
+ * i.e. whether the only reodering dependencies of the intrinsic are due to the<br>
</blockquote>
<br></span>
reordering<br>
<br>
Might want to note that even though an instruction itself is reorderable, it still may not cross another non-reorderable instruction such as a tessellation barrier instruction.<div><div><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+ * register reads/writes.<br>
+ */<br>
+#define NIR_INTRINSIC_CAN_REORDER (1 << 1)<br>
+/*@}*/<br>
+<br>
+#define NIR_INTRINSIC_MAX_INPUTS 4<br>
+<br>
+typedef struct {<br>
+ const char *name;<br>
+<br>
+ unsigned num_srcs; /** < number of register/SSA inputs */<br>
+<br>
+ /** number of components of each input register */<br>
+ unsigned src_components[NIR_INTRINSIC_<u></u>MAX_INPUTS];<br>
+<br>
+ bool has_dest;<br>
+<br>
+ /** number of components of each output register */<br>
+ unsigned dest_components;<br>
+<br>
+ /** the number of inputs/outputs that are variables */<br>
+ unsigned num_variables;<br>
+<br>
+ /** the number of constant indices used by the intrinsic */<br>
+ unsigned num_indices;<br>
+<br>
+ /** semantic flags for calls to this intrinsic */<br>
+ unsigned flags;<br>
+} nir_intrinsic_info;<br>
+<br>
+extern const nir_intrinsic_info nir_intrinsic_infos[nir_num_<u></u>intrinsics];<br>
+<br>
+/**<br>
+ * \group texture information<br>
+ *<br>
+ * This gives semantic information about textures which is useful to the<br>
+ * frontend, the backend, and lowering passes, but not the optimizer.<br>
+ */<br>
+<br>
+typedef enum {<br>
+ nir_tex_src_coord,<br>
+ nir_tex_src_projector,<br>
+ nir_tex_src_comparitor, /* shadow comparitor */<br>
</blockquote>
<br></div></div>
comparator</blockquote><div><br></div><div>thanks<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+ nir_tex_src_offset,<br>
+ nir_tex_src_bias,<br>
+ nir_tex_src_lod,<br>
+ nir_tex_src_ms_index, /* MSAA sample index */<br>
+ nir_tex_src_ddx,<br>
+ nir_tex_src_ddy,<br>
+ nir_tex_src_sampler_index, /* < dynamically uniform indirect index */<br>
+ nir_num_texinput_types<br>
+} nir_texinput_type;<br>
+<br>
+typedef enum {<br>
+ nir_texop_tex, /**< Regular texture look-up */<br>
+ nir_texop_txb, /**< Texture look-up with LOD bias */<br>
+ nir_texop_txl, /**< Texture look-up with explicit LOD */<br>
+ nir_texop_txd, /**< Texture look-up with partial derivatvies */<br>
+ nir_texop_txf, /**< Texel fetch with explicit LOD */<br>
+ nir_texop_txf_ms, /**< Multisample texture fetch */<br>
+ nir_texop_txs, /**< Texture size */<br>
+ nir_texop_lod, /**< Texture lod query */<br>
+ nir_texop_tg4, /**< Texture gather */<br>
+ nir_texop_query_levels /**< Texture levels query */<br>
+} nir_texop;<br>
+<br>
+typedef struct {<br>
+ nir_instr instr;<br>
+<br>
+ bool has_predicate;<br>
+ nir_src predicate;<br>
+<br>
+ enum glsl_sampler_dim sampler_dim;<br>
+ nir_alu_type dest_type;<br>
+<br>
+ nir_texop op;<br>
+ nir_dest dest;<br>
+ nir_src src[4];<br>
+ nir_texinput_type src_type[4];<br>
+ unsigned num_srcs, coord_components;<br>
</blockquote>
<br></div></div>
num_coord_components for consistency?<span><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+ bool is_array, is_shadow;<br>
+<br>
+ /**<br>
+ * If is_shadow is true, whether this is the old-style shadow that outputs 4<br>
+ * components or the new-style shadow that outputs 1 component.<br>
+ */<br>
+ bool is_new_style_shadow;<br>
</blockquote>
<br></span>
is_four_component_shadow would be clearer, "new" will get old over the lifespan of this code...</blockquote><div><br></div><div>Point.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+<br>
+ /* constant offset - must be 0 if the offset source is used */<br>
+ int const_offset[4];<br>
+<br>
+ /* gather component selector */<br>
+ unsigned component : 2;<br>
+<br>
+ unsigned sampler_index;<br>
+ nir_deref_var *sampler; /* if this is NULL, use sampler_index instead */<br>
+} nir_tex_instr;<br>
+<br>
+static inline unsigned<br>
+nir_tex_instr_dest_size(nir_<u></u>tex_instr *instr)<br>
+{<br>
+ if (instr->op == nir_texop_txs) {<br>
+ unsigned ret;<br>
+ switch (instr->sampler_dim) {<br>
+ case GLSL_SAMPLER_DIM_1D:<br>
+ case GLSL_SAMPLER_DIM_BUF:<br>
+ ret = 1;<br>
+ break;<br>
+ case GLSL_SAMPLER_DIM_2D:<br>
+ case GLSL_SAMPLER_DIM_CUBE:<br>
+ case GLSL_SAMPLER_DIM_MS:<br>
+ case GLSL_SAMPLER_DIM_RECT:<br>
+ case GLSL_SAMPLER_DIM_EXTERNAL:<br>
+ ret = 2;<br>
+ break;<br>
+ case GLSL_SAMPLER_DIM_3D:<br>
+ ret = 3;<br>
+ break;<br>
+ default:<br>
+ assert(0);<br>
+ break;<br>
+ }<br>
+ if (instr->is_array)<br>
+ ret++;<br>
+ return ret;<br>
+ }<br>
+<br>
+ if (instr->op == nir_texop_query_levels)<br>
+ return 2;<br>
+<br>
+ if (instr->is_shadow && instr->is_new_style_shadow)<br>
+ return 1;<br>
+<br>
+ return 4;<br>
+}<br>
+<br>
+static inline unsigned<br>
+nir_tex_instr_src_size(nir_<u></u>tex_instr *instr, unsigned src)<br>
+{<br>
+ if (instr->src_type[src] == nir_tex_src_coord)<br>
+ return instr->coord_components;<br>
+<br>
+<br>
+ if (instr->src_type[src] == nir_tex_src_offset ||<br>
+ instr->src_type[src] == nir_tex_src_ddx ||<br>
+ instr->src_type[src] == nir_tex_src_ddy) {<br>
+ if (instr->is_array)<br>
+ return instr->coord_components - 1;<br>
+ else<br>
+ return instr->coord_components;<br>
</blockquote>
<br></div></div>
Please avoid mixing 0 and 1 indexing in the same variable if at all possible.<span><br></span></blockquote><div><br></div><div>That's not really what's going on here. In GLSL, if you have an ND array texture, it takes (N+1)D coordinates. You could make the case that NIR should use an extra source for this, but there's precedent.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+ }<br>
+<br>
+ return 1;<br>
+}<br>
+<br>
+static inline int<br>
+nir_tex_instr_src_index(nir_<u></u>tex_instr *instr, nir_texinput_type type)<br>
+{<br>
+ for (unsigned i = 0; i < instr->num_srcs; i++)<br>
+ if (instr->src_type[i] == type)<br>
+ return (int) i;<br>
+<br>
+ return -1;<br>
+}<br>
+<br>
+typedef struct {<br>
+ union {<br>
+ float f[4];<br>
+ int32_t i[4];<br>
+ uint32_t u[4];<br>
+ };<br>
+} nir_const_value;<br>
+<br>
</blockquote>
<br></span>
Matrix types have been lowered before these are constructed? Comparing with nir_constant_data which can fit a mat4.</blockquote><div><br></div><div>Yes.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+typedef struct {<br>
+ nir_instr instr;<br>
+<br>
+ union {<br>
+ nir_const_value value;<br>
+ nir_const_value *array;<br>
+ };<br>
+<br>
+ unsigned num_components;<br>
+<br>
+ /**<br>
+ * The number of constant array elements to be copied into the variable. If<br>
+ * this != 0, then value.array holds the array of size array_elems;<br>
+ * otherwise, value.value holds the single vector constant (the more common<br>
+ * case, and the only case for SSA destinations).<br>
+ */<br>
+ unsigned array_elems;<br>
+<br>
+ bool has_predicate;<br>
+ nir_src predicate;<br>
+<br>
+ nir_dest dest;<br>
+} nir_load_const_instr;<br>
+<br>
+typedef enum {<br>
+ nir_jump_return,<br>
+ nir_jump_break,<br>
+ nir_jump_continue,<br>
+} nir_jump_type;<br>
+<br>
+typedef struct {<br>
+ nir_instr instr;<br>
+ nir_jump_type type;<br>
+} nir_jump_instr;<br>
+<br>
+/* creates a new SSA variable in an undefined state */<br>
+<br>
+typedef struct {<br>
+ nir_instr instr;<br>
+ nir_ssa_def def;<br>
+} nir_ssa_undef_instr;<br>
+<br>
+typedef struct {<br>
+ struct exec_node node;<br>
+ struct nir_block *pred;<br>
</blockquote>
<br></div></div>
A comment that pred here = predecessor.<span><br></span></blockquote><div><br></div><div>Sure<br><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+ nir_src src;<br>
+} nir_phi_src;<br>
+<br>
+typedef struct {<br>
+ nir_instr instr;<br>
+<br>
+ struct exec_list srcs;<br>
</blockquote>
<br></span>
Comment on real type of srcs. Same for all the exec_list/set/hash_table types in this file that don't already have such comments.<span><br></span></blockquote><div><br></div><div>Probably a good idea.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+ nir_dest dest;<br>
+} nir_phi_instr;<br>
+<br>
+#define nir_instr_as_alu(_instr) exec_node_data(nir_alu_instr, _instr, instr)<br>
+#define nir_instr_as_call(_instr) exec_node_data(nir_call_instr, _instr, instr)<br>
+#define nir_instr_as_jump(_instr) exec_node_data(nir_jump_instr, _instr, instr)<br>
+#define nir_instr_as_texture(_instr) \<br>
+ exec_node_data(nir_tex_instr, _instr, instr)<br>
</blockquote>
<br></span>
Note actual type name "tex" but function "texture". Slightly confusing.</blockquote><div><br></div><div>Fixed in a later patch<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+#define nir_instr_as_intrinsic(_instr) \<br>
+ exec_node_data(nir_intrinsic_<u></u>instr, _instr, instr)<br>
+#define nir_instr_as_load_const(_<u></u>instr) \<br>
+ exec_node_data(nir_load_const_<u></u>instr, _instr, instr)<br>
+#define nir_instr_as_ssa_undef(_instr) \<br>
+ exec_node_data(nir_ssa_undef_<u></u>instr, _instr, instr)<br>
+#define nir_instr_as_phi(_instr) \<br>
+ exec_node_data(nir_phi_instr, _instr, instr)<br>
+<br>
+<br>
+/*<br>
+ * Control flow<br>
+ *<br>
+ * Control flow consists of a tree of control flow nodes, which include<br>
+ * if-statements and loops. The leaves of the tree are basic blocks, lists of<br>
+ * instructions that always run start-to-finish. Each basic block also keeps<br>
+ * track of its successors (blocks which may run immediately after the current<br>
+ * block) and predecessors (blocks which could have run immediately before the<br>
+ * current block). Each function also has a start block and an end block which<br>
+ * all return statements point to (which is always empty). Together, all the<br>
+ * blocks with their predecessors and successors make up the control flow<br>
+ * graph (CFG) of the function. There are helpers that modify the tree of<br>
+ * control flow nodes while modifying the CFG appropriately; these should be<br>
+ * used instead of modifying the tree directly.<br>
+ */<br>
+<br>
+typedef enum {<br>
+ nir_cf_node_block,<br>
+ nir_cf_node_if,<br>
+ nir_cf_node_loop,<br>
+ nir_cf_node_function<br>
+} nir_cf_node_type;<br>
+<br>
+typedef struct nir_cf_node {<br>
+ struct exec_node node;<br>
+ nir_cf_node_type type;<br>
+ struct nir_cf_node *parent;<br>
+} nir_cf_node;<br>
+<br>
+typedef struct nir_block {<br>
+ nir_cf_node cf_node;<br>
+ struct exec_list instr_list;<br>
+<br>
+ unsigned index;<br>
</blockquote>
<br></div></div>
Deserves a comment what this is<span><br></span></blockquote><div><br></div><div>Sure<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+<br>
+ /*<br>
+ * Each block can only have up to 2 successors, so we put them in a simple<br>
+ * array - no need for anything more complicated.<br>
+ */<br>
+ struct nir_block *successors[2];<br>
+<br>
+ struct set *predecessors;<br>
</blockquote>
<br></span>
Why a set and not a simple array/list?</blockquote><div><br></div><div>a) Ordering doesn't matter and that's explicit here. b) This gets modified on-the-fly every time the CFG gets modified so we want quick insertion/removal (not array) and c) a set really isn't that much more complex than a linked list given that we already have the datastructure.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+} nir_block;<br>
+<br>
+#define nir_block_first_instr(block) \<br>
+ exec_node_data(nir_instr, exec_list_get_head(&(block)-><u></u>instr_list), node)<br>
+#define nir_block_last_instr(block) \<br>
+ exec_node_data(nir_instr, exec_list_get_tail(&(block)-><u></u>instr_list), node)<br>
+<br>
+#define nir_foreach_instr(block, instr) \<br>
+ foreach_list_typed(nir_instr, instr, node, &(block)->instr_list)<br>
+#define nir_foreach_instr_reverse(<u></u>block, instr) \<br>
+ foreach_list_typed_reverse(<u></u>nir_instr, instr, node, &(block)->instr_list)<br>
+#define nir_foreach_instr_safe(block, instr) \<br>
+ foreach_list_typed_safe(nir_<u></u>instr, instr, node, &(block)->instr_list)<br>
+<br>
+typedef struct {<br>
+ nir_cf_node cf_node;<br>
+ nir_src condition;<br>
+ struct exec_list then_list;<br>
+ struct exec_list else_list;<br>
+} nir_if;<br>
+<br>
+#define nir_if_first_then_node(if) \<br>
+ exec_node_data(nir_cf_node, exec_list_get_head(&(if)-><u></u>then_list), node)<br>
+#define nir_if_last_then_node(if) \<br>
+ exec_node_data(nir_cf_node, exec_list_get_tail(&(if)-><u></u>then_list), node)<br>
+#define nir_if_first_else_node(if) \<br>
+ exec_node_data(nir_cf_node, exec_list_get_head(&(if)-><u></u>else_list), node)<br>
+#define nir_if_last_else_node(if) \<br>
+ exec_node_data(nir_cf_node, exec_list_get_tail(&(if)-><u></u>else_list), node)<br>
+<br>
+typedef struct {<br>
+ nir_cf_node cf_node;<br>
+ struct exec_list body;<br>
+} nir_loop;<br>
+<br>
+#define nir_loop_first_cf_node(loop) \<br>
+ exec_node_data(nir_cf_node, exec_list_get_head(&(loop)-><u></u>body), node)<br>
+#define nir_loop_last_cf_node(loop) \<br>
+ exec_node_data(nir_cf_node, exec_list_get_tail(&(loop)-><u></u>body), node)<br>
+<br>
+typedef struct {<br>
+ nir_cf_node cf_node;<br>
+<br>
+ /** pointer to the overload of which this is an implementation */<br>
+ struct nir_function_overload *overload;<br>
+<br>
+ struct exec_list body; /** < list of nir_cf_node */<br>
+<br>
+ nir_block *start_block, *end_block;<br>
+<br>
+ /** list for all local variables in the function */<br>
+ struct exec_list locals;<br>
+<br>
+ /** array of variables used as parameters */<br>
+ unsigned num_params;<br>
+ nir_variable **params;<br>
+<br>
+ /** variable used to hold the result of the function */<br>
+ nir_variable *return_var;<br>
+<br>
+ /** list of local registers in the function */<br>
+ struct exec_list registers;<br>
+<br>
+ /** next available local register index */<br>
+ unsigned reg_alloc;<br>
+<br>
+ /** next available SSA value index */<br>
+ unsigned ssa_alloc;<br>
+<br>
+ /* total number of basic blocks, only valid when block_index_dirty = false */<br>
+ unsigned num_blocks;<br>
+<br>
+ bool block_index_dirty;<br>
+} nir_function_impl;<br>
+<br>
+#define nir_cf_node_next(_node) \<br>
+ exec_node_data(nir_cf_node, exec_node_get_next(&(_node)-><u></u>node), node)<br>
+<br>
+#define nir_cf_node_prev(_node) \<br>
+ exec_node_data(nir_cf_node, exec_node_get_prev(&(_node)-><u></u>node), node)<br>
+<br>
+#define nir_cf_node_is_first(_node) \<br>
+ exec_node_is_head_sentinel((_<u></u>node)->node.prev)<br>
+<br>
+#define nir_cf_node_is_last(_node) \<br>
+ exec_node_is_tail_sentinel((_<u></u>node)->node.next)<br>
+<br>
+#define nir_cf_node_as_block(node) \<br>
+ exec_node_data(nir_block, node, cf_node)<br>
+<br>
+#define nir_cf_node_as_if(node) \<br>
+ exec_node_data(nir_if, node, cf_node)<br>
+<br>
+#define nir_cf_node_as_loop(node) \<br>
+ exec_node_data(nir_loop, node, cf_node)<br>
+<br>
+#define nir_cf_node_as_function(node) \<br>
+ exec_node_data(nir_function_<u></u>impl, node, cf_node)<br>
+<br>
+typedef enum {<br>
+ nir_parameter_in,<br>
+ nir_parameter_out,<br>
+ nir_parameter_inout,<br>
+} nir_parameter_type;<br>
+<br>
+typedef struct {<br>
+ nir_parameter_type param_type;<br>
+ const struct glsl_type *type;<br>
+} nir_parameter;<br>
+<br>
+typedef struct nir_function_overload {<br>
+ struct exec_node node;<br>
+<br>
+ unsigned num_params;<br>
+ nir_parameter *params;<br>
+ const struct glsl_type *return_type;<br>
+<br>
+ nir_function_impl *impl; /** < NULL if the overload is only declared yet */<br>
+<br>
+ /** pointer to the function of which this is an overload */<br>
+ struct nir_function *function;<br>
+} nir_function_overload;<br>
+<br>
+typedef struct nir_function {<br>
+ struct exec_node node;<br>
+<br>
+ struct exec_list overload_list;<br>
+ const char *name;<br>
+} nir_function;<br>
+<br>
+#define nir_function_first_overload(<u></u>func) \<br>
+ exec_node_data(nir_function_<u></u>overload, \<br>
+ exec_list_get_head(&(func)-><u></u>overload_list), node)<br>
+<br>
+typedef struct nir_shader {<br>
+ /** hash table of name -> uniform */<br>
+ struct hash_table *uniforms;<br>
+<br>
+ /** hash table of name -> input */<br>
+ struct hash_table *inputs;<br>
+<br>
+ /** hash table of name -> output */<br>
+ struct hash_table *outputs;<br>
+<br>
+ /** list of global variables in the shader */<br>
+ struct exec_list globals;<br>
+<br>
+ struct exec_list system_values;<br>
+<br>
+ struct exec_list functions;<br>
+<br>
+ /** list of global registers in the shader */<br>
+ struct exec_list registers;<br>
+<br>
+ /** structures used in this shader */<br>
+ unsigned num_user_structures;<br>
+ struct glsl_type **user_structures;<br>
+<br>
+ /** next available global register index */<br>
+ unsigned reg_alloc;<br>
+} nir_shader;<br>
+<br>
+#define nir_foreach_overload(shader, overload) \<br>
+ foreach_list_typed(nir_<u></u>function, func, node, &(shader)->functions) \<br>
+ foreach_list_typed(nir_<u></u>function_overload, overload, node, \<br>
+ &(func)->overload_list)<br>
+<br>
+#ifdef __cplusplus<br>
+} /* extern "C" */<br>
+#endif<br>
</blockquote>
<br></div></div>
... snipped for brevity, no comments on remaining files of patch, LGTM<br>
<br>
<br>
Above is really only nitpicks, ignore at will. Kudos for Connor/Jason for NIR, very impressive!<br>
<br>
Note i have reviewed the dd1f3c71f2d5253a69469f55cfa53d<u></u>c3e5793b80 version at<br>
<a href="http://cgit.freedesktop.org/~jekstrand/mesa/tree/src/glsl/nir/nir.h?h=review/nir-v1" target="_blank">http://cgit.freedesktop.org/~<u></u>jekstrand/mesa/tree/src/glsl/<u></u>nir/nir.h?h=review/nir-v1</a><br>
rather than the exact version sent out to ml.<br>
<br>
Patch 004 is<br>
Reviewed-By glenn.kennard <<a href="mailto:glenn.kennard@gmail.com" target="_blank">glenn.kennard@gmail.com</a>><br></blockquote><div><br></div><div>Thanks. I think I've addressed most of your comments. I decided to leave some of it alone for now. We've got quite a few patches on the list at this point and the little fixups can wait.<br></div><div>--Jason<br></div><div> </div></div></div></div>