[Mesa-dev] [PATCH RFC 07/11] glsl: add SSA infrastructure

Connor Abbott cwabbott0 at gmail.com
Wed Feb 5 09:25:29 PST 2014


On Tue, Jan 28, 2014 at 1:50 PM, Paul Berry <stereotype441 at gmail.com> wrote:

> On 22 January 2014 09:16, Connor Abbott <cwabbott0 at gmail.com> wrote:
>
>> This patch introduces all the changes to the IR that are necessary for
>> representing programs in the SSA form. This consists of a new variable
>> mode, the SSA temporary, which is guarenteed to be written to exactly
>> once, and classes to represent phi nodes in the IR.
>>
>> In the current code, variables are first declared using an ir_variable
>> instruction inserted into the instruction stream, and then every
>> dereference will point to the ir_variable declared earlier. SSA
>> temporaries, however, do not work this way. Instead, the variable
>> is declared when it is assigned. That is, the variable is "owned" by
>> the one and only instruction where it is defined.
>>
>> In SSA, phi nodes may exist at the beginning of any "join nodes," or
>> basic blocks with more than one predecessor. In our IR, this can happen
>> in one of three places:
>>
>> - After an if statement, where the then branch and the else branch
>> converge.
>> - At the beginning of a loop, which can be reached from either before
>> the loop (on the first iteration), the end of the loop (when we get to
>> the end of the loop and jump back to the beginning), or any continue
>> statement.
>> - At the end of a loop, which can be reached from any break statement
>> within the loop.
>>
>> Accordingly, there are three different types of phi nodes: if phi nodes,
>> phi nodes at the beginning of a loop, and phi nodes at the end of a
>> loop, all of which derive from the ir_phi base class.
>> ---
>>  src/glsl/ir.cpp                                |  56 +++++++
>>  src/glsl/ir.h                                  | 196
>> ++++++++++++++++++++++++-
>>  src/glsl/ir_clone.cpp                          | 147 ++++++++++++++++---
>>  src/glsl/ir_hierarchical_visitor.cpp           |  36 +++++
>>  src/glsl/ir_hierarchical_visitor.h             |  11 ++
>>  src/glsl/ir_hv_accept.cpp                      |  55 ++++++-
>>  src/glsl/ir_print_visitor.cpp                  | 196
>> ++++++++++++++++++++++++-
>>  src/glsl/ir_print_visitor.h                    |  15 ++
>>  src/glsl/ir_validate.cpp                       | 158 +++++++++++++++++++-
>>  src/glsl/ir_visitor.h                          |   8 +
>>  src/mesa/drivers/dri/i965/brw_fs.h             |   4 +
>>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   |  28 ++++
>>  src/mesa/drivers/dri/i965/brw_vec4.h           |   4 +
>>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp |  24 +++
>>  src/mesa/program/ir_to_mesa.cpp                |  28 ++++
>>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp     |  29 ++++
>>  16 files changed, 956 insertions(+), 39 deletions(-)
>>
>> diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
>> index 1a36bd6..f1ded80 100644
>> --- a/src/glsl/ir.cpp
>> +++ b/src/glsl/ir.cpp
>> @@ -1229,6 +1229,37 @@ ir_loop::ir_loop()
>>  }
>>
>>
>> +ir_phi::ir_phi()
>> +{
>> +   this->dest = NULL;
>> +}
>>
>
> Rather than have a no-argument constructor for ir_phi that sets this->dest
> to NULL, and then have the derived classes set dest to the appropriate
> value, let's just do:
>
> ir_phi::ir_phi(ir_variable *dest)
>    : dest(dest)
> {
> }
>
> Then in the derived classes we can just pass dest on through, e.g.:
>
> ir_phi_if::ir_phi_if(ir_variable *dest, ir_variable *if_src,
>              ir_variable *else_src)
>    : ir_phi(dest), if_src(if_src), else_src(else_src)
> {
>    this->ir_type = ir_type_phi_if;
> }
>
>
>
>> +
>> +
>> +ir_phi_if::ir_phi_if(ir_variable *dest, ir_variable *if_src,
>> +                    ir_variable *else_src)
>> +   : if_src(if_src), else_src(else_src)
>> +{
>> +   this->ir_type = ir_type_phi_if;
>> +   this->dest = dest;
>> +}
>> +
>> +
>> +ir_phi_loop_begin::ir_phi_loop_begin(ir_variable* dest, ir_variable*
>> enter_src,
>> +                                    ir_variable* repeat_src)
>> +   : enter_src(enter_src), repeat_src(repeat_src)
>> +{
>> +   this->ir_type = ir_type_phi_loop_begin;
>> +   this->dest = dest;
>> +}
>> +
>> +
>> +ir_phi_loop_end::ir_phi_loop_end(ir_variable *dest)
>> +{
>> +   this->ir_type = ir_type_phi_loop_end;
>> +   this->dest = dest;
>> +}
>> +
>> +
>>  ir_dereference_variable::ir_dereference_variable(ir_variable *var)
>>  {
>>     assert(var != NULL);
>> @@ -1554,6 +1585,9 @@ ir_variable::ir_variable(const struct glsl_type
>> *type, const char *name,
>>     this->data.max_array_access = 0;
>>     this->data.atomic.buffer_index = 0;
>>     this->data.atomic.offset = 0;
>> +   this->ssa_assignment = NULL;
>> +   this->ssa_phi = NULL;
>> +   this->ssa_call = NULL;
>>
>>     if (type != NULL) {
>>        if (type->base_type == GLSL_TYPE_SAMPLER)
>> @@ -1722,12 +1756,19 @@ steal_memory(ir_instruction *ir, void *new_ctx)
>>  {
>>     ir_variable *var = ir->as_variable();
>>     ir_constant *constant = ir->as_constant();
>> +   ir_dereference_variable *deref = ir->as_dereference_variable();
>> +   ir_phi *phi = ir->as_phi();
>> +   ir_phi_loop_begin *phi_loop_begin = ir->as_phi_loop_begin();
>> +   ir_phi_loop_end *phi_loop_end = ir->as_phi_loop_end();
>>     if (var != NULL && var->constant_value != NULL)
>>        steal_memory(var->constant_value, ir);
>>
>>     if (var != NULL && var->constant_initializer != NULL)
>>        steal_memory(var->constant_initializer, ir);
>>
>> +   if (deref != NULL && deref->var->data.mode == ir_var_temporary_ssa)
>> +      steal_memory(deref->var, ir);
>> +
>>
>
> I think this is rendundant.  Aren't SSA variables already stolen by the
> call to steal_memory(phi->dest, new_ctx) below?
>
>
>>     /* The components of aggregate constants are not visited by the normal
>>      * visitor, so steal their values by hand.
>>      */
>> @@ -1744,6 +1785,21 @@ steal_memory(ir_instruction *ir, void *new_ctx)
>>        }
>>     }
>>
>> +   if (phi != NULL)
>> +      steal_memory(phi->dest, new_ctx);
>> +
>> +   if (phi_loop_begin != NULL) {
>> +      foreach_list(n, &phi_loop_begin->continue_srcs) {
>> +        ralloc_steal(new_ctx, n);
>> +      }
>> +   }
>> +
>> +   if (phi_loop_end != NULL) {
>> +      foreach_list(n, &phi_loop_end->break_srcs) {
>> +        ralloc_steal(new_ctx, n);
>> +      }
>> +   }
>> +
>>     ralloc_steal(new_ctx, ir);
>>  }
>>
>> diff --git a/src/glsl/ir.h b/src/glsl/ir.h
>> index d1e790d..8af8eed 100644
>> --- a/src/glsl/ir.h
>> +++ b/src/glsl/ir.h
>> @@ -78,6 +78,9 @@ enum ir_node_type {
>>     ir_type_if,
>>     ir_type_loop,
>>     ir_type_loop_jump,
>> +   ir_type_phi_if,
>> +   ir_type_phi_loop_begin,
>> +   ir_type_phi_loop_end,
>>     ir_type_return,
>>     ir_type_swizzle,
>>     ir_type_texture,
>> @@ -139,6 +142,10 @@ public:
>>     virtual class ir_discard *           as_discard()          { return
>> NULL; }
>>     virtual class ir_jump *              as_jump()             { return
>> NULL; }
>>     virtual class ir_loop_jump *         as_loop_jump()        { return
>> NULL; }
>> +   virtual class ir_phi *               as_phi()              { return
>> NULL; }
>> +   virtual class ir_phi_if *            as_phi_if()           { return
>> NULL; }
>> +   virtual class ir_phi_loop_begin *    as_phi_loop_begin()   { return
>> NULL; }
>> +   virtual class ir_phi_loop_end *      as_phi_loop_end()     { return
>> NULL; }
>>     /*@}*/
>>
>>     /**
>> @@ -289,10 +296,11 @@ enum ir_variable_mode {
>>     ir_var_function_in,
>>     ir_var_function_out,
>>     ir_var_function_inout,
>> -   ir_var_const_in,    /**< "in" param that must be a constant
>> expression */
>> -   ir_var_system_value, /**< Ex: front-face, instance-id, etc. */
>> -   ir_var_temporary,   /**< Temporary variable generated during
>> compilation. */
>> -   ir_var_mode_count   /**< Number of variable modes */
>> +   ir_var_const_in,     /**< "in" param that must be a constant
>> expression */
>> +   ir_var_system_value,  /**< Ex: front-face, instance-id, etc. */
>> +   ir_var_temporary,    /**< Temporary variable generated during
>> compilation. */
>> +   ir_var_temporary_ssa, /**< Temporary variable with only one
>> definition. */
>> +   ir_var_mode_count    /**< Number of variable modes */
>>  };
>>
>>  /**
>> @@ -736,6 +744,43 @@ public:
>>      */
>>     ir_constant *constant_initializer;
>>
>> +   /**
>> +    * Assignment that creates this variable, if it is an SSA temporary
>> +    *
>> +    * SSA variables are declared in the assignment/phi node/return where
>> they
>> +    * are defined, unlike everything else where the lhs defereference
>> always
>> +    * points to an ir_variable somewhere else in the ir tree. Storing the
>> +    * parent assignment here allows us to more easily traverse the
>> use-def
>> +    * chains during optimizations.
>> +    *
>> +    * \sa ir_variable::ssa_phi
>> +    * \sa ir_variable::ssa_call
>> +    */
>> +
>> +   ir_assignment *ssa_assignment;
>>
>
> Nit pick: normally we don't put a blank line between the comment and the
> variable it's documenting.
>
>
>> +
>> +   /**
>> +    * Phi node that creates this variable, if it is an SSA temporary
>> +    *
>> +    * Note: if this variable is an SSA temporary, then one of this field,
>> +    * \c ::ssa_assignment, or \c ::ssa_call is non-null, but not both.
>>
>
> Actually, I think the invariant is:
> - If this variable is an SSA temporary, then exactly one of
> ssa_assignment, ssa_phi, or ssa_call is non-null.
> - If this variable is not an SSA temporary, then all of ssa_assignment,
> ssa_phi, and ssa_call are null.
>
>
>> +    *
>> +    * \sa ir_variable::ssa_assignment
>> +    * \sa ir_variable::ssa_call
>> +    */
>> +
>> +   ir_phi *ssa_phi;
>> +
>> +   /**
>> +    * Call whose return dereference creates this variable, if it is an
>> SSA
>> +    * temporary
>> +    *
>> +    * \sa ir_variable::ssa_assignment
>> +    * \sa ir_variable::ssa_phi
>> +    */
>> +
>> +   ir_call *ssa_call;
>>
>
> Rather than have separate ir_assignment, ir_phi, and ir_call pointers, at
> most one of which is ever non-null, I wonder if it would be better to have
> a single pointer to an ir_instruction.  It would use less memory, and more
> importantly, it would be less likely to be misused by mistake, because the
> invariant is less complex.
>
>
>> +
>>  private:
>>     /**
>>      * For variables that are in an interface block or are an instance of
>> an
>> @@ -987,6 +1032,8 @@ public:
>>     exec_list  then_instructions;
>>     /** List of ir_instruction for the body of the else branch */
>>     exec_list  else_instructions;
>> +   /** List of phi nodes at the end of the if */
>> +   exec_list phi_nodes;
>>  };
>>
>>
>> @@ -1013,6 +1060,147 @@ public:
>>
>>     /** List of ir_instruction that make up the body of the loop. */
>>     exec_list body_instructions;
>> +
>> +   /** List of phi nodes at the beginning of the body of the loop. */
>> +   exec_list begin_phi_nodes;
>> +   /** List of phi nodes immediately after the loop. */
>> +   exec_list end_phi_nodes;
>> +};
>> +
>> +/**
>> + * Base class for instructions representing phi nodes
>> + *
>> + * There are three join points where phi nodes can occur:
>> + * * after an if statement
>> + * * at the beginning of a loop
>> + * * after a loop
>> + *
>> + * Each of these points has an associated subclass of ir_phi with places
>> to
>> + * store the input variables corresponding to each predecessor basic
>> block.
>> + * Note that an input may be null, in which case the value coming from
>> that
>> + * basic block is undefined. A null input conveys important information
>> about
>> + * the original program before the conversion to SSA, and so it *cannot*
>> be
>> + * optimized away (see "Global Code Motion Global Value Numbering" by
>> Click for
>> + * details).
>>
>
> It would be nice to refer specifically to the part of the paper explaining
> why nulls need to be preserved (figure 3).  It would also be nice to have a
> sentence or two explaining what important information is conveyed by a null
> input, for people who don't want to go read the paper (it's a good paper,
> though.  Everyone go read it).  My current undertsanding, and please
> correct me if I'm wrong, is that since the global code motion algorithm
> schedules each instruction solely based on data dependencies (ignoring the
> basic block that the instruction originally resided in), it's necessary for
> the data dependencies to reflect all of the control flow information
> present in the original program.  In cases where the dominance tree alone
> is not sufficient to prove that a variable is initialized before it is used
> (such as where a variable is initialized in an "if (first_iteration)" block
> inside a loop), if we removed nulls from phi nodes, the data dependencies
> would no longer be sufficient to reconstruct the control flow.
>
> I'd also recommend adding a piglit test based on figure 3, so that we can
> verify that null inputs to phi nodes really are properly preserved.
>
>
>> + */
>> +
>> +
>> +class ir_phi : public ir_instruction {
>> +public:
>> +   virtual ir_phi *clone(void *mem_ctx, hash_table *ht) const;
>> +
>> +   virtual void accept(ir_visitor *v)
>> +   {
>> +      v->visit(this);
>> +   }
>> +
>> +   virtual ir_visitor_status accept(ir_hierarchical_visitor *);
>> +
>> +   virtual ir_phi *as_phi()
>> +   {
>> +      return this;
>> +   }
>> +
>> +   ir_variable *dest;
>> +
>> +protected:
>> +   ir_phi();
>> +};
>> +
>> +
>> +class ir_phi_if : public ir_phi {
>> +public:
>> +   ir_phi_if(ir_variable *dest, ir_variable *if_src, ir_variable
>> *else_src);
>> +
>> +   virtual ir_phi_if *clone(void *mem_ctx, hash_table *ht) const;
>> +
>> +   virtual void accept(ir_visitor *v)
>> +   {
>> +      v->visit(this);
>> +   }
>> +
>> +   virtual ir_visitor_status accept(ir_hierarchical_visitor *);
>> +
>> +   virtual ir_phi_if *as_phi_if()
>> +   {
>> +      return this;
>> +   }
>> +
>> +   /** the value dest takes if the if branch is taken */
>> +   ir_variable *if_src;
>> +   /** the value dest takes if the if branch is not taken */
>> +   ir_variable *else_src;
>>
>
> Elsewhere in the code we refer to the two branches as the "then branch"
> and the "else branch".  Can we rename if_src to then_src (and reword the
> comments accordingly)?
>
>
>> +};
>> +
>> +
>> +/**
>> + * Represents a phi node source from a loop jump instruction (break or
>> continue)
>> + */
>> +
>> +struct ir_phi_jump_src : public exec_node {
>> +   ir_loop_jump *jump;
>> +   ir_variable *src;
>> +};
>> +
>> +
>> +class ir_phi_loop_begin : public ir_phi {
>> +public:
>> +   ir_phi_loop_begin(ir_variable *dest, ir_variable *enter_src,
>> ir_variable *repeat_src);
>> +
>> +   virtual ir_phi_loop_begin *clone(void *mem_ctx, hash_table *ht) const;
>> +
>> +   virtual void accept(ir_visitor *v)
>> +   {
>> +      v->visit(this);
>> +   }
>> +
>> +   virtual ir_visitor_status accept(ir_hierarchical_visitor *);
>> +
>> +   virtual ir_phi_loop_begin *as_phi_loop_begin()
>> +   {
>> +      return this;
>> +   }
>> +
>> +   /** the value dest takes on the first iteration of the loop */
>> +   ir_variable *enter_src;
>> +
>> +   /**
>> +    * The value dest takes after reaching the end of the loop and going
>> back to
>> +    * the beginning.
>> +    */
>> +   ir_variable *repeat_src;
>> +
>> +   /**
>> +    * A list of ir_phi_jump_src structures representing the value dest
>> will take
>> +    * after each continue.
>> +    */
>> +   exec_list continue_srcs;
>> +};
>> +
>> +
>> +class ir_phi_loop_end : public ir_phi {
>> +public:
>> +   ir_phi_loop_end(ir_variable *dest);
>> +
>> +   virtual ir_phi_loop_end *clone(void *mem_ctx, hash_table *ht) const;
>> +
>> +   virtual void accept(ir_visitor *v)
>> +   {
>> +      v->visit(this);
>> +   }
>> +
>> +   virtual ir_visitor_status accept(ir_hierarchical_visitor *);
>> +
>> +   virtual ir_phi_loop_end *as_phi_loop_end()
>> +   {
>> +      return this;
>> +   }
>> +
>> +   /**
>> +    * A list of ir_phi_jump_src structures representing the value dest
>> will take
>> +    * after each break.
>> +    */
>> +   exec_list break_srcs;
>>  };
>>
>
> That's my review of ir.cpp and ir.h.  I'll follow up with more review
> comments in another email.
>
> Oh, and before I forget:
>
> This patch also needs to update unit tests.  When I run "make check" I get
> the following error message:
>
>
> ============================================================================
> Testsuite summary for Mesa 10.1.0-devel
>
> ============================================================================
> # TOTAL: 6
> # PASS:  5
> # SKIP:  0
> # XFAIL: 0
> # FAIL:  1
> # XPASS: 0
> # ERROR: 0
>
> ============================================================================
> See src/glsl/test-suite.log
> Please report to https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa
>
> ============================================================================
>
> Looking at src/glsl/test-suite.log, the failures are in
> tests/lower_jumps/.  It looks like you just need to update the *.expected
> files to reflect the new fields in "if" and "loop" instructions.
>

So it seemed to me like updating the unit tests would be a bit more work,
which is why I pushed it off (and made a note of that in the cover letter).
I guess fixing the *.expected tests would keep the tests from breaking, but
what really needs to happen AFAIU is

1) Updating ir_reader to match the changes I made to the printer so that it
supports SSA
2) Fixing input and expected files in existing tests, since we've now
changed s-exprs a little (which is why the tests fail now)
3) Adding some unit tests for the conversion to (and maybe from) SSA to
exercise some of the edge cases there

1 and 2 are independent changes, but if we don't do them both in this
commit then we'll break "make check"... do I split them out into separate
commits?

Connor
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20140205/fd21d7de/attachment-0001.html>


More information about the mesa-dev mailing list