[Mesa-dev] [PATCH 5/5] glsl: Pack flat "varyings" of mixed types together.

Mon Jan 7 13:25:00 PST 2013

On 01/02/2013 10:59 AM, Paul Berry wrote:
> This patch enhances the varying packing code so that flat varyings of
> uint, int, and float types can be packed together.
>
> We accomplish this in lower_packed_varyings.cpp by making the type of
> all flat varyings ivec4, and then using information-preserving type
> conversions (e.g. ir_unop_bitcast_f2i) to convert all other types to
> ints.
>
> The varying_matches::compute_packing_class() function is updated to
> reflect the fact that varying packing no longer needs to segregate
> varyings of different base types.
>
> Fixes piglit test varying-packing-mixed-types.
> ---
>   src/glsl/link_varyings.cpp         | 20 +++++++----
>   src/glsl/lower_packed_varyings.cpp | 72 +++++++++++++++++++++++++++++++++-----
>   2 files changed, 78 insertions(+), 14 deletions(-)
>
> diff --git a/src/glsl/link_varyings.cpp b/src/glsl/link_varyings.cpp
> index b9c3f5d..5c27f23 100644
> --- a/src/glsl/link_varyings.cpp
> +++ b/src/glsl/link_varyings.cpp
> @@ -777,17 +777,25 @@ varying_matches::store_locations(unsigned producer_base,
>   unsigned
>   varying_matches::compute_packing_class(ir_variable *var)
>   {
> -   /* In this initial implementation we conservatively assume that variables
> -    * can only be packed if their base type (float/int/uint/bool) matches and
> -    * their interpolation and centroid qualifiers match.
> +   /* Without help from the back-end, there is no way to pack together
> +    * variables with different interpolation types, because
> +    * lower_packed_varyings must choose exactly one interpolation type for
> +    * each packed varying it creates.
>       *
> -    * TODO: relax these restrictions when the driver back-end permits.
> +    * However, we can safely pack together floats, ints, and uints, because:
> +    *
> +    * - varyings of base type "int" and "uint" must use the "flat"
> +    *   interpolation type, which can only occur in GLSL 1.30 and above.
> +    *
> +    * - On platforms that support GLSL 1.30 and above, lower_packed_varyings
> +    *   can store flat floats as ints without losing any information (using
> +    *   the ir_unop_bitcast_* opcodes).
> +    *
> +    * Therefore, the packing class depends only on the interpolation type.
>       */
>      unsigned packing_class = var->centroid ? 1 : 0;
>      packing_class *= 4;
>      packing_class += var->interpolation;
> -   packing_class *= GLSL_TYPE_ERROR;
> -   packing_class += var->type->get_scalar_type()->base_type;
>      return packing_class;
>   }
>
> diff --git a/src/glsl/lower_packed_varyings.cpp b/src/glsl/lower_packed_varyings.cpp
> index 09c551c..9cf1075 100644
> --- a/src/glsl/lower_packed_varyings.cpp
> +++ b/src/glsl/lower_packed_varyings.cpp
> @@ -66,6 +66,10 @@
>    * performance.  However, hopefully in most cases the performance loss will
>    * either be absorbed by a later optimization pass, or it will be offset by
>    * memory bandwidth savings (because fewer varyings are used).
> + *
> + * This lowering pass also packs flat floats, ints, and uints together, by
> + * using ivec4 as the base type of flat "varyings", and using appropriate
> + * casts to convert floats and uints into ints.
>    */
>
>   #include "glsl_symbol_table.h"
> @@ -90,6 +94,7 @@ public:
>      void run(exec_list *instructions);
>
>   private:
> +   ir_assignment *bitwise_assign(ir_rvalue *lhs, ir_rvalue *rhs);
>      unsigned lower_rvalue(ir_rvalue *rvalue, unsigned fine_location,
>                            ir_variable *unpacked_var, const char *name);
>      unsigned lower_arraylike(ir_rvalue *rvalue, unsigned array_size,
> @@ -181,6 +186,55 @@ lower_packed_varyings_visitor::run(exec_list *instructions)
>      }
>   }
>
> +
> +/**
> + * Make an ir_assignment from \c rhs to \c lhs, performing appropriate
> + * bitcasts if necessary to match up types.
> + */
> +ir_assignment *
> +lower_packed_varyings_visitor::bitwise_assign(ir_rvalue *lhs, ir_rvalue *rhs)
> +{
> +   if (lhs->type->base_type != rhs->type->base_type) {
> +      /* Since we only mix types in flat varyings, and we always store flat
> +       * varyings as type ivec4, we need only produce conversions to and from
> +       * int.
> +       */
> +      switch (rhs->type->base_type) {
> +      case GLSL_TYPE_UINT:
> +         rhs = new(this->mem_ctx)
> +            ir_expression(ir_unop_u2i, lhs->type, rhs);
> +         break;
> +      case GLSL_TYPE_FLOAT:
> +         rhs = new(this->mem_ctx)
> +            ir_expression(ir_unop_bitcast_f2i, lhs->type, rhs);
> +         break;
> +      case GLSL_TYPE_INT:
> +         switch (lhs->type->base_type) {
> +         case GLSL_TYPE_UINT:
> +            rhs = new(this->mem_ctx)
> +               ir_expression(ir_unop_i2u, lhs->type, rhs);
> +            break;
> +         case GLSL_TYPE_FLOAT:
> +            rhs = new(this->mem_ctx)
> +               ir_expression(ir_unop_bitcast_i2f, lhs->type, rhs);
> +            break;
> +         case GLSL_TYPE_INT:
> +            /* No conversion needed */
> +            break;
> +         default:
> +            assert(!"Unexpected type conversion while lowering varyings");
> +            break;
> +         }
> +         break;
> +      default:
> +         assert(!"Unexpected type conversion while lowering varyings");
> +         break;
> +      }
> +   }
> +   return new(this->mem_ctx) ir_assignment(lhs, rhs);
> +}

I find this function rather difficult to follow.  It does both all the 
conversions necessary for packing -and- unpacking (which I believe are 
non-overlapping sets).  Also, the it's easy to forget about the 
surrounding lhs->type != rhs->type check which effectively restricts 
this to dealing with flat data.

It might be simpler if it was split into separate pack/unpack functions.

Either way, this series gets a:
Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>

> +
>   /**
>    * Recursively pack or unpack the given varying (or portion of a varying) by
>    * traversing all of its constituent vectors.
> @@ -262,12 +316,10 @@ lower_packed_varyings_visitor::lower_rvalue(ir_rvalue *rvalue,
>         ir_swizzle *swizzle = new(this->mem_ctx)
>            ir_swizzle(packed_deref, swizzle_values, components);
>         if (this->mode == ir_var_out) {
> -         ir_assignment *assignment = new(this->mem_ctx)
> -            ir_assignment(swizzle, rvalue);
> +         ir_assignment *assignment = this->bitwise_assign(swizzle, rvalue);
>            this->main_instructions->push_tail(assignment);
>         } else {
> -         ir_assignment *assignment = new(this->mem_ctx)
> -            ir_assignment(rvalue, swizzle);
> +         ir_assignment *assignment = this->bitwise_assign(rvalue, swizzle);
>            this->main_instructions->push_head(assignment);
>         }
>         return fine_location + components;
> @@ -306,8 +358,9 @@ lower_packed_varyings_visitor::lower_arraylike(ir_rvalue *rvalue,
>    * If no packed varying has been created for the given varying location yet,
>    * create it and add it to the shader before returning it.
>    *
> - * The newly created varying inherits its base type (float, uint, or int) and
> - * interpolation parameters from \c unpacked_var.
> + * The newly created varying inherits its interpolation parameters from \c
> + * unpacked_var.  Its base type is ivec4 if we are lowering a flat varying,
> + * vec4 otherwise.
>    */
>   ir_variable *
>   lower_packed_varyings_visitor::get_packed_varying(unsigned location,
> @@ -318,8 +371,11 @@ lower_packed_varyings_visitor::get_packed_varying(unsigned location,
>      assert(slot < locations_used);
>      if (this->packed_varyings[slot] == NULL) {
>         char *packed_name = ralloc_asprintf(this->mem_ctx, "packed:%s", name);
> -      const glsl_type *packed_type = glsl_type::get_instance(
> -            unpacked_var->type->get_scalar_type()->base_type, 4, 1);
> +      const glsl_type *packed_type;
> +      if (unpacked_var->interpolation == INTERP_QUALIFIER_FLAT)
> +         packed_type = glsl_type::ivec4_type;
> +      else
> +         packed_type = glsl_type::vec4_type;
>         ir_variable *packed_var = new(this->mem_ctx)
>            ir_variable(packed_type, packed_name, this->mode);
>         packed_var->centroid = unpacked_var->centroid;
>