[Mesa-dev] [PATCH V4] glsl: disable varying packing when its not safe

Samuel Iglesias Gonsálvez siglesias at igalia.com
Thu Mar 17 11:24:33 UTC 2016


Reviewed-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>

On 16/03/16 12:30, Timothy Arceri wrote:
> In GL 4.4+ there is no guarantee that interpolation qualifiers will
> match between stages so we cannot safely pack varyings using the
> current packing pass in Mesa.
> 
> We also disable packing on outerward facing interfaces for SSO
> because in ES we need to retain the unpacked varying information
> for draw time validation. For desktop GL we could allow packing for
> SSO in versions < 4.4 but its just safer not to do so.
> 
> We do however enable packing on individual arrays, structs, and
> matrices as these are required by the transform feedback code and it
> is still safe to do so.
> 
> Finally we also enable packing when a varying is only used for
> transform feedback and its not a SSO.
> 
> This fixes all remaining rendering issues with the dEQP SSO tests,
> the only issues remaining with thoses tests are to do with validation.
> 
> Note: There is still one remaining SSO bug that this patch doesn't fix.
> Their is a chance that VS -> TCS will have mismatching interfaces
> because we pack VS output in case its used by transform feedback but
> don't pack TCS input for performance reasons. This patch will make the
> situation better but doesn't fix it.
> 
> V4: fix out of order function params after rebase, make sure packing
> still disabled in tess stages. Update comments as to why we disable
> packing on SSO.
> 
> V3: ES 3.1 *does* require interpolation to match so don't disable
> packing there. Rebased on master rather than on enhanced layouts
> component packing series.
> 
> V2: Make is_varying_packing_safe() a function in the varying_matches
> class, fix spelling (Matt) and make sure to remove the outer array
> when dealing with Geom and Tess shaders where appropriate.
> Lastly fix piglit regression in new piglit test and document the
> undefined behaviour it depends on:
> arb_separate_shader_objects/execution/vs-gs-linking.shader_test
> 
> Cc: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
> Cc: Kenneth Graunke <kenneth at whitecape.org>
> ---
>  src/compiler/glsl/ir.h                      |   7 +
>  src/compiler/glsl/ir_optimization.h         |   2 +-
>  src/compiler/glsl/link_varyings.cpp         | 196 +++++++++++++++++++++-------
>  src/compiler/glsl/lower_packed_varyings.cpp |  28 +++-
>  4 files changed, 180 insertions(+), 53 deletions(-)
> 
> diff --git a/src/compiler/glsl/ir.h b/src/compiler/glsl/ir.h
> index f451967..b74d68a 100644
> --- a/src/compiler/glsl/ir.h
> +++ b/src/compiler/glsl/ir.h
> @@ -720,6 +720,13 @@ public:
>        unsigned is_unmatched_generic_inout:1;
>  
>        /**
> +       * Is this varying used only by transform feedback?
> +       *
> +       * This is used by the linker to decide if its safe to pack the varying.
> +       */
> +      unsigned is_xfb_only:1;
> +
> +      /**
>         * If non-zero, then this variable may be packed along with other variables
>         * into a single varying slot, so this offset should be applied when
>         * accessing components.  For example, an offset of 1 means that the x
> diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
> index 30c95f4..2d77376 100644
> --- a/src/compiler/glsl/ir_optimization.h
> +++ b/src/compiler/glsl/ir_optimization.h
> @@ -125,7 +125,7 @@ void lower_ubo_reference(struct gl_shader *shader);
>  void lower_packed_varyings(void *mem_ctx,
>                             unsigned locations_used, ir_variable_mode mode,
>                             unsigned gs_input_vertices, gl_shader *shader,
> -                           bool disable_varying_packing);
> +                           bool disable_varying_packing, bool xfb_enabled);
>  bool lower_vector_insert(exec_list *instructions, bool lower_nonconstant_index);
>  bool lower_vector_derefs(gl_shader *shader);
>  void lower_named_interface_blocks(void *mem_ctx, gl_shader *shader);
> diff --git a/src/compiler/glsl/link_varyings.cpp b/src/compiler/glsl/link_varyings.cpp
> index 806191b..44fc8f6 100644
> --- a/src/compiler/glsl/link_varyings.cpp
> +++ b/src/compiler/glsl/link_varyings.cpp
> @@ -826,7 +826,7 @@ namespace {
>  class varying_matches
>  {
>  public:
> -   varying_matches(bool disable_varying_packing,
> +   varying_matches(bool disable_varying_packing, bool xfb_enabled,
>                     gl_shader_stage producer_stage,
>                     gl_shader_stage consumer_stage);
>     ~varying_matches();
> @@ -836,14 +836,30 @@ public:
>     void store_locations() const;
>  
>  private:
> +   bool is_varying_packing_safe(const glsl_type *type,
> +                                const ir_variable *var);
> +
>     /**
>      * If true, this driver disables varying packing, so all varyings need to
>      * be aligned on slot boundaries, and take up a number of slots equal to
>      * their number of matrix columns times their array size.
> +    *
> +    * Packing may also be disabled because our current packing method is not
> +    * safe in SSO or versions of OpenGL where interpolation qualifiers are not
> +    * guaranteed to match across stages.
>      */
>     const bool disable_varying_packing;
>  
>     /**
> +    * If true, this driver has transform feedback enabled. The transform
> +    * feedback code requires at least some packing be done even when varying
> +    * packing is disabled, fortunately where transform feedback requires
> +    * packing it's safe to override the disabled setting. See
> +    * is_varying_packing_safe().
> +    */
> +   const bool xfb_enabled;
> +
> +   /**
>      * Enum representing the order in which varyings are packed within a
>      * packing class.
>      *
> @@ -862,6 +878,7 @@ private:
>     static unsigned compute_packing_class(const ir_variable *var);
>     static packing_order_enum compute_packing_order(const ir_variable *var);
>     static int match_comparator(const void *x_generic, const void *y_generic);
> +   static int xfb_comparator(const void *x_generic, const void *y_generic);
>  
>     /**
>      * Structure recording the relationship between a single producer output
> @@ -917,9 +934,11 @@ private:
>  } /* anonymous namespace */
>  
>  varying_matches::varying_matches(bool disable_varying_packing,
> +                                 bool xfb_enabled,
>                                   gl_shader_stage producer_stage,
>                                   gl_shader_stage consumer_stage)
>     : disable_varying_packing(disable_varying_packing),
> +     xfb_enabled(xfb_enabled),
>       producer_stage(producer_stage),
>       consumer_stage(consumer_stage)
>  {
> @@ -942,6 +961,24 @@ varying_matches::~varying_matches()
>  
>  
>  /**
> + * Packing is always safe on individual arrays, structure and matices. It is
> + * also safe if the varying is only used for transform feedback.
> + */
> +bool
> +varying_matches::is_varying_packing_safe(const glsl_type *type,
> +                                         const ir_variable *var)
> +{
> +   if (consumer_stage == MESA_SHADER_TESS_EVAL ||
> +       consumer_stage == MESA_SHADER_TESS_CTRL ||
> +       producer_stage == MESA_SHADER_TESS_CTRL)
> +      return false;
> +
> +   return xfb_enabled && (type->is_array() || type->is_record() ||
> +                          type->is_matrix() || var->data.is_xfb_only);
> +}
> +
> +
> +/**
>   * Record the given producer/consumer variable pair in the list of variables
>   * that should later be assigned locations.
>   *
> @@ -1020,7 +1057,7 @@ varying_matches::record(ir_variable *producer_var, ir_variable *consumer_var)
>        = this->compute_packing_class(var);
>     this->matches[this->num_matches].packing_order
>        = this->compute_packing_order(var);
> -   if (this->disable_varying_packing) {
> +   if (this->disable_varying_packing && !is_varying_packing_safe(type, var)) {
>        unsigned slots = type->count_attribute_slots(false);
>        this->matches[this->num_matches].num_components = slots * 4;
>     } else {
> @@ -1046,37 +1083,28 @@ varying_matches::assign_locations(struct gl_shader_program *prog,
>                                    uint64_t reserved_slots,
>                                    bool separate_shader)
>  {
> -   /* We disable varying sorting for separate shader programs for the
> -    * following reasons:
> -    *
> -    * 1/ All programs must sort the code in the same order to guarantee the
> -    *    interface matching. However varying_matches::record() will change the
> -    *    interpolation qualifier of some stages.
> -    *
> -    * 2/ GLSL version 4.50 removes the matching constrain on the interpolation
> -    *    qualifier.
> -    *
> -    * From Section 4.5 (Interpolation Qualifiers) of the GLSL 4.40 spec:
> -    *
> -    *    "The type and presence of interpolation qualifiers of variables with
> -    *    the same name declared in all linked shaders for the same cross-stage
> -    *    interface must match, otherwise the link command will fail.
> -    *
> -    *    When comparing an output from one stage to an input of a subsequent
> -    *    stage, the input and output don't match if their interpolation
> -    *    qualifiers (or lack thereof) are not the same."
> -    *
> -    *    "It is a link-time error if, within the same stage, the interpolation
> -    *    qualifiers of variables of the same name do not match."
> +   /* If packing has been disabled then we cannot safely sort the varyings by
> +    * class as it may mean we are using a version of OpenGL where
> +    * interpolation qualifiers are not guaranteed to be matching across
> +    * shaders, sorting in this case could result in mismatching shader
> +    * interfaces.
> +    * When packing is disabled the sort orders varyings used by transform
> +    * feedback first, but also depends on *undefined behaviour* of qsort to
> +    * reverse the order of the varyings. See: xfb_comparator().
>      */
> -   if (!separate_shader) {
> +   if (!this->disable_varying_packing) {
>        /* Sort varying matches into an order that makes them easy to pack. */
>        qsort(this->matches, this->num_matches, sizeof(*this->matches),
>              &varying_matches::match_comparator);
> +   } else {
> +      /* Only sort varyings that are only used by transform feedback. */
> +      qsort(this->matches, this->num_matches, sizeof(*this->matches),
> +            &varying_matches::xfb_comparator);
>     }
>  
>     unsigned generic_location = 0;
>     unsigned generic_patch_location = MAX_VARYING*4;
> +   bool previous_var_xfb_only = false;
>  
>     for (unsigned i = 0; i < this->num_matches; i++) {
>        unsigned *location = &generic_location;
> @@ -1100,16 +1128,30 @@ varying_matches::assign_locations(struct gl_shader_program *prog,
>        /* Advance to the next slot if this varying has a different packing
>         * class than the previous one, and we're not already on a slot
>         * boundary.
> +       *
> +       * Also advance to the next slot if packing is disabled. This makes sure
> +       * we don't assign varyings the same locations which is possible
> +       * because we still pack individual arrays, records and matrices even
> +       * when packing is disabled. Note we don't advance to the next slot if
> +       * we can pack varyings together that are only used for transform
> +       * feedback.
>         */
> -      if (i > 0 &&
> -          this->matches[i - 1].packing_class
> -          != this->matches[i].packing_class) {
> +      if ((this->disable_varying_packing &&
> +           !(previous_var_xfb_only && var->data.is_xfb_only)) ||
> +          (i > 0 && this->matches[i - 1].packing_class
> +          != this->matches[i].packing_class )) {
>           *location = ALIGN(*location, 4);
>        }
>  
> +      previous_var_xfb_only = var->data.is_xfb_only;
> +
>        unsigned num_elements =  type->count_attribute_slots(is_vertex_input);
> -      unsigned slot_end = this->disable_varying_packing ? 4 :
> -         type->without_array()->vector_elements;
> +      unsigned slot_end;
> +      if (this->disable_varying_packing &&
> +          !is_varying_packing_safe(type, var))
> +         slot_end = 4;
> +      else
> +         slot_end = type->without_array()->vector_elements;
>        slot_end += *location - 1;
>  
>        /* FIXME: We could be smarter in the below code and loop back over
> @@ -1133,7 +1175,8 @@ varying_matches::assign_locations(struct gl_shader_program *prog,
>           /* Increase the slot to make sure there is enough room for next
>            * array element.
>            */
> -         if (this->disable_varying_packing)
> +         if (this->disable_varying_packing &&
> +             !is_varying_packing_safe(type, var))
>              slot_end += 4;
>           else
>              slot_end += type->without_array()->vector_elements;
> @@ -1259,6 +1302,32 @@ varying_matches::match_comparator(const void *x_generic, const void *y_generic)
>  
>  
>  /**
> + * Comparison function passed to qsort() to sort varyings used only by
> + * transform feedback when packing of other varyings is disabled.
> + */
> +int
> +varying_matches::xfb_comparator(const void *x_generic, const void *y_generic)
> +{
> +   const match *x = (const match *) x_generic;
> +
> +   if (x->producer_var != NULL && x->producer_var->data.is_xfb_only)
> +         return match_comparator(x_generic, y_generic);
> +
> +   /* FIXME: When the comparator returns 0 it means the elements being
> +    * compared are equivalent. However the qsort documentation says:
> +    *
> +    *    "The order of equivalent elements is undefined."
> +    *
> +    * In practice the sort ends up reversing the order of the varyings which
> +    * means locations are also assigned in this reversed order and happens to
> +    * be what we want. This is also whats happening in
> +    * varying_matches::match_comparator().
> +    */
> +   return 0;
> +}
> +
> +
> +/**
>   * Is the given variable a varying variable to be counted against the
>   * limit in ctx->Const.MaxVarying?
>   * This includes variables such as texcoords, colors and generic
> @@ -1573,26 +1642,60 @@ assign_varying_locations(struct gl_context *ctx,
>                           unsigned num_tfeedback_decls,
>                           tfeedback_decl *tfeedback_decls)
>  {
> -   if (ctx->Const.DisableVaryingPacking) {
> -      /* Transform feedback code assumes varyings are packed, so if the driver
> -       * has disabled varying packing, make sure it does not support transform
> -       * feedback.
> -       */
> -      assert(!ctx->Extensions.EXT_transform_feedback);
> -   }
> -
>     /* Tessellation shaders treat inputs and outputs as shared memory and can
>      * access inputs and outputs of other invocations.
>      * Therefore, they can't be lowered to temps easily (and definitely not
>      * efficiently).
>      */
> -   bool disable_varying_packing =
> -      ctx->Const.DisableVaryingPacking ||
> +   bool unpackable_tess =
>        (consumer && consumer->Stage == MESA_SHADER_TESS_EVAL) ||
>        (consumer && consumer->Stage == MESA_SHADER_TESS_CTRL) ||
>        (producer && producer->Stage == MESA_SHADER_TESS_CTRL);
>  
> -   varying_matches matches(disable_varying_packing,
> +   /* Transform feedback code assumes varying arrays are packed, so if the
> +    * driver has disabled varying packing, make sure to at least enable
> +    * packing required by transform feedback.
> +    */
> +   bool xfb_enabled =
> +      ctx->Extensions.EXT_transform_feedback && !unpackable_tess;
> +
> +   /* Disable varying packing for GL 4.4+ as there is no guarantee
> +    * that interpolation qualifiers will match between shaders in these
> +    * versions. We also disable packing on outerward facing interfaces for
> +    * SSO because in ES we need to retain the unpacked varying information
> +    * for draw time validation. For desktop GL we could allow packing for
> +    * versions < 4.4 but its just safer not to do packing.
> +    *
> +    * Packing is still enabled on individual arrays, structs, and matrices as
> +    * these are required by the transform feedback code and it is still safe
> +    * to do so. We also enable packing when a varying is only used for
> +    * transform feedback and its not a SSO.
> +    *
> +    * Varying packing currently only packs together varyings with matching
> +    * interpolation qualifiers as the backends assume all packed components
> +    * are to be processed in the same way. Therefore we cannot do packing in
> +    * these versions of GL without the risk of mismatching interfaces.
> +    *
> +    * From Section 4.5 (Interpolation Qualifiers) of the GLSL 4.30 spec:
> +    *
> +    *    "The type and presence of interpolation qualifiers of variables with
> +    *    the same name declared in all linked shaders for the same cross-stage
> +    *    interface must match, otherwise the link command will fail.
> +    *
> +    *    When comparing an output from one stage to an input of a subsequent
> +    *    stage, the input and output don't match if their interpolation
> +    *    qualifiers (or lack thereof) are not the same."
> +    *
> +    * This text was also in at least revison 7 of the 4.40 spec but is no
> +    * longer in revision 9 and not in the 4.50 spec.
> +    */
> +   bool disable_varying_packing =
> +      ctx->Const.DisableVaryingPacking || unpackable_tess;
> +   if ((ctx->API == API_OPENGL_CORE && ctx->Version >= 44) ||
> +       (prog->SeparateShader && (producer == NULL || consumer == NULL)))
> +      disable_varying_packing = true;
> +
> +   varying_matches matches(disable_varying_packing, xfb_enabled,
>                             producer ? producer->Stage : (gl_shader_stage)-1,
>                             consumer ? consumer->Stage : (gl_shader_stage)-1);
>     hash_table *tfeedback_candidates
> @@ -1711,8 +1814,10 @@ assign_varying_locations(struct gl_context *ctx,
>           return false;
>        }
>  
> -      if (matched_candidate->toplevel_var->data.is_unmatched_generic_inout)
> +      if (matched_candidate->toplevel_var->data.is_unmatched_generic_inout) {
> +         matched_candidate->toplevel_var->data.is_xfb_only = 1;
>           matches.record(matched_candidate->toplevel_var, NULL);
> +      }
>     }
>  
>     const uint64_t reserved_slots =
> @@ -1786,13 +1891,14 @@ assign_varying_locations(struct gl_context *ctx,
>  
>     if (producer) {
>        lower_packed_varyings(mem_ctx, slots_used, ir_var_shader_out,
> -                            0, producer, disable_varying_packing);
> +                            0, producer, disable_varying_packing,
> +                            xfb_enabled);
>     }
>  
>     if (consumer) {
>        lower_packed_varyings(mem_ctx, slots_used, ir_var_shader_in,
>                              consumer_vertices, consumer,
> -                            disable_varying_packing);
> +                            disable_varying_packing, xfb_enabled);
>     }
>  
>     return true;
> diff --git a/src/compiler/glsl/lower_packed_varyings.cpp b/src/compiler/glsl/lower_packed_varyings.cpp
> index d91aa22..825cc9e 100644
> --- a/src/compiler/glsl/lower_packed_varyings.cpp
> +++ b/src/compiler/glsl/lower_packed_varyings.cpp
> @@ -169,7 +169,8 @@ public:
>                                   unsigned gs_input_vertices,
>                                   exec_list *out_instructions,
>                                   exec_list *out_variables,
> -                                 bool disable_varying_packing);
> +                                 bool disable_varying_packing,
> +                                 bool xfb_enabled);
>  
>     void run(struct gl_shader *shader);
>  
> @@ -234,6 +235,7 @@ private:
>     exec_list *out_variables;
>  
>     bool disable_varying_packing;
> +   bool xfb_enabled;
>  };
>  
>  } /* anonymous namespace */
> @@ -241,7 +243,8 @@ private:
>  lower_packed_varyings_visitor::lower_packed_varyings_visitor(
>        void *mem_ctx, unsigned locations_used, ir_variable_mode mode,
>        unsigned gs_input_vertices, exec_list *out_instructions,
> -      exec_list *out_variables, bool disable_varying_packing)
> +      exec_list *out_variables, bool disable_varying_packing,
> +      bool xfb_enabled)
>     : mem_ctx(mem_ctx),
>       locations_used(locations_used),
>       packed_varyings((ir_variable **)
> @@ -251,7 +254,8 @@ lower_packed_varyings_visitor::lower_packed_varyings_visitor(
>       gs_input_vertices(gs_input_vertices),
>       out_instructions(out_instructions),
>       out_variables(out_variables),
> -     disable_varying_packing(disable_varying_packing)
> +     disable_varying_packing(disable_varying_packing),
> +     xfb_enabled(xfb_enabled)
>  {
>  }
>  
> @@ -660,10 +664,18 @@ lower_packed_varyings_visitor::needs_lowering(ir_variable *var)
>     if (var->data.explicit_location)
>        return false;
>  
> -   if (disable_varying_packing)
> +   /* Override disable_varying_packing if the var is only used by transform
> +    * feedback. Also override it if transform feedback is enabled and the
> +    * variable is an array, struct or matrix as the elements of these types
> +    * will always has the same interpolation and therefore asre safe to pack.
> +    */
> +   const glsl_type *type = var->type;
> +   if (disable_varying_packing && !var->data.is_xfb_only &&
> +       !((type->is_array() || type->is_record() || type->is_matrix()) &&
> +         xfb_enabled))
>        return false;
>  
> -   const glsl_type *type = var->type->without_array();
> +   type = type->without_array();
>     if (type->vector_elements == 4 && !type->is_double())
>        return false;
>     return true;
> @@ -716,7 +728,8 @@ lower_packed_varyings_gs_splicer::visit_leave(ir_emit_vertex *ev)
>  void
>  lower_packed_varyings(void *mem_ctx, unsigned locations_used,
>                        ir_variable_mode mode, unsigned gs_input_vertices,
> -                      gl_shader *shader, bool disable_varying_packing)
> +                      gl_shader *shader, bool disable_varying_packing,
> +                      bool xfb_enabled)
>  {
>     exec_list *instructions = shader->ir;
>     ir_function *main_func = shader->symbols->get_function("main");
> @@ -728,7 +741,8 @@ lower_packed_varyings(void *mem_ctx, unsigned locations_used,
>                                           gs_input_vertices,
>                                           &new_instructions,
>                                           &new_variables,
> -                                         disable_varying_packing);
> +                                         disable_varying_packing,
> +                                         xfb_enabled);
>     visitor.run(shader);
>     if (mode == ir_var_shader_out) {
>        if (shader->Stage == MESA_SHADER_GEOMETRY) {
> 


More information about the mesa-dev mailing list