[Mesa-dev] [PATCH v2] glsls: Modify exec_list to avoid strict-aliasing violations

Davin McCall davmac at davmac.org
Mon Jun 29 10:47:26 PDT 2015


On 29/06/15 13:26, Francisco Jerez wrote:
> Davin McCall <davmac at davmac.org> writes:
>
>> On 29/06/15 10:40, Francisco Jerez wrote:
>>> Davin McCall <davmac at davmac.org> writes:
>>>
>>>> On 26/06/15 14:53, Francisco Jerez wrote:
>>>>
>>>>> [...]
>>>>>
>>>>> Your first approach seemed quite reasonable IMHO.  Were you able to
>>>>> measure any performance regression from it?
>>>>>
>>>>> Thanks.
>>>>>
>>>> When I run an apitrace replay of a Dota 2 trace [1] with
>>>> LIBGL_ALWAYS_SOFTWARE and without the patch I get (averaged over 5 runs):
>>>>
>>>>        Maximum Resident Set Size (kbytes): 4509696
>>>>        FPS: .9044752
>>>>        user time: 2467.306
>>>>
>>>> ("Maximum Resident Set Size" and user time are given by GNU "time". I'm
>>>> not sure what it's really measuring, because this is a 32-bit system and
>>>> I don't see how the maximum resident set could be > 4GB; "top" shows
>>>> virt+res capping out at about 2.3GB. However I assume MRSS is at least
>>>> giving some relative indication of memory use; the deviation wasn't too
>>>> high).
>>>>
>>>> With the patch (again averaged over 5 runs):
>>>>
>>>>        Maximum Resident Set Size: 4523622.4
>>>>        FPS: 0.9068524
>>>>        user time: 2457.506
>>>>
>>>> So, "MRSS" has gone up a bit, but nothing else has changed
>>>> significantly. I think that means memory use has slightly increased, but
>>>> performance hasn't really changed.
>>>>
>>>> I wanted to test with the Intel driver using INTEL_NO_HW, but I get a
>>>> segfault when the patch is applied. Having checked over the patch
>>>> several times, I think this might mean that it triggers a latent bug
>>>> elsewhere, but I am still investigating that. V2 of the patch does not
>>>> trigger this crash.
>>>>
>>> Most likely some assumption left to fix in the i965 back-end?
>> That's what I thought. However, I've just tried (after reverting the
>> patch) inserting a single field in the exec_list structure to emulate
>> the data layout that should occur when the patch is applied - but I
>> can't then reproduce the problem. So I guess there is something wrong
>> with the patch itself, but I'm blind to it :(
>>
>>>     Have you
>>> shared your changes to the i965 driver already?  They don't seem to be
>>> part of your v1.
>> No, I've not shared them previously, but I've included them now (below).
>>
>> Davin
>>
>>
>>
>>   From 2b6ebbb7787a78d55ba46de2f78eb2ba20b9fe58 Mon Sep 17 00:00:00 2001
>> From: Davin McCall <davmac at davmac.org>
>> Date: Sat, 27 Jun 2015 13:48:41 +0100
>> Subject: [PATCH] glsl: fix some strict aliasing issues in exec_list
>>
>> There is a problem in exec_list due to it directly including a trio
>> of 'struct exec_node *' members to implement two overlapping sentinel
>> nodes. The sentinel nodes do not exist as exec_node objects and so
>> should not be accessed as such, according to C99 6.5 paragraph 7.
>> When this strict aliasing rule is violated the compiler may re-order
>> reads and writes in unexpected ways. The problem only manifests if
>> compiling without -fno-strict-aliasing, since that option allows
>> breaking the strict aliasing rules.
>>
>> This patch addresses the issue by including explicit head and tail
>> sentinel nodes into the exec_list structure, which do not overlap.
>> This adds a single word of storage to the size of the exec_list structure.
>>
>> While I'm not proposing that -fno-strict-aliasing no longer be used for
>> Mesa builds, this patch represents a step in that direction. With this
>> patch applied, a working Mesa library can be built, although bugs may
>> be present (and could be triggered only when using particular compiler
>> versions and options on particular platforms).
>> ---
>>    src/glsl/ast_function.cpp                  |  20 +++--
>>    src/glsl/ast_to_hir.cpp                    |   9 +-
>>    src/glsl/glsl_parser_extras.cpp            |   6 +-
>>    src/glsl/ir.cpp                            |   8 +-
>>    src/glsl/ir_clone.cpp                      |   2 +-
>>    src/glsl/ir_constant_expression.cpp        |   3 +-
>>    src/glsl/ir_function.cpp                   |  14 ++--
>>    src/glsl/ir_reader.cpp                     |   5 +-
>>    src/glsl/ir_validate.cpp                   |   5 +-
>>    src/glsl/list.h                            | 130
>> +++++++++++++----------------
>>    src/glsl/lower_clip_distance.cpp           |   6 +-
>>    src/glsl/lower_jumps.cpp                   |   2 +-
>>    src/glsl/lower_packed_varyings.cpp         |   8 +-
>>    src/glsl/nir/nir.c                         |   6 +-
>>    src/glsl/nir/nir_opt_gcm.c                 |   2 +-
>>    src/glsl/opt_conditional_discard.cpp       |   8 +-
>>    src/glsl/opt_constant_variable.cpp         |   3 +-
>>    src/glsl/opt_dead_builtin_varyings.cpp     |   2 +-
>>    src/glsl/opt_flatten_nested_if_blocks.cpp  |   3 +-
>>    src/mesa/drivers/dri/i965/brw_cfg.h        |  12 +--
>>    src/mesa/drivers/dri/i965/brw_fs_builder.h |   3 +-
>>    21 files changed, 131 insertions(+), 126 deletions(-)
>>
>> diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
>> index 92e26bf..348e615 100644
>> --- a/src/glsl/ast_function.cpp
>> +++ b/src/glsl/ast_function.cpp
>> @@ -152,8 +152,8 @@ verify_parameter_modes(_mesa_glsl_parse_state *state,
>>                   exec_list &actual_ir_parameters,
>>                   exec_list &actual_ast_parameters)
>>    {
>> -   exec_node *actual_ir_node  = actual_ir_parameters.head;
>> -   exec_node *actual_ast_node = actual_ast_parameters.head;
>> +   exec_node *actual_ir_node  = actual_ir_parameters.head_sentinel.next;
>> +   exec_node *actual_ast_node = actual_ast_parameters.head_sentinel.next;
>>
>>       foreach_in_list(const ir_variable, formal, &sig->parameters) {
>>          /* The lists must be the same length. */
>> @@ -964,7 +964,7 @@ constant_record_constructor(const glsl_type
>> *constructor_type,
>>    bool
>>    single_scalar_parameter(exec_list *parameters)
>>    {
>> -   const ir_rvalue *const p = (ir_rvalue *) parameters->head;
>> +   const ir_rvalue *const p = (ir_rvalue *) parameters->head_sentinel.next;
>>       assert(((ir_rvalue *)p)->as_rvalue() != NULL);
>>
>>       return (p->type->is_scalar() && p->next->is_tail_sentinel());
>> @@ -1008,7 +1008,7 @@ emit_inline_vector_constructor(const glsl_type *type,
>>        */
>>       const unsigned lhs_components = type->components();
>>       if (single_scalar_parameter(parameters)) {
>> -      ir_rvalue *first_param = (ir_rvalue *)parameters->head;
>> +      ir_rvalue *first_param = (ir_rvalue *)parameters->head_sentinel.next;
>>          ir_rvalue *rhs = new(ctx) ir_swizzle(first_param, 0, 0, 0, 0,
>>                           lhs_components);
>>          ir_dereference_variable *lhs = new(ctx)
>> ir_dereference_variable(var);
>> @@ -1209,7 +1209,8 @@ emit_inline_matrix_constructor(const glsl_type *type,
>>        *    to the upper left portion of the constructed matrix, and the
>> remaining
>>        *    elements take values from the identity matrix.
>>        */
>> -   ir_rvalue *const first_param = (ir_rvalue *) parameters->head;
>> +   ir_rvalue *const first_param =
>> +      (ir_rvalue *) parameters->head_sentinel.next;
>>       if (single_scalar_parameter(parameters)) {
>>          /* Assign the scalar to the X component of a vec4, and fill the
>> remaining
>>           * components with zero.
>> @@ -1454,7 +1455,7 @@ emit_inline_record_constructor(const glsl_type *type,
>>
>>       instructions->push_tail(var);
>>
>> -   exec_node *node = parameters->head;
>> +   exec_node *node = parameters->head_sentinel.next;
>>       for (unsigned i = 0; i < type->length; i++) {
>>          assert(!node->is_tail_sentinel());
>>
>> @@ -1487,7 +1488,7 @@ process_record_constructor(exec_list *instructions,
>>       process_parameters(instructions, &actual_parameters,
>>                          parameters, state);
>>
>> -   exec_node *node = actual_parameters.head;
>> +   exec_node *node = actual_parameters.head_sentinel.next;
>>       for (unsigned i = 0; i < constructor_type->length; i++) {
>>          ir_rvalue *ir = (ir_rvalue *) node;
>>
>> @@ -1751,8 +1752,9 @@ ast_function_expression::hir(exec_list *instructions,
>>          if (all_parameters_are_constant) {
>>         return new(ctx) ir_constant(constructor_type, &actual_parameters);
>>          } else if (constructor_type->is_scalar()) {
>> -     return dereference_component((ir_rvalue *) actual_parameters.head,
>> -                      0);
>> +     return dereference_component(
>> +           (ir_rvalue *) actual_parameters.head_sentinel.next,
>> +           0);
>>          } else if (constructor_type->is_vector()) {
>>         return emit_inline_vector_constructor(constructor_type,
>>                               instructions,
>> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
>> index 8cb46be..7452482 100644
>> --- a/src/glsl/ast_to_hir.cpp
>> +++ b/src/glsl/ast_to_hir.cpp
>> @@ -1761,7 +1761,7 @@ ast_expression::do_hir(exec_list *instructions,
>>              * effect, but I don't think these cases exist in GLSL.
>> Either way,
>>              * it would be a giant hassle to replicate that behavior.
>>              */
>> -         if (previous_tail_pred == instructions->tail_pred) {
>> +         if (previous_tail_pred == instructions->tail_sentinel.prev) {
>>                _mesa_glsl_warning(&previous_operand_loc, state,
>>                                   "left-hand operand of comma expression
>> has "
>>                                   "no effect");
>> @@ -1772,7 +1772,7 @@ ast_expression::do_hir(exec_list *instructions,
>>              * return NULL when the list is empty.  We don't care about that
>>              * here, so using tail_pred directly is fine.
>>              */
>> -         previous_tail_pred = instructions->tail_pred;
>> +         previous_tail_pred = instructions->tail_sentinel.prev;
>>             previous_operand_loc = ast->get_location();
>>
>>             result = ast->hir(instructions, state);
>> @@ -1923,8 +1923,9 @@ process_array_type(YYLTYPE *loc, const glsl_type
>> *base,
>>             }
>>          }
>>
>> -      for (exec_node *node = array_specifier->array_dimensions.tail_pred;
>> -           !node->is_head_sentinel(); node = node->prev) {
>> +      for (exec_node *node =
>> + array_specifier->array_dimensions.tail_sentinel.prev;
>> +            !node->is_head_sentinel(); node = node->prev) {
>>             unsigned array_size = process_array_size(node, state);
>>             array_type = glsl_type::get_array_instance(array_type,
>> array_size);
>>          }
>> diff --git a/src/glsl/glsl_parser_extras.cpp
>> b/src/glsl/glsl_parser_extras.cpp
>> index 046d5d7..91079ec 100644
>> --- a/src/glsl/glsl_parser_extras.cpp
>> +++ b/src/glsl/glsl_parser_extras.cpp
>> @@ -783,7 +783,7 @@ _mesa_ast_set_aggregate_type(const glsl_type *type,
>>           * E.g., if <type> if struct S[2] we want to set each element's
>> type to
>>           * struct S.
>>           */
>> -      for (exec_node *expr_node = ai->expressions.head;
>> +      for (exec_node *expr_node = ai->expressions.head_sentinel.next;
>>               !expr_node->is_tail_sentinel();
>>               expr_node = expr_node->next) {
>>             ast_expression *expr = exec_node_data(ast_expression, expr_node,
>> @@ -795,7 +795,7 @@ _mesa_ast_set_aggregate_type(const glsl_type *type,
>>
>>       /* If the aggregate is a struct, recursively set its fields' types. */
>>       } else if (type->is_record()) {
>> -      exec_node *expr_node = ai->expressions.head;
>> +      exec_node *expr_node = ai->expressions.head_sentinel.next;
>>
>>          /* Iterate through the struct's fields. */
>>          for (unsigned i = 0; !expr_node->is_tail_sentinel() && i <
>> type->length;
>> @@ -809,7 +809,7 @@ _mesa_ast_set_aggregate_type(const glsl_type *type,
>>          }
>>       /* If the aggregate is a matrix, set its columns' types. */
>>       } else if (type->is_matrix()) {
>> -      for (exec_node *expr_node = ai->expressions.head;
>> +      for (exec_node *expr_node = ai->expressions.head_sentinel.next;
>>               !expr_node->is_tail_sentinel();
>>               expr_node = expr_node->next) {
>>             ast_expression *expr = exec_node_data(ast_expression, expr_node,
>> diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
>> index dbd064f..7b1b396 100644
>> --- a/src/glsl/ir.cpp
>> +++ b/src/glsl/ir.cpp
>> @@ -786,7 +786,7 @@ ir_constant::ir_constant(const struct glsl_type
>> *type, exec_list *value_list)
>>          this->value.u[i] = 0;
>>       }
>>
>> -   ir_constant *value = (ir_constant *) (value_list->head);
>> +   ir_constant *value = (ir_constant *) (value_list->head_sentinel.next);
>>
>>       /* Constructors with exactly one scalar argument are special for
>> vectors
>>        * and matrices.  For vectors, the scalar value is replicated to
>> fill all
>> @@ -1049,7 +1049,7 @@ ir_constant::get_record_field(const char *name)
>>       if (this->components.is_empty())
>>          return NULL;
>>
>> -   exec_node *node = this->components.head;
>> +   exec_node *node = this->components.head_sentinel.next;
>>       for (int i = 0; i < idx; i++) {
>>          node = node->next;
>>
>> @@ -1173,8 +1173,8 @@ ir_constant::has_value(const ir_constant *c) const
>>       }
>>
>>       if (this->type->base_type == GLSL_TYPE_STRUCT) {
>> -      const exec_node *a_node = this->components.head;
>> -      const exec_node *b_node = c->components.head;
>> +      const exec_node *a_node = this->components.head_sentinel.next;
>> +      const exec_node *b_node = c->components.head_sentinel.next;
>>
>>          while (!a_node->is_tail_sentinel()) {
>>         assert(!b_node->is_tail_sentinel());
>> diff --git a/src/glsl/ir_clone.cpp b/src/glsl/ir_clone.cpp
>> index 914e0e4..c1db237 100644
>> --- a/src/glsl/ir_clone.cpp
>> +++ b/src/glsl/ir_clone.cpp
>> @@ -335,7 +335,7 @@ ir_constant::clone(void *mem_ctx, struct hash_table
>> *ht) const
>>          ir_constant *c = new(mem_ctx) ir_constant;
>>
>>          c->type = this->type;
>> -      for (exec_node *node = this->components.head
>> +      for (exec_node *node = this->components.head_sentinel.next
>>              ; !node->is_tail_sentinel()
>>              ; node = node->next) {
>>         ir_constant *const orig = (ir_constant *) node;
>> diff --git a/src/glsl/ir_constant_expression.cpp
>> b/src/glsl/ir_constant_expression.cpp
>> index 171b8e9..dc0c98c 100644
>> --- a/src/glsl/ir_constant_expression.cpp
>> +++ b/src/glsl/ir_constant_expression.cpp
>> @@ -2114,7 +2114,8 @@
>> ir_function_signature::constant_expression_value(exec_list
>> *actual_parameters, s
>>        * have to use the variable objects from the object with the body,
>>        * but the parameter instanciation on the current object.
>>        */
>> -   const exec_node *parameter_info = origin ? origin->parameters.head :
>> parameters.head;
>> +   const exec_node *parameter_info = origin ?
>> +      origin->parameters.head_sentinel.next :
>> parameters.head_sentinel.next;
>>
>>       foreach_in_list(ir_rvalue, n, actual_parameters) {
>>          ir_constant *constant =
>> n->constant_expression_value(variable_context);
>> diff --git a/src/glsl/ir_function.cpp b/src/glsl/ir_function.cpp
>> index 1319443..8cbd5b0 100644
>> --- a/src/glsl/ir_function.cpp
>> +++ b/src/glsl/ir_function.cpp
>> @@ -43,8 +43,8 @@ static parameter_list_match_t
>>    parameter_lists_match(_mesa_glsl_parse_state *state,
>>                          const exec_list *list_a, const exec_list *list_b)
>>    {
>> -   const exec_node *node_a = list_a->head;
>> -   const exec_node *node_b = list_b->head;
>> +   const exec_node *node_a = list_a->head_sentinel.next;
>> +   const exec_node *node_b = list_b->head_sentinel.next;
>>
>>       /* This is set to true if there is an inexact match requiring an
>> implicit
>>        * conversion. */
>> @@ -221,9 +221,9 @@ is_best_inexact_overload(const exec_list
>> *actual_parameters,
>>          if (*other == sig)
>>             continue;
>>
>> -      const exec_node *node_a = sig->parameters.head;
>> -      const exec_node *node_b = (*other)->parameters.head;
>> -      const exec_node *node_p = actual_parameters->head;
>> +      const exec_node *node_a = sig->parameters.head_sentinel.next;
>> +      const exec_node *node_b = (*other)->parameters.head_sentinel.next;
>> +      const exec_node *node_p = actual_parameters->head_sentinel.next;
>>
>>          bool better_for_some_parameter = false;
>>
>> @@ -365,8 +365,8 @@
>> ir_function::matching_signature(_mesa_glsl_parse_state *state,
>>    static bool
>>    parameter_lists_match_exact(const exec_list *list_a, const exec_list
>> *list_b)
>>    {
>> -   const exec_node *node_a = list_a->head;
>> -   const exec_node *node_b = list_b->head;
>> +   const exec_node *node_a = list_a->head_sentinel.next;
>> +   const exec_node *node_b = list_b->head_sentinel.next;
>>
>>       for (/* empty */
>>        ; !node_a->is_tail_sentinel() && !node_b->is_tail_sentinel()
>> diff --git a/src/glsl/ir_reader.cpp b/src/glsl/ir_reader.cpp
>> index 4eae413..90c4b29 100644
>> --- a/src/glsl/ir_reader.cpp
>> +++ b/src/glsl/ir_reader.cpp
>> @@ -208,7 +208,8 @@ ir_reader::read_function(s_expression *expr, bool
>> skip_body)
>>       /* Skip over "function" tag and function name (which are guaranteed
>> to be
>>        * present by the above PARTIAL_MATCH call).
>>        */
>> -   exec_node *node = ((s_list *) expr)->subexpressions.head->next->next;
>> +   exec_node *node =
>> +      ((s_list *) expr)->subexpressions.head_sentinel.next->next->next;
>>       for (/* nothing */; !node->is_tail_sentinel(); node = node->next) {
>>          s_expression *s_sig = (s_expression *) node;
>>          read_function_sig(f, s_sig, skip_body);
>> @@ -251,7 +252,7 @@ ir_reader::read_function_sig(ir_function *f,
>> s_expression *expr, bool skip_body)
>>       state->symbols->push_scope();
>>
>>       /* Skip over the "parameters" tag. */
>> -   exec_node *node = paramlist->subexpressions.head->next;
>> +   exec_node *node = paramlist->subexpressions.head_sentinel.next->next;
>>       for (/* nothing */; !node->is_tail_sentinel(); node = node->next) {
>>          ir_variable *var = read_declaration((s_expression *) node);
>>          if (var == NULL)
>> diff --git a/src/glsl/ir_validate.cpp b/src/glsl/ir_validate.cpp
>> index cfe0df3..f5efcc5 100644
>> --- a/src/glsl/ir_validate.cpp
>> +++ b/src/glsl/ir_validate.cpp
>> @@ -842,8 +842,9 @@ ir_validate::visit_enter(ir_call *ir)
>>          abort();
>>       }
>>
>> -   const exec_node *formal_param_node = callee->parameters.head;
>> -   const exec_node *actual_param_node = ir->actual_parameters.head;
>> +   const exec_node *formal_param_node =
>> callee->parameters.head_sentinel.next;
>> +   const exec_node *actual_param_node =
>> +         ir->actual_parameters.head_sentinel.next;
>>       while (true) {
>>          if (formal_param_node->is_tail_sentinel()
>>              != actual_param_node->is_tail_sentinel()) {
>> diff --git a/src/glsl/list.h b/src/glsl/list.h
>> index 15fcd4a..ee9145e 100644
>> --- a/src/glsl/list.h
>> +++ b/src/glsl/list.h
>> @@ -32,24 +32,8 @@
>>     *
>>     * A list is empty if either the head sentinel's \c next pointer
>> points to the
>>     * tail sentinel or the tail sentinel's \c prev poiner points to the head
>> - * sentinel.
>> - *
>> - * Instead of tracking two separate \c node structures and a \c list
>> structure
>> - * that points to them, the sentinel nodes are in a single structure.
>> Noting
>> - * that each sentinel node always has one \c NULL pointer, the \c NULL
>> - * pointers occupy the same memory location.  In the \c list structure
>> - * contains a the following:
>> - *
>> - *   - A \c head pointer that represents the \c next pointer of the
>> - *     head sentinel node.
>> - *   - A \c tail pointer that represents the \c prev pointer of the head
>> - *     sentinel node and the \c next pointer of the tail sentinel
>> node.  This
>> - *     pointer is \b always \c NULL.
>> - *   - A \c tail_prev pointer that represents the \c prev pointer of the
>> - *     tail sentinel node.
>> - *
>> - * Therefore, if \c head->next is \c NULL or \c tail_prev->prev is \c NULL,
>> - * the list is empty.
>> + * sentinel. The head sentinel and tail sentinel nodes are allocated within
>> + * the list structure.
>>     *
>>     * Do note that this means that the list nodes will contain pointers
>> into the
>>     * list structure itself and as a result you may not \c realloc() an  \c
>> @@ -307,9 +291,8 @@ struct exec_node;
>>    #endif
>>
>>    struct exec_list {
>> -   struct exec_node *head;
>> -   struct exec_node *tail;
>> -   struct exec_node *tail_pred;
>> +   struct exec_node head_sentinel;
>> +   struct exec_node tail_sentinel;
>>
>>    #ifdef __cplusplus
>>       DECLARE_RALLOC_CXX_OPERATORS(exec_list)
>> @@ -366,9 +349,10 @@ struct exec_list {
>>    static inline void
>>    exec_list_make_empty(struct exec_list *list)
>>    {
>> -   list->head = (struct exec_node *) & list->tail;
>> -   list->tail = NULL;
>> -   list->tail_pred = (struct exec_node *) & list->head;
>> +   list->head_sentinel.next = & list->tail_sentinel;
>> +   list->head_sentinel.prev = NULL;
>> +   list->tail_sentinel.next = NULL;
>> +   list->tail_sentinel.prev = & list->head_sentinel;
>>    }
>>
>>    static inline bool
>> @@ -376,39 +360,39 @@ exec_list_is_empty(const struct exec_list *list)
>>    {
>>       /* There are three ways to test whether a list is empty or not.
>>        *
>> -    * - Check to see if the \c head points to the \c tail.
>> -    * - Check to see if the \c tail_pred points to the \c head.
>> -    * - Check to see if the \c head is the sentinel node by test
>> whether its
>> +    * - Check to see if the head sentinel's \c next is the tail sentinel.
>> +    * - Check to see if the tail sentinel's \c prev is the head sentinel.
>> +    * - Check to see if the head is the sentinel node by test whether its
>>        *   \c next pointer is \c NULL.
>>        *
>>        * The first two methods tend to generate better code on modern systems
>>        * because they save a pointer dereference.
>>        */
>> -   return list->head == (struct exec_node *) &list->tail;
>> +   return list->head_sentinel.next == & list->tail_sentinel;
>>    }
>>
>>    static inline const struct exec_node *
>>    exec_list_get_head_const(const struct exec_list *list)
>>    {
>> -   return !exec_list_is_empty(list) ? list->head : NULL;
>> +   return !exec_list_is_empty(list) ? list->head_sentinel.next : NULL;
>>    }
>>
>>    static inline struct exec_node *
>>    exec_list_get_head(struct exec_list *list)
>>    {
>> -   return !exec_list_is_empty(list) ? list->head : NULL;
>> +   return !exec_list_is_empty(list) ? list->head_sentinel.next : NULL;
>>    }
>>
>>    static inline const struct exec_node *
>>    exec_list_get_tail_const(const struct exec_list *list)
>>    {
>> -   return !exec_list_is_empty(list) ? list->tail_pred : NULL;
>> +   return !exec_list_is_empty(list) ? list->tail_sentinel.prev : NULL;
>>    }
>>
>>    static inline struct exec_node *
>>    exec_list_get_tail(struct exec_list *list)
>>    {
>> -   return !exec_list_is_empty(list) ? list->tail_pred : NULL;
>> +   return !exec_list_is_empty(list) ? list->tail_sentinel.prev : NULL;
>>    }
>>
>>    static inline unsigned
>> @@ -417,7 +401,8 @@ exec_list_length(const struct exec_list *list)
>>       unsigned size = 0;
>>       struct exec_node *node;
>>
>> -   for (node = list->head; node->next != NULL; node = node->next) {
>> +   for (node = list->head_sentinel.next; node->next != NULL;
>> +         node = node->next) {
>>          size++;
>>       }
>>
>> @@ -427,21 +412,21 @@ exec_list_length(const struct exec_list *list)
>>    static inline void
>>    exec_list_push_head(struct exec_list *list, struct exec_node *n)
>>    {
>> -   n->next = list->head;
>> -   n->prev = (struct exec_node *) &list->head;
>> +   n->next = list->head_sentinel.next;
>> +   n->prev = &list->head_sentinel;
>>
>>       n->next->prev = n;
>> -   list->head = n;
>> +   list->head_sentinel.next = n;
>>    }
>>
>>    static inline void
>>    exec_list_push_tail(struct exec_list *list, struct exec_node *n)
>>    {
>> -   n->next = (struct exec_node *) &list->tail;
>> -   n->prev = list->tail_pred;
>> +   n->next = &list->tail_sentinel;
>> +   n->prev = list->tail_sentinel.prev;
>>
>>       n->prev->next = n;
>> -   list->tail_pred = n;
>> +   list->tail_sentinel.prev = n;
>>    }
>>
>>    static inline void
>> @@ -449,10 +434,10 @@ exec_list_push_degenerate_list_at_head(struct
>> exec_list *list, struct exec_node
>>    {
>>       assert(n->prev->next == n);
>>
>> -   n->prev->next = list->head;
>> -   list->head->prev = n->prev;
>> -   n->prev = (struct exec_node *) &list->head;
>> -   list->head = n;
>> +   n->prev->next = list->head_sentinel.next;
>> +   list->head_sentinel.next->prev = n->prev;
>> +   n->prev = &list->head_sentinel;
>> +   list->head_sentinel.next = n;
>>    }
>>
>>    static inline struct exec_node *
>> @@ -471,12 +456,13 @@ exec_list_move_nodes_to(struct exec_list *list,
>> struct exec_list *target)
>>       if (exec_list_is_empty(list)) {
>>          exec_list_make_empty(target);
>>       } else {
>> -      target->head = list->head;
>> -      target->tail = NULL;
>> -      target->tail_pred = list->tail_pred;
>> +      target->head_sentinel.next = list->head_sentinel.next;
>> +      target->head_sentinel.prev = NULL;
>> +      target->tail_sentinel.next = NULL;
>> +      target->tail_sentinel.prev = list->tail_sentinel.prev;
>>
>> -      target->head->prev = (struct exec_node *) &target->head;
>> -      target->tail_pred->next = (struct exec_node *) &target->tail;
>> +      target->head_sentinel.next->prev = &target->head_sentinel;
>> +      target->tail_sentinel.prev->next = &target->tail_sentinel;
>>
>>          exec_list_make_empty(list);
>>       }
>> @@ -490,13 +476,13 @@ exec_list_append(struct exec_list *list, struct
>> exec_list *source)
>>
>>       /* Link the first node of the source with the last node of the
>> target list.
>>        */
>> -   list->tail_pred->next = source->head;
>> -   source->head->prev = list->tail_pred;
>> +   list->tail_sentinel.prev->next = source->head_sentinel.next;
>> +   source->head_sentinel.next->prev = list->tail_sentinel.prev;
>>
>>       /* Make the tail of the source list be the tail of the target list.
>>        */
>> -   list->tail_pred = source->tail_pred;
>> -   list->tail_pred->next = (struct exec_node *) &list->tail;
>> +   list->tail_sentinel.prev = source->tail_sentinel.prev;
>> +   list->tail_sentinel.prev->next = &list->tail_sentinel;
>>
>>       /* Make the source list empty for good measure.
>>        */
>> @@ -516,11 +502,11 @@ exec_node_insert_list_before(struct exec_node *n,
>> struct exec_list *before)
>>       if (exec_list_is_empty(before))
>>          return;
>>
>> -   before->tail_pred->next = n;
>> -   before->head->prev = n->prev;
>> +   before->tail_sentinel.prev->next = n;
>> +   before->head_sentinel.next->prev = n->prev;
>>
>> -   n->prev->next = before->head;
>> -   n->prev = before->tail_pred;
>> +   n->prev->next = before->head_sentinel.next;
>> +   n->prev = before->tail_sentinel.prev;
>>
>>       exec_list_make_empty(before);
>>    }
>> @@ -530,15 +516,17 @@ exec_list_validate(const struct exec_list *list)
>>    {
>>       const struct exec_node *node;
>>
>> -   assert(list->head->prev == (const struct exec_node *) &list->head);
>> -   assert(list->tail == NULL);
>> -   assert(list->tail_pred->next == (const struct exec_node *) &list->tail);
>> +   assert(list->head_sentinel.next->prev == &list->head_sentinel);
>> +   assert(list->head_sentinel.prev == NULL);
>> +   assert(list->tail_sentinel.next == NULL);
>> +   assert(list->tail_sentinel.prev->next == &list->tail_sentinel);
>>
>>       /* We could try to use one of the interators below for this but
>> they all
>>        * either require C++ or assume the exec_node is embedded in a
>> structure
>>        * which is not the case for this function.
>>        */
>> -   for (node = list->head; node->next != NULL; node = node->next) {
>> +   for (node = list->head_sentinel.next; node->next != NULL;
>> +         node = node->next) {
>>          assert(node->next->prev == node);
>>          assert(node->prev->next == node);
>>       }
>> @@ -622,12 +610,12 @@ inline void exec_node::insert_before(exec_list
>> *before)
>>    #endif
>>
>>    #define foreach_in_list(__type, __inst, __list)      \
>> -   for (__type *(__inst) = (__type *)(__list)->head; \
>> +   for (__type *(__inst) = (__type *)(__list)->head_sentinel.next; \
>>            !(__inst)->is_tail_sentinel();               \
>>            (__inst) = (__type *)(__inst)->next)
>>
>>    #define foreach_in_list_reverse(__type, __inst, __list)   \
>> -   for (__type *(__inst) = (__type *)(__list)->tail_pred; \
>> +   for (__type *(__inst) = (__type *)(__list)->tail_sentinel.prev; \
>>            !(__inst)->is_head_sentinel();                    \
>>            (__inst) = (__type *)(__inst)->prev)
>>
>> @@ -635,20 +623,20 @@ inline void exec_node::insert_before(exec_list
>> *before)
>>     * This version is safe even if the current node is removed.
>>     */
>>    #define foreach_in_list_safe(__type, __node, __list) \
>> -   for (__type *__node = (__type *)(__list)->head,   \
>> +   for (__type *__node = (__type *)(__list)->head_sentinel.next,   \
>>                   *__next = (__type *)__node->next;     \
>>            __next != NULL;                              \
>>            __node = __next, __next = (__type *)__next->next)
>>
>>    #define foreach_in_list_reverse_safe(__type, __node, __list) \
>> -   for (__type *__node = (__type *)(__list)->tail_pred,      \
>> +   for (__type *__node = (__type *)(__list)->tail_sentinel.prev,      \
>>                   *__prev = (__type *)__node->prev;             \
>>            __prev != NULL;                                      \
>>            __node = __prev, __prev = (__type *)__prev->prev)
>>
>>    #define foreach_in_list_use_after(__type, __inst, __list) \
>>       __type *(__inst);                                      \
>> -   for ((__inst) = (__type *)(__list)->head;              \
>> +   for ((__inst) = (__type *)(__list)->head_sentinel.next; \
>>            !(__inst)->is_tail_sentinel();                    \
>>            (__inst) = (__type *)(__inst)->next)
>>    /**
>> @@ -657,8 +645,8 @@ inline void exec_node::insert_before(exec_list *before)
>>     * This is safe against either current node being removed or replaced.
>>     */
>>    #define foreach_two_lists(__node1, __list1, __node2, __list2) \
>> -   for (struct exec_node * __node1 = (__list1)->head,         \
>> -                         * __node2 = (__list2)->head,         \
>> +   for (struct exec_node * __node1 = (__list1)->head_sentinel.next, \
>> +                         * __node2 = (__list2)->head_sentinel.next, \
>>                             * __next1 = __node1->next,           \
>>                             * __next2 = __node2->next            \
>>        ; __next1 != NULL && __next2 != NULL                  \
>> @@ -669,19 +657,19 @@ inline void exec_node::insert_before(exec_list
>> *before)
>>
>>    #define foreach_list_typed(__type, __node, __field, __list) \
>>       for (__type * __node =                        \
>> -       exec_node_data(__type, (__list)->head, __field);        \
>> +       exec_node_data(__type, (__list)->head_sentinel.next, __field); \
>>        (__node)->__field.next != NULL;                 \
>>        (__node) = exec_node_data(__type, (__node)->__field.next, __field))
>>
>>    #define foreach_list_typed_reverse(__type, __node, __field,
>> __list)        \
>>       for (__type * __node =                                                \
>> -           exec_node_data(__type, (__list)->tail_pred, __field);        \
>> +           exec_node_data(__type, (__list)->tail_sentinel.prev,
>> __field);  \
>>            (__node)->__field.prev != NULL;                                 \
>>            (__node) = exec_node_data(__type, (__node)->__field.prev,
>> __field))
>>
>>    #define foreach_list_typed_safe(__type, __node, __field,
>> __list)           \
>>       for (__type * __node
>> =                                                  \
>> -           exec_node_data(__type, (__list)->head,
>> __field),                \
>> +           exec_node_data(__type, (__list)->head_sentinel.next,
>> __field),  \
>>                   * __next
>> =                                                  \
>>               exec_node_data(__type, (__node)->__field.next,
>> __field);        \
>>            (__node)->__field.next !=
>> NULL;                                    \
>> @@ -690,7 +678,7 @@ inline void exec_node::insert_before(exec_list *before)
>>
>>    #define foreach_list_typed_safe_reverse(__type, __node, __field,
>> __list)   \
>>       for (__type * __node
>> =                                                  \
>> -           exec_node_data(__type, (__list)->tail_pred,
>> __field),           \
>> +           exec_node_data(__type, (__list)->tail_sentinel.prev,
>> __field),  \
>>                   * __prev
>> =                                                  \
>>               exec_node_data(__type, (__node)->__field.prev,
>> __field);        \
>>            (__node)->__field.prev !=
>> NULL;                                    \
>> diff --git a/src/glsl/lower_clip_distance.cpp
>> b/src/glsl/lower_clip_distance.cpp
>> index 01f028b..f8e130b 100644
>> --- a/src/glsl/lower_clip_distance.cpp
>> +++ b/src/glsl/lower_clip_distance.cpp
>> @@ -476,8 +476,10 @@ lower_clip_distance_visitor::visit_leave(ir_call *ir)
>>    {
>>       void *ctx = ralloc_parent(ir);
>>
>> -   const exec_node *formal_param_node = ir->callee->parameters.head;
>> -   const exec_node *actual_param_node = ir->actual_parameters.head;
>> +   const exec_node *formal_param_node =
>> +         ir->callee->parameters.head_sentinel.next;
>> +   const exec_node *actual_param_node =
>> +         ir->actual_parameters.head_sentinel.next;
>>       while (!actual_param_node->is_tail_sentinel()) {
>>          ir_variable *formal_param = (ir_variable *) formal_param_node;
>>          ir_rvalue *actual_param = (ir_rvalue *) actual_param_node;
>> diff --git a/src/glsl/lower_jumps.cpp b/src/glsl/lower_jumps.cpp
>> index ec7a0c5..cdc362e 100644
>> --- a/src/glsl/lower_jumps.cpp
>> +++ b/src/glsl/lower_jumps.cpp
>> @@ -777,7 +777,7 @@ lower_continue:
>>                    * analysis.
>>                    */
>>                   exec_list list;
>> -               list.head = next;
>> +               list.head_sentinel.next = next;
>>                   block_records[move_into] = visit_block(&list);
>>
>>                   /*
>> diff --git a/src/glsl/lower_packed_varyings.cpp
>> b/src/glsl/lower_packed_varyings.cpp
>> index d8bebb5..32118e5 100644
>> --- a/src/glsl/lower_packed_varyings.cpp
>> +++ b/src/glsl/lower_packed_varyings.cpp
>> @@ -719,7 +719,8 @@ lower_packed_varyings(void *mem_ctx, unsigned
>> locations_used,
>>             lower_packed_varyings_gs_splicer splicer(mem_ctx,
>> &new_instructions);
>>
>>             /* Add all the variables in first. */
>> - main_func_sig->body.head->insert_before(&new_variables);
>> +         main_func_sig->body.head_sentinel.next->
>> +               insert_before(&new_variables);
>>
>>             /* Now update all the EmitVertex instances */
>>             splicer.run(instructions);
>> @@ -732,7 +733,8 @@ lower_packed_varyings(void *mem_ctx, unsigned
>> locations_used,
>>          }
>>       } else {
>>          /* Shader inputs need to be lowered at the beginning of main() */
>> - main_func_sig->body.head->insert_before(&new_instructions);
>> - main_func_sig->body.head->insert_before(&new_variables);
>> +      main_func_sig->body.head_sentinel.next->
>> +            insert_before(&new_instructions);
>> + main_func_sig->body.head_sentinel.next->insert_before(&new_variables);
>>       }
>>    }
>> diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
>> index f03e80a..5338b68 100644
>> --- a/src/glsl/nir/nir.c
>> +++ b/src/glsl/nir/nir.c
>> @@ -1169,14 +1169,16 @@ nir_cf_node_insert_before(nir_cf_node *node,
>> nir_cf_node *before)
>>    void
>>    nir_cf_node_insert_begin(struct exec_list *list, nir_cf_node *node)
>>    {
>> -   nir_cf_node *begin = exec_node_data(nir_cf_node, list->head, node);
>> +   nir_cf_node *begin = exec_node_data(nir_cf_node,
>> +         list->head_sentinel.next, node);
>>       nir_cf_node_insert_before(begin, node);
>>    }
>>
>>    void
>>    nir_cf_node_insert_end(struct exec_list *list, nir_cf_node *node)
>>    {
>> -   nir_cf_node *end = exec_node_data(nir_cf_node, list->tail_pred, node);
>> +   nir_cf_node *end = exec_node_data(nir_cf_node,
>> +         list->tail_sentinel.prev, node);
>>       nir_cf_node_insert_after(end, node);
>>    }
>>
>> diff --git a/src/glsl/nir/nir_opt_gcm.c b/src/glsl/nir/nir_opt_gcm.c
>> index 44068bf..f091118 100644
>> --- a/src/glsl/nir/nir_opt_gcm.c
>> +++ b/src/glsl/nir/nir_opt_gcm.c
>> @@ -477,7 +477,7 @@ opt_gcm_impl(nir_function_impl *impl)
>>
>>       while (!exec_list_is_empty(&state.instrs)) {
>>          nir_instr *instr = exec_node_data(nir_instr,
>> -                                        state.instrs.tail_pred, node);
>> +                                state.instrs.tail_sentinel.prev, node);
>>          gcm_place_instr(instr, &state);
>>       }
>>
>> diff --git a/src/glsl/opt_conditional_discard.cpp
>> b/src/glsl/opt_conditional_discard.cpp
>> index 8a3ad24..033033b 100644
>> --- a/src/glsl/opt_conditional_discard.cpp
>> +++ b/src/glsl/opt_conditional_discard.cpp
>> @@ -65,13 +65,15 @@ opt_conditional_discard_visitor::visit_leave(ir_if *ir)
>>    {
>>       /* Look for "if (...) discard" with no else clause or extra
>> statements. */
>>       if (ir->then_instructions.is_empty() ||
>> - !ir->then_instructions.head->next->is_tail_sentinel() ||
>> -       !((ir_instruction *) ir->then_instructions.head)->as_discard() ||
>> + !ir->then_instructions.head_sentinel.next->next->is_tail_sentinel() ||
>> +       !((ir_instruction *) ir->then_instructions.head_sentinel.next)->
>> + as_discard() ||
>>           !ir->else_instructions.is_empty())
>>          return visit_continue;
>>
>>       /* Move the condition and replace the ir_if with the ir_discard. */
>> -   ir_discard *discard = (ir_discard *) ir->then_instructions.head;
>> +   ir_discard *discard =
>> +         (ir_discard *) ir->then_instructions.head_sentinel.next;
>>       discard->condition = ir->condition;
>>       ir->replace_with(discard);
>>
>> diff --git a/src/glsl/opt_constant_variable.cpp
>> b/src/glsl/opt_constant_variable.cpp
>> index 7222eb9..79f27e8 100644
>> --- a/src/glsl/opt_constant_variable.cpp
>> +++ b/src/glsl/opt_constant_variable.cpp
>> @@ -175,7 +175,8 @@ do_constant_variable(exec_list *instructions)
>>       while (!v.list.is_empty()) {
>>
>>          struct assignment_entry *entry;
>> -      entry = exec_node_data(struct assignment_entry, v.list.head, link);
>> +      entry = exec_node_data(struct assignment_entry,
>> +                             v.list.head_sentinel.next, link);
>>
>>          if (entry->assignment_count == 1 && entry->constval &&
>> entry->our_scope) {
>>         entry->var->constant_value = entry->constval;
>> diff --git a/src/glsl/opt_dead_builtin_varyings.cpp
>> b/src/glsl/opt_dead_builtin_varyings.cpp
>> index 31719d2..b4a2a90 100644
>> --- a/src/glsl/opt_dead_builtin_varyings.cpp
>> +++ b/src/glsl/opt_dead_builtin_varyings.cpp
>> @@ -373,7 +373,7 @@ public:
>>                   new_var[i]->data.explicit_index = 0;
>>                }
>>
>> -            ir->head->insert_before(new_var[i]);
>> + ir->head_sentinel.next->insert_before(new_var[i]);
>>             }
>>          }
>>       }
>> diff --git a/src/glsl/opt_flatten_nested_if_blocks.cpp
>> b/src/glsl/opt_flatten_nested_if_blocks.cpp
>> index c702102..de87dfc 100644
>> --- a/src/glsl/opt_flatten_nested_if_blocks.cpp
>> +++ b/src/glsl/opt_flatten_nested_if_blocks.cpp
>> @@ -90,7 +90,8 @@ nested_if_flattener::visit_leave(ir_if *ir)
>>       if (ir->then_instructions.is_empty() ||
>> !ir->else_instructions.is_empty())
>>          return visit_continue;
>>
>> -   ir_if *inner = ((ir_instruction *) ir->then_instructions.head)->as_if();
>> +   ir_if *inner =
>> +      ((ir_instruction *)
>> ir->then_instructions.head_sentinel.next)->as_if();
>>       if (!inner || !inner->next->is_tail_sentinel() ||
>>           !inner->else_instructions.is_empty())
>>          return visit_continue;
>> diff --git a/src/mesa/drivers/dri/i965/brw_cfg.h
>> b/src/mesa/drivers/dri/i965/brw_cfg.h
>> index a094917..f44524b 100644
>> --- a/src/mesa/drivers/dri/i965/brw_cfg.h
>> +++ b/src/mesa/drivers/dri/i965/brw_cfg.h
>> @@ -313,12 +313,12 @@ struct cfg_t {
>>    #define foreach_inst_in_block(__type, __inst, __block)         \
>>       foreach_in_list(__type, __inst, &(__block)->instructions)
>>
>> -#define foreach_inst_in_block_safe(__type, __inst, __block)    \
>> -   for (__type *__inst = (__type *)__block->instructions.head, \
>> -               *__next = (__type *)__inst->next,               \
>> -               *__end = (__type *)__block->instructions.tail;  \
>> -        __next != __end;                                       \
>> -        __inst = __next,                                       \
>> +#define foreach_inst_in_block_safe(__type, __inst,
>> __block)                  \
>> +   for (__type *__inst = (__type
>> *)__block->instructions.head_sentinel.next, \
>> +               *__next = (__type
>> *)__inst->next,                             \
>> +               *__end = (__type
>> *)__block->instructions.tail_sentinel.prev;  \

This one's wrong too - should be 
'__block->instructions.tail_sentinel.next' (or even just 'NULL'). With 
that also fixed, it seems to be ok. I'll try to get some performance 
measurements done soon.

Davin



More information about the mesa-dev mailing list