[Mesa-dev] [PATCH v3] glsl: fix some strict aliasing issues in exec_list
Eero Tamminen
eero.t.tamminen at intel.com
Wed Jul 1 08:46:44 PDT 2015
Hi,
On 06/30/2015 01:15 AM, Davin McCall wrote:
> This is the third iteration of a patch to resolve a strict aliasing
> problem in the exec_list structure.
>
> At the suggestion of Francisco Jerez this is a return to the original
> (v1) style of the patch, which replaces the three 'exec_node *' members
> in the list with two 'struct exec_node' members.
>
> Some performance/memory measurements (made by replaying a Dota 2 trace,
> using i965 driver with INTEL_NO_HW=1). All values are averages over 5
> runs. "MRSS" is maximum resident set size as reported by GNU "time" (I'm
> not sure I trust its value, but I believe it's indicative of relative
> memory use).
While RSS gives rough estimate of relative change, resident set size is
affected by other running processes (they can cause pages to be dropped
due to memory pressure or cause pages for parts of libs you're not using
to get paged in).
When you want to track how much memory allocations take from the system,
it's better (but more work) to sum private dirty values from
/proc/PID/smaps, as heap & stack are not shared with other processes.
E.g. smem tool can do that for you.
> So, there is ever-so-slight performance decrease when the patch is
> applied but the lost performance is mostly recovered by removing
> -fno-strict-aliasing from the compiler flags. (It is difficult to
> understand why V2 in particular has any performance degradation at all;
> I suspect the compiler isn't doing as good a job as it could. I'm using
> GCC 4.8.4 with "-O3 -fomit-frame-pointer -march=corei7").
Btw. -O3 doesn't necessarily produce faster code than -O2. -O2 already
enables strict aliasing and -O3 can do a lot more inlining, especially
with C++ code using a lot of templates, and this can affect how well
things fit to cache.
- Eero
> V2 and V3 of the patch are comparable in performance. V3 (this one) uses
> more memory due to the increase in size of the exec_list structure, but
> is arguably cleaner code.
>
>
>
> Original description:
>
> There is a problem in exec_list due to it directly including a trio
> of 'struct exec_node *' members to implement two overlapping sentinel
> nodes. The sentinel nodes do not exist as exec_node objects and so
> should not be accessed as such, according to C99 6.5 paragraph 7.
> When this strict aliasing rule is violated the compiler may re-order
> reads and writes in unexpected ways. The problem only manifests if
> compiling without -fno-strict-aliasing, since that option allows
> breaking the strict aliasing rules.
>
> This patch addresses the issue by including explicit head and tail
> sentinel nodes into the exec_list structure, which do not overlap.
> This adds a single word of storage to the size of the exec_list structure.
>
> While I'm not proposing that -fno-strict-aliasing no longer be used for
> Mesa builds, this patch represents a step in that direction. With this
> patch applied, a working Mesa library can be built, although bugs may
> be present (and could be triggered only when using particular compiler
> versions and options on particular platforms).
> ---
> src/glsl/ast_function.cpp | 20 +++--
> src/glsl/ast_to_hir.cpp | 9 +-
> src/glsl/glsl_parser_extras.cpp | 6 +-
> src/glsl/ir.cpp | 8 +-
> src/glsl/ir_clone.cpp | 2 +-
> src/glsl/ir_constant_expression.cpp | 3 +-
> src/glsl/ir_function.cpp | 14 ++--
> src/glsl/ir_reader.cpp | 5 +-
> src/glsl/ir_validate.cpp | 5 +-
> src/glsl/list.h | 130
> +++++++++++++----------------
> src/glsl/lower_clip_distance.cpp | 6 +-
> src/glsl/lower_jumps.cpp | 2 +-
> src/glsl/lower_packed_varyings.cpp | 8 +-
> src/glsl/nir/nir.c | 6 +-
> src/glsl/nir/nir_opt_gcm.c | 2 +-
> src/glsl/opt_conditional_discard.cpp | 8 +-
> src/glsl/opt_constant_variable.cpp | 3 +-
> src/glsl/opt_dead_builtin_varyings.cpp | 2 +-
> src/glsl/opt_flatten_nested_if_blocks.cpp | 3 +-
> src/mesa/drivers/dri/i965/brw_cfg.h | 12 +--
> src/mesa/drivers/dri/i965/brw_fs_builder.h | 2 +-
> 21 files changed, 130 insertions(+), 126 deletions(-)
>
> diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
> index 92e26bf..348e615 100644
> --- a/src/glsl/ast_function.cpp
> +++ b/src/glsl/ast_function.cpp
> @@ -152,8 +152,8 @@ verify_parameter_modes(_mesa_glsl_parse_state *state,
> exec_list &actual_ir_parameters,
> exec_list &actual_ast_parameters)
> {
> - exec_node *actual_ir_node = actual_ir_parameters.head;
> - exec_node *actual_ast_node = actual_ast_parameters.head;
> + exec_node *actual_ir_node = actual_ir_parameters.head_sentinel.next;
> + exec_node *actual_ast_node = actual_ast_parameters.head_sentinel.next;
>
> foreach_in_list(const ir_variable, formal, &sig->parameters) {
> /* The lists must be the same length. */
> @@ -964,7 +964,7 @@ constant_record_constructor(const glsl_type
> *constructor_type,
> bool
> single_scalar_parameter(exec_list *parameters)
> {
> - const ir_rvalue *const p = (ir_rvalue *) parameters->head;
> + const ir_rvalue *const p = (ir_rvalue *)
> parameters->head_sentinel.next;
> assert(((ir_rvalue *)p)->as_rvalue() != NULL);
>
> return (p->type->is_scalar() && p->next->is_tail_sentinel());
> @@ -1008,7 +1008,7 @@ emit_inline_vector_constructor(const glsl_type *type,
> */
> const unsigned lhs_components = type->components();
> if (single_scalar_parameter(parameters)) {
> - ir_rvalue *first_param = (ir_rvalue *)parameters->head;
> + ir_rvalue *first_param = (ir_rvalue
> *)parameters->head_sentinel.next;
> ir_rvalue *rhs = new(ctx) ir_swizzle(first_param, 0, 0, 0, 0,
> lhs_components);
> ir_dereference_variable *lhs = new(ctx)
> ir_dereference_variable(var);
> @@ -1209,7 +1209,8 @@ emit_inline_matrix_constructor(const glsl_type *type,
> * to the upper left portion of the constructed matrix, and the
> remaining
> * elements take values from the identity matrix.
> */
> - ir_rvalue *const first_param = (ir_rvalue *) parameters->head;
> + ir_rvalue *const first_param =
> + (ir_rvalue *) parameters->head_sentinel.next;
> if (single_scalar_parameter(parameters)) {
> /* Assign the scalar to the X component of a vec4, and fill the
> remaining
> * components with zero.
> @@ -1454,7 +1455,7 @@ emit_inline_record_constructor(const glsl_type *type,
>
> instructions->push_tail(var);
>
> - exec_node *node = parameters->head;
> + exec_node *node = parameters->head_sentinel.next;
> for (unsigned i = 0; i < type->length; i++) {
> assert(!node->is_tail_sentinel());
>
> @@ -1487,7 +1488,7 @@ process_record_constructor(exec_list *instructions,
> process_parameters(instructions, &actual_parameters,
> parameters, state);
>
> - exec_node *node = actual_parameters.head;
> + exec_node *node = actual_parameters.head_sentinel.next;
> for (unsigned i = 0; i < constructor_type->length; i++) {
> ir_rvalue *ir = (ir_rvalue *) node;
>
> @@ -1751,8 +1752,9 @@ ast_function_expression::hir(exec_list *instructions,
> if (all_parameters_are_constant) {
> return new(ctx) ir_constant(constructor_type, &actual_parameters);
> } else if (constructor_type->is_scalar()) {
> - return dereference_component((ir_rvalue *) actual_parameters.head,
> - 0);
> + return dereference_component(
> + (ir_rvalue *) actual_parameters.head_sentinel.next,
> + 0);
> } else if (constructor_type->is_vector()) {
> return emit_inline_vector_constructor(constructor_type,
> instructions,
> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
> index 8cb46be..7452482 100644
> --- a/src/glsl/ast_to_hir.cpp
> +++ b/src/glsl/ast_to_hir.cpp
> @@ -1761,7 +1761,7 @@ ast_expression::do_hir(exec_list *instructions,
> * effect, but I don't think these cases exist in GLSL.
> Either way,
> * it would be a giant hassle to replicate that behavior.
> */
> - if (previous_tail_pred == instructions->tail_pred) {
> + if (previous_tail_pred == instructions->tail_sentinel.prev) {
> _mesa_glsl_warning(&previous_operand_loc, state,
> "left-hand operand of comma expression
> has "
> "no effect");
> @@ -1772,7 +1772,7 @@ ast_expression::do_hir(exec_list *instructions,
> * return NULL when the list is empty. We don't care about that
> * here, so using tail_pred directly is fine.
> */
> - previous_tail_pred = instructions->tail_pred;
> + previous_tail_pred = instructions->tail_sentinel.prev;
> previous_operand_loc = ast->get_location();
>
> result = ast->hir(instructions, state);
> @@ -1923,8 +1923,9 @@ process_array_type(YYLTYPE *loc, const glsl_type
> *base,
> }
> }
>
> - for (exec_node *node = array_specifier->array_dimensions.tail_pred;
> - !node->is_head_sentinel(); node = node->prev) {
> + for (exec_node *node =
> + array_specifier->array_dimensions.tail_sentinel.prev;
> + !node->is_head_sentinel(); node = node->prev) {
> unsigned array_size = process_array_size(node, state);
> array_type = glsl_type::get_array_instance(array_type,
> array_size);
> }
> diff --git a/src/glsl/glsl_parser_extras.cpp
> b/src/glsl/glsl_parser_extras.cpp
> index 046d5d7..91079ec 100644
> --- a/src/glsl/glsl_parser_extras.cpp
> +++ b/src/glsl/glsl_parser_extras.cpp
> @@ -783,7 +783,7 @@ _mesa_ast_set_aggregate_type(const glsl_type *type,
> * E.g., if <type> if struct S[2] we want to set each element's
> type to
> * struct S.
> */
> - for (exec_node *expr_node = ai->expressions.head;
> + for (exec_node *expr_node = ai->expressions.head_sentinel.next;
> !expr_node->is_tail_sentinel();
> expr_node = expr_node->next) {
> ast_expression *expr = exec_node_data(ast_expression, expr_node,
> @@ -795,7 +795,7 @@ _mesa_ast_set_aggregate_type(const glsl_type *type,
>
> /* If the aggregate is a struct, recursively set its fields' types. */
> } else if (type->is_record()) {
> - exec_node *expr_node = ai->expressions.head;
> + exec_node *expr_node = ai->expressions.head_sentinel.next;
>
> /* Iterate through the struct's fields. */
> for (unsigned i = 0; !expr_node->is_tail_sentinel() && i <
> type->length;
> @@ -809,7 +809,7 @@ _mesa_ast_set_aggregate_type(const glsl_type *type,
> }
> /* If the aggregate is a matrix, set its columns' types. */
> } else if (type->is_matrix()) {
> - for (exec_node *expr_node = ai->expressions.head;
> + for (exec_node *expr_node = ai->expressions.head_sentinel.next;
> !expr_node->is_tail_sentinel();
> expr_node = expr_node->next) {
> ast_expression *expr = exec_node_data(ast_expression, expr_node,
> diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
> index dbd064f..7b1b396 100644
> --- a/src/glsl/ir.cpp
> +++ b/src/glsl/ir.cpp
> @@ -786,7 +786,7 @@ ir_constant::ir_constant(const struct glsl_type
> *type, exec_list *value_list)
> this->value.u[i] = 0;
> }
>
> - ir_constant *value = (ir_constant *) (value_list->head);
> + ir_constant *value = (ir_constant *) (value_list->head_sentinel.next);
>
> /* Constructors with exactly one scalar argument are special for
> vectors
> * and matrices. For vectors, the scalar value is replicated to
> fill all
> @@ -1049,7 +1049,7 @@ ir_constant::get_record_field(const char *name)
> if (this->components.is_empty())
> return NULL;
>
> - exec_node *node = this->components.head;
> + exec_node *node = this->components.head_sentinel.next;
> for (int i = 0; i < idx; i++) {
> node = node->next;
>
> @@ -1173,8 +1173,8 @@ ir_constant::has_value(const ir_constant *c) const
> }
>
> if (this->type->base_type == GLSL_TYPE_STRUCT) {
> - const exec_node *a_node = this->components.head;
> - const exec_node *b_node = c->components.head;
> + const exec_node *a_node = this->components.head_sentinel.next;
> + const exec_node *b_node = c->components.head_sentinel.next;
>
> while (!a_node->is_tail_sentinel()) {
> assert(!b_node->is_tail_sentinel());
> diff --git a/src/glsl/ir_clone.cpp b/src/glsl/ir_clone.cpp
> index 914e0e4..c1db237 100644
> --- a/src/glsl/ir_clone.cpp
> +++ b/src/glsl/ir_clone.cpp
> @@ -335,7 +335,7 @@ ir_constant::clone(void *mem_ctx, struct hash_table
> *ht) const
> ir_constant *c = new(mem_ctx) ir_constant;
>
> c->type = this->type;
> - for (exec_node *node = this->components.head
> + for (exec_node *node = this->components.head_sentinel.next
> ; !node->is_tail_sentinel()
> ; node = node->next) {
> ir_constant *const orig = (ir_constant *) node;
> diff --git a/src/glsl/ir_constant_expression.cpp
> b/src/glsl/ir_constant_expression.cpp
> index 171b8e9..dc0c98c 100644
> --- a/src/glsl/ir_constant_expression.cpp
> +++ b/src/glsl/ir_constant_expression.cpp
> @@ -2114,7 +2114,8 @@
> ir_function_signature::constant_expression_value(exec_list
> *actual_parameters, s
> * have to use the variable objects from the object with the body,
> * but the parameter instanciation on the current object.
> */
> - const exec_node *parameter_info = origin ? origin->parameters.head :
> parameters.head;
> + const exec_node *parameter_info = origin ?
> + origin->parameters.head_sentinel.next :
> parameters.head_sentinel.next;
>
> foreach_in_list(ir_rvalue, n, actual_parameters) {
> ir_constant *constant =
> n->constant_expression_value(variable_context);
> diff --git a/src/glsl/ir_function.cpp b/src/glsl/ir_function.cpp
> index 1319443..8cbd5b0 100644
> --- a/src/glsl/ir_function.cpp
> +++ b/src/glsl/ir_function.cpp
> @@ -43,8 +43,8 @@ static parameter_list_match_t
> parameter_lists_match(_mesa_glsl_parse_state *state,
> const exec_list *list_a, const exec_list *list_b)
> {
> - const exec_node *node_a = list_a->head;
> - const exec_node *node_b = list_b->head;
> + const exec_node *node_a = list_a->head_sentinel.next;
> + const exec_node *node_b = list_b->head_sentinel.next;
>
> /* This is set to true if there is an inexact match requiring an
> implicit
> * conversion. */
> @@ -221,9 +221,9 @@ is_best_inexact_overload(const exec_list
> *actual_parameters,
> if (*other == sig)
> continue;
>
> - const exec_node *node_a = sig->parameters.head;
> - const exec_node *node_b = (*other)->parameters.head;
> - const exec_node *node_p = actual_parameters->head;
> + const exec_node *node_a = sig->parameters.head_sentinel.next;
> + const exec_node *node_b = (*other)->parameters.head_sentinel.next;
> + const exec_node *node_p = actual_parameters->head_sentinel.next;
>
> bool better_for_some_parameter = false;
>
> @@ -365,8 +365,8 @@
> ir_function::matching_signature(_mesa_glsl_parse_state *state,
> static bool
> parameter_lists_match_exact(const exec_list *list_a, const exec_list
> *list_b)
> {
> - const exec_node *node_a = list_a->head;
> - const exec_node *node_b = list_b->head;
> + const exec_node *node_a = list_a->head_sentinel.next;
> + const exec_node *node_b = list_b->head_sentinel.next;
>
> for (/* empty */
> ; !node_a->is_tail_sentinel() && !node_b->is_tail_sentinel()
> diff --git a/src/glsl/ir_reader.cpp b/src/glsl/ir_reader.cpp
> index 4eae413..90c4b29 100644
> --- a/src/glsl/ir_reader.cpp
> +++ b/src/glsl/ir_reader.cpp
> @@ -208,7 +208,8 @@ ir_reader::read_function(s_expression *expr, bool
> skip_body)
> /* Skip over "function" tag and function name (which are guaranteed
> to be
> * present by the above PARTIAL_MATCH call).
> */
> - exec_node *node = ((s_list *) expr)->subexpressions.head->next->next;
> + exec_node *node =
> + ((s_list *) expr)->subexpressions.head_sentinel.next->next->next;
> for (/* nothing */; !node->is_tail_sentinel(); node = node->next) {
> s_expression *s_sig = (s_expression *) node;
> read_function_sig(f, s_sig, skip_body);
> @@ -251,7 +252,7 @@ ir_reader::read_function_sig(ir_function *f,
> s_expression *expr, bool skip_body)
> state->symbols->push_scope();
>
> /* Skip over the "parameters" tag. */
> - exec_node *node = paramlist->subexpressions.head->next;
> + exec_node *node = paramlist->subexpressions.head_sentinel.next->next;
> for (/* nothing */; !node->is_tail_sentinel(); node = node->next) {
> ir_variable *var = read_declaration((s_expression *) node);
> if (var == NULL)
> diff --git a/src/glsl/ir_validate.cpp b/src/glsl/ir_validate.cpp
> index cfe0df3..f5efcc5 100644
> --- a/src/glsl/ir_validate.cpp
> +++ b/src/glsl/ir_validate.cpp
> @@ -842,8 +842,9 @@ ir_validate::visit_enter(ir_call *ir)
> abort();
> }
>
> - const exec_node *formal_param_node = callee->parameters.head;
> - const exec_node *actual_param_node = ir->actual_parameters.head;
> + const exec_node *formal_param_node =
> callee->parameters.head_sentinel.next;
> + const exec_node *actual_param_node =
> + ir->actual_parameters.head_sentinel.next;
> while (true) {
> if (formal_param_node->is_tail_sentinel()
> != actual_param_node->is_tail_sentinel()) {
> diff --git a/src/glsl/list.h b/src/glsl/list.h
> index 15fcd4a..ee9145e 100644
> --- a/src/glsl/list.h
> +++ b/src/glsl/list.h
> @@ -32,24 +32,8 @@
> *
> * A list is empty if either the head sentinel's \c next pointer
> points to the
> * tail sentinel or the tail sentinel's \c prev poiner points to the head
> - * sentinel.
> - *
> - * Instead of tracking two separate \c node structures and a \c list
> structure
> - * that points to them, the sentinel nodes are in a single structure.
> Noting
> - * that each sentinel node always has one \c NULL pointer, the \c NULL
> - * pointers occupy the same memory location. In the \c list structure
> - * contains a the following:
> - *
> - * - A \c head pointer that represents the \c next pointer of the
> - * head sentinel node.
> - * - A \c tail pointer that represents the \c prev pointer of the head
> - * sentinel node and the \c next pointer of the tail sentinel
> node. This
> - * pointer is \b always \c NULL.
> - * - A \c tail_prev pointer that represents the \c prev pointer of the
> - * tail sentinel node.
> - *
> - * Therefore, if \c head->next is \c NULL or \c tail_prev->prev is \c
> NULL,
> - * the list is empty.
> + * sentinel. The head sentinel and tail sentinel nodes are allocated
> within
> + * the list structure.
> *
> * Do note that this means that the list nodes will contain pointers
> into the
> * list structure itself and as a result you may not \c realloc() an \c
> @@ -307,9 +291,8 @@ struct exec_node;
> #endif
>
> struct exec_list {
> - struct exec_node *head;
> - struct exec_node *tail;
> - struct exec_node *tail_pred;
> + struct exec_node head_sentinel;
> + struct exec_node tail_sentinel;
>
> #ifdef __cplusplus
> DECLARE_RALLOC_CXX_OPERATORS(exec_list)
> @@ -366,9 +349,10 @@ struct exec_list {
> static inline void
> exec_list_make_empty(struct exec_list *list)
> {
> - list->head = (struct exec_node *) & list->tail;
> - list->tail = NULL;
> - list->tail_pred = (struct exec_node *) & list->head;
> + list->head_sentinel.next = & list->tail_sentinel;
> + list->head_sentinel.prev = NULL;
> + list->tail_sentinel.next = NULL;
> + list->tail_sentinel.prev = & list->head_sentinel;
> }
>
> static inline bool
> @@ -376,39 +360,39 @@ exec_list_is_empty(const struct exec_list *list)
> {
> /* There are three ways to test whether a list is empty or not.
> *
> - * - Check to see if the \c head points to the \c tail.
> - * - Check to see if the \c tail_pred points to the \c head.
> - * - Check to see if the \c head is the sentinel node by test
> whether its
> + * - Check to see if the head sentinel's \c next is the tail sentinel.
> + * - Check to see if the tail sentinel's \c prev is the head sentinel.
> + * - Check to see if the head is the sentinel node by test whether its
> * \c next pointer is \c NULL.
> *
> * The first two methods tend to generate better code on modern
> systems
> * because they save a pointer dereference.
> */
> - return list->head == (struct exec_node *) &list->tail;
> + return list->head_sentinel.next == & list->tail_sentinel;
> }
>
> static inline const struct exec_node *
> exec_list_get_head_const(const struct exec_list *list)
> {
> - return !exec_list_is_empty(list) ? list->head : NULL;
> + return !exec_list_is_empty(list) ? list->head_sentinel.next : NULL;
> }
>
> static inline struct exec_node *
> exec_list_get_head(struct exec_list *list)
> {
> - return !exec_list_is_empty(list) ? list->head : NULL;
> + return !exec_list_is_empty(list) ? list->head_sentinel.next : NULL;
> }
>
> static inline const struct exec_node *
> exec_list_get_tail_const(const struct exec_list *list)
> {
> - return !exec_list_is_empty(list) ? list->tail_pred : NULL;
> + return !exec_list_is_empty(list) ? list->tail_sentinel.prev : NULL;
> }
>
> static inline struct exec_node *
> exec_list_get_tail(struct exec_list *list)
> {
> - return !exec_list_is_empty(list) ? list->tail_pred : NULL;
> + return !exec_list_is_empty(list) ? list->tail_sentinel.prev : NULL;
> }
>
> static inline unsigned
> @@ -417,7 +401,8 @@ exec_list_length(const struct exec_list *list)
> unsigned size = 0;
> struct exec_node *node;
>
> - for (node = list->head; node->next != NULL; node = node->next) {
> + for (node = list->head_sentinel.next; node->next != NULL;
> + node = node->next) {
> size++;
> }
>
> @@ -427,21 +412,21 @@ exec_list_length(const struct exec_list *list)
> static inline void
> exec_list_push_head(struct exec_list *list, struct exec_node *n)
> {
> - n->next = list->head;
> - n->prev = (struct exec_node *) &list->head;
> + n->next = list->head_sentinel.next;
> + n->prev = &list->head_sentinel;
>
> n->next->prev = n;
> - list->head = n;
> + list->head_sentinel.next = n;
> }
>
> static inline void
> exec_list_push_tail(struct exec_list *list, struct exec_node *n)
> {
> - n->next = (struct exec_node *) &list->tail;
> - n->prev = list->tail_pred;
> + n->next = &list->tail_sentinel;
> + n->prev = list->tail_sentinel.prev;
>
> n->prev->next = n;
> - list->tail_pred = n;
> + list->tail_sentinel.prev = n;
> }
>
> static inline void
> @@ -449,10 +434,10 @@ exec_list_push_degenerate_list_at_head(struct
> exec_list *list, struct exec_node
> {
> assert(n->prev->next == n);
>
> - n->prev->next = list->head;
> - list->head->prev = n->prev;
> - n->prev = (struct exec_node *) &list->head;
> - list->head = n;
> + n->prev->next = list->head_sentinel.next;
> + list->head_sentinel.next->prev = n->prev;
> + n->prev = &list->head_sentinel;
> + list->head_sentinel.next = n;
> }
>
> static inline struct exec_node *
> @@ -471,12 +456,13 @@ exec_list_move_nodes_to(struct exec_list *list,
> struct exec_list *target)
> if (exec_list_is_empty(list)) {
> exec_list_make_empty(target);
> } else {
> - target->head = list->head;
> - target->tail = NULL;
> - target->tail_pred = list->tail_pred;
> + target->head_sentinel.next = list->head_sentinel.next;
> + target->head_sentinel.prev = NULL;
> + target->tail_sentinel.next = NULL;
> + target->tail_sentinel.prev = list->tail_sentinel.prev;
>
> - target->head->prev = (struct exec_node *) &target->head;
> - target->tail_pred->next = (struct exec_node *) &target->tail;
> + target->head_sentinel.next->prev = &target->head_sentinel;
> + target->tail_sentinel.prev->next = &target->tail_sentinel;
>
> exec_list_make_empty(list);
> }
> @@ -490,13 +476,13 @@ exec_list_append(struct exec_list *list, struct
> exec_list *source)
>
> /* Link the first node of the source with the last node of the
> target list.
> */
> - list->tail_pred->next = source->head;
> - source->head->prev = list->tail_pred;
> + list->tail_sentinel.prev->next = source->head_sentinel.next;
> + source->head_sentinel.next->prev = list->tail_sentinel.prev;
>
> /* Make the tail of the source list be the tail of the target list.
> */
> - list->tail_pred = source->tail_pred;
> - list->tail_pred->next = (struct exec_node *) &list->tail;
> + list->tail_sentinel.prev = source->tail_sentinel.prev;
> + list->tail_sentinel.prev->next = &list->tail_sentinel;
>
> /* Make the source list empty for good measure.
> */
> @@ -516,11 +502,11 @@ exec_node_insert_list_before(struct exec_node *n,
> struct exec_list *before)
> if (exec_list_is_empty(before))
> return;
>
> - before->tail_pred->next = n;
> - before->head->prev = n->prev;
> + before->tail_sentinel.prev->next = n;
> + before->head_sentinel.next->prev = n->prev;
>
> - n->prev->next = before->head;
> - n->prev = before->tail_pred;
> + n->prev->next = before->head_sentinel.next;
> + n->prev = before->tail_sentinel.prev;
>
> exec_list_make_empty(before);
> }
> @@ -530,15 +516,17 @@ exec_list_validate(const struct exec_list *list)
> {
> const struct exec_node *node;
>
> - assert(list->head->prev == (const struct exec_node *) &list->head);
> - assert(list->tail == NULL);
> - assert(list->tail_pred->next == (const struct exec_node *)
> &list->tail);
> + assert(list->head_sentinel.next->prev == &list->head_sentinel);
> + assert(list->head_sentinel.prev == NULL);
> + assert(list->tail_sentinel.next == NULL);
> + assert(list->tail_sentinel.prev->next == &list->tail_sentinel);
>
> /* We could try to use one of the interators below for this but
> they all
> * either require C++ or assume the exec_node is embedded in a
> structure
> * which is not the case for this function.
> */
> - for (node = list->head; node->next != NULL; node = node->next) {
> + for (node = list->head_sentinel.next; node->next != NULL;
> + node = node->next) {
> assert(node->next->prev == node);
> assert(node->prev->next == node);
> }
> @@ -622,12 +610,12 @@ inline void exec_node::insert_before(exec_list
> *before)
> #endif
>
> #define foreach_in_list(__type, __inst, __list) \
> - for (__type *(__inst) = (__type *)(__list)->head; \
> + for (__type *(__inst) = (__type *)(__list)->head_sentinel.next; \
> !(__inst)->is_tail_sentinel(); \
> (__inst) = (__type *)(__inst)->next)
>
> #define foreach_in_list_reverse(__type, __inst, __list) \
> - for (__type *(__inst) = (__type *)(__list)->tail_pred; \
> + for (__type *(__inst) = (__type *)(__list)->tail_sentinel.prev; \
> !(__inst)->is_head_sentinel(); \
> (__inst) = (__type *)(__inst)->prev)
>
> @@ -635,20 +623,20 @@ inline void exec_node::insert_before(exec_list
> *before)
> * This version is safe even if the current node is removed.
> */
> #define foreach_in_list_safe(__type, __node, __list) \
> - for (__type *__node = (__type *)(__list)->head, \
> + for (__type *__node = (__type *)(__list)->head_sentinel.next, \
> *__next = (__type *)__node->next; \
> __next != NULL; \
> __node = __next, __next = (__type *)__next->next)
>
> #define foreach_in_list_reverse_safe(__type, __node, __list) \
> - for (__type *__node = (__type *)(__list)->tail_pred, \
> + for (__type *__node = (__type *)(__list)->tail_sentinel.prev, \
> *__prev = (__type *)__node->prev; \
> __prev != NULL; \
> __node = __prev, __prev = (__type *)__prev->prev)
>
> #define foreach_in_list_use_after(__type, __inst, __list) \
> __type *(__inst); \
> - for ((__inst) = (__type *)(__list)->head; \
> + for ((__inst) = (__type *)(__list)->head_sentinel.next; \
> !(__inst)->is_tail_sentinel(); \
> (__inst) = (__type *)(__inst)->next)
> /**
> @@ -657,8 +645,8 @@ inline void exec_node::insert_before(exec_list *before)
> * This is safe against either current node being removed or replaced.
> */
> #define foreach_two_lists(__node1, __list1, __node2, __list2) \
> - for (struct exec_node * __node1 = (__list1)->head, \
> - * __node2 = (__list2)->head, \
> + for (struct exec_node * __node1 = (__list1)->head_sentinel.next, \
> + * __node2 = (__list2)->head_sentinel.next, \
> * __next1 = __node1->next, \
> * __next2 = __node2->next \
> ; __next1 != NULL && __next2 != NULL \
> @@ -669,19 +657,19 @@ inline void exec_node::insert_before(exec_list
> *before)
>
> #define foreach_list_typed(__type, __node, __field, __list) \
> for (__type * __node = \
> - exec_node_data(__type, (__list)->head, __field); \
> + exec_node_data(__type, (__list)->head_sentinel.next, __field); \
> (__node)->__field.next != NULL; \
> (__node) = exec_node_data(__type, (__node)->__field.next, __field))
>
> #define foreach_list_typed_reverse(__type, __node, __field,
> __list) \
> for (__type * __node
> = \
> - exec_node_data(__type, (__list)->tail_pred, __field); \
> + exec_node_data(__type, (__list)->tail_sentinel.prev,
> __field); \
> (__node)->__field.prev != NULL; \
> (__node) = exec_node_data(__type, (__node)->__field.prev,
> __field))
>
> #define foreach_list_typed_safe(__type, __node, __field,
> __list) \
> for (__type * __node
> = \
> - exec_node_data(__type, (__list)->head,
> __field), \
> + exec_node_data(__type, (__list)->head_sentinel.next,
> __field), \
> * __next
> = \
> exec_node_data(__type, (__node)->__field.next,
> __field); \
> (__node)->__field.next !=
> NULL; \
> @@ -690,7 +678,7 @@ inline void exec_node::insert_before(exec_list *before)
>
> #define foreach_list_typed_safe_reverse(__type, __node, __field,
> __list) \
> for (__type * __node
> = \
> - exec_node_data(__type, (__list)->tail_pred,
> __field), \
> + exec_node_data(__type, (__list)->tail_sentinel.prev,
> __field), \
> * __prev
> = \
> exec_node_data(__type, (__node)->__field.prev,
> __field); \
> (__node)->__field.prev !=
> NULL; \
> diff --git a/src/glsl/lower_clip_distance.cpp
> b/src/glsl/lower_clip_distance.cpp
> index 01f028b..f8e130b 100644
> --- a/src/glsl/lower_clip_distance.cpp
> +++ b/src/glsl/lower_clip_distance.cpp
> @@ -476,8 +476,10 @@ lower_clip_distance_visitor::visit_leave(ir_call *ir)
> {
> void *ctx = ralloc_parent(ir);
>
> - const exec_node *formal_param_node = ir->callee->parameters.head;
> - const exec_node *actual_param_node = ir->actual_parameters.head;
> + const exec_node *formal_param_node =
> + ir->callee->parameters.head_sentinel.next;
> + const exec_node *actual_param_node =
> + ir->actual_parameters.head_sentinel.next;
> while (!actual_param_node->is_tail_sentinel()) {
> ir_variable *formal_param = (ir_variable *) formal_param_node;
> ir_rvalue *actual_param = (ir_rvalue *) actual_param_node;
> diff --git a/src/glsl/lower_jumps.cpp b/src/glsl/lower_jumps.cpp
> index ec7a0c5..cdc362e 100644
> --- a/src/glsl/lower_jumps.cpp
> +++ b/src/glsl/lower_jumps.cpp
> @@ -777,7 +777,7 @@ lower_continue:
> * analysis.
> */
> exec_list list;
> - list.head = next;
> + list.head_sentinel.next = next;
> block_records[move_into] = visit_block(&list);
>
> /*
> diff --git a/src/glsl/lower_packed_varyings.cpp
> b/src/glsl/lower_packed_varyings.cpp
> index d8bebb5..32118e5 100644
> --- a/src/glsl/lower_packed_varyings.cpp
> +++ b/src/glsl/lower_packed_varyings.cpp
> @@ -719,7 +719,8 @@ lower_packed_varyings(void *mem_ctx, unsigned
> locations_used,
> lower_packed_varyings_gs_splicer splicer(mem_ctx,
> &new_instructions);
>
> /* Add all the variables in first. */
> - main_func_sig->body.head->insert_before(&new_variables);
> + main_func_sig->body.head_sentinel.next->
> + insert_before(&new_variables);
>
> /* Now update all the EmitVertex instances */
> splicer.run(instructions);
> @@ -732,7 +733,8 @@ lower_packed_varyings(void *mem_ctx, unsigned
> locations_used,
> }
> } else {
> /* Shader inputs need to be lowered at the beginning of main() */
> - main_func_sig->body.head->insert_before(&new_instructions);
> - main_func_sig->body.head->insert_before(&new_variables);
> + main_func_sig->body.head_sentinel.next->
> + insert_before(&new_instructions);
> + main_func_sig->body.head_sentinel.next->insert_before(&new_variables);
> }
> }
> diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
> index f03e80a..5338b68 100644
> --- a/src/glsl/nir/nir.c
> +++ b/src/glsl/nir/nir.c
> @@ -1169,14 +1169,16 @@ nir_cf_node_insert_before(nir_cf_node *node,
> nir_cf_node *before)
> void
> nir_cf_node_insert_begin(struct exec_list *list, nir_cf_node *node)
> {
> - nir_cf_node *begin = exec_node_data(nir_cf_node, list->head, node);
> + nir_cf_node *begin = exec_node_data(nir_cf_node,
> + list->head_sentinel.next, node);
> nir_cf_node_insert_before(begin, node);
> }
>
> void
> nir_cf_node_insert_end(struct exec_list *list, nir_cf_node *node)
> {
> - nir_cf_node *end = exec_node_data(nir_cf_node, list->tail_pred, node);
> + nir_cf_node *end = exec_node_data(nir_cf_node,
> + list->tail_sentinel.prev, node);
> nir_cf_node_insert_after(end, node);
> }
>
> diff --git a/src/glsl/nir/nir_opt_gcm.c b/src/glsl/nir/nir_opt_gcm.c
> index 44068bf..f091118 100644
> --- a/src/glsl/nir/nir_opt_gcm.c
> +++ b/src/glsl/nir/nir_opt_gcm.c
> @@ -477,7 +477,7 @@ opt_gcm_impl(nir_function_impl *impl)
>
> while (!exec_list_is_empty(&state.instrs)) {
> nir_instr *instr = exec_node_data(nir_instr,
> - state.instrs.tail_pred, node);
> + state.instrs.tail_sentinel.prev, node);
> gcm_place_instr(instr, &state);
> }
>
> diff --git a/src/glsl/opt_conditional_discard.cpp
> b/src/glsl/opt_conditional_discard.cpp
> index 8a3ad24..033033b 100644
> --- a/src/glsl/opt_conditional_discard.cpp
> +++ b/src/glsl/opt_conditional_discard.cpp
> @@ -65,13 +65,15 @@ opt_conditional_discard_visitor::visit_leave(ir_if *ir)
> {
> /* Look for "if (...) discard" with no else clause or extra
> statements. */
> if (ir->then_instructions.is_empty() ||
> - !ir->then_instructions.head->next->is_tail_sentinel() ||
> - !((ir_instruction *) ir->then_instructions.head)->as_discard() ||
> + !ir->then_instructions.head_sentinel.next->next->is_tail_sentinel() ||
> + !((ir_instruction *) ir->then_instructions.head_sentinel.next)->
> + as_discard() ||
> !ir->else_instructions.is_empty())
> return visit_continue;
>
> /* Move the condition and replace the ir_if with the ir_discard. */
> - ir_discard *discard = (ir_discard *) ir->then_instructions.head;
> + ir_discard *discard =
> + (ir_discard *) ir->then_instructions.head_sentinel.next;
> discard->condition = ir->condition;
> ir->replace_with(discard);
>
> diff --git a/src/glsl/opt_constant_variable.cpp
> b/src/glsl/opt_constant_variable.cpp
> index 7222eb9..79f27e8 100644
> --- a/src/glsl/opt_constant_variable.cpp
> +++ b/src/glsl/opt_constant_variable.cpp
> @@ -175,7 +175,8 @@ do_constant_variable(exec_list *instructions)
> while (!v.list.is_empty()) {
>
> struct assignment_entry *entry;
> - entry = exec_node_data(struct assignment_entry, v.list.head, link);
> + entry = exec_node_data(struct assignment_entry,
> + v.list.head_sentinel.next, link);
>
> if (entry->assignment_count == 1 && entry->constval &&
> entry->our_scope) {
> entry->var->constant_value = entry->constval;
> diff --git a/src/glsl/opt_dead_builtin_varyings.cpp
> b/src/glsl/opt_dead_builtin_varyings.cpp
> index 31719d2..b4a2a90 100644
> --- a/src/glsl/opt_dead_builtin_varyings.cpp
> +++ b/src/glsl/opt_dead_builtin_varyings.cpp
> @@ -373,7 +373,7 @@ public:
> new_var[i]->data.explicit_index = 0;
> }
>
> - ir->head->insert_before(new_var[i]);
> + ir->head_sentinel.next->insert_before(new_var[i]);
> }
> }
> }
> diff --git a/src/glsl/opt_flatten_nested_if_blocks.cpp
> b/src/glsl/opt_flatten_nested_if_blocks.cpp
> index c702102..de87dfc 100644
> --- a/src/glsl/opt_flatten_nested_if_blocks.cpp
> +++ b/src/glsl/opt_flatten_nested_if_blocks.cpp
> @@ -90,7 +90,8 @@ nested_if_flattener::visit_leave(ir_if *ir)
> if (ir->then_instructions.is_empty() ||
> !ir->else_instructions.is_empty())
> return visit_continue;
>
> - ir_if *inner = ((ir_instruction *)
> ir->then_instructions.head)->as_if();
> + ir_if *inner =
> + ((ir_instruction *)
> ir->then_instructions.head_sentinel.next)->as_if();
> if (!inner || !inner->next->is_tail_sentinel() ||
> !inner->else_instructions.is_empty())
> return visit_continue;
> diff --git a/src/mesa/drivers/dri/i965/brw_cfg.h
> b/src/mesa/drivers/dri/i965/brw_cfg.h
> index a094917..a632c5f 100644
> --- a/src/mesa/drivers/dri/i965/brw_cfg.h
> +++ b/src/mesa/drivers/dri/i965/brw_cfg.h
> @@ -313,12 +313,12 @@ struct cfg_t {
> #define foreach_inst_in_block(__type, __inst, __block) \
> foreach_in_list(__type, __inst, &(__block)->instructions)
>
> -#define foreach_inst_in_block_safe(__type, __inst, __block) \
> - for (__type *__inst = (__type *)__block->instructions.head, \
> - *__next = (__type *)__inst->next, \
> - *__end = (__type *)__block->instructions.tail; \
> - __next != __end; \
> - __inst = __next, \
> +#define foreach_inst_in_block_safe(__type, __inst,
> __block) \
> + for (__type *__inst = (__type
> *)__block->instructions.head_sentinel.next, \
> + *__next = (__type
> *)__inst->next, \
> + *__end = (__type
> *)__block->instructions.tail_sentinel.next; \
> + __next !=
> __end; \
> + __inst =
> __next, \
> __next = (__type *)__next->next)
>
> #define foreach_inst_in_block_reverse(__type, __inst, __block) \
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_builder.h
> b/src/mesa/drivers/dri/i965/brw_fs_builder.h
> index 58ac598..9f08ea5 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_builder.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs_builder.h
> @@ -85,7 +85,7 @@ namespace brw {
> fs_builder
> at_end() const
> {
> - return at(NULL, (exec_node *)&shader->instructions.tail);
> + return at(NULL, &shader->instructions.tail_sentinel);
> }
>
> /**
More information about the mesa-dev
mailing list