[Mesa-dev] [PATCH] Fix strict-aliasing violations in GLSL shader list implementation

Connor Abbott cwabbott0 at gmail.com
Wed Jun 24 11:41:00 PDT 2015


On Wed, Jun 24, 2015 at 3:59 PM, Davin McCall <davmac at davmac.org> wrote:
> Hi Ian,
>
> On 23/06/15 23:26, Ian Romanick wrote:
>
> On 06/23/2015 02:36 PM, Thomas Helland wrote:
>
> 2015-06-24 23:05 GMT+02:00 Davin McCall <davmac at davmac.org>:
>
> Hi - I'm new here.
>
> I've recently started poking the Mesa codebase for little reason other than
> personal interest. In the "help wanted" section of the website it mentions
> aliasing violations as a target for newcomers to fix, so with that in mind
> I've attached a patch (against git head) which resolves a few of them, by
> targeting the linked list implementation (list.h) used in the GLSL
> compiler/optimizers. This change slightly increases the storage requirements
> for a list (adds one word) but resolves the blatant aliasing violation that
> was caused by the trick used to conserve that word in the first place.
>
> (I toyed with another approach - using a single sentinel node for both the
> head and tail of a list - but this was much more invasive, and meant that
> you could no longer check whether a particular node was a sentinel node
> unless you had a reference to the list, so I gave up and went with this
> simpler approach).
>
> The most essential change is in the 'exec_list' structure. Three fields
> 'head', 'tail' and 'tail_pred' are removed, and two separate sentinel nodes
> are inserted in their place. The old 'head' is replaced by
> 'head_sentinel.next', 'tail_pred' by 'tail_sentinel.prev', and tail (always
> NULL) by 'head_sentinel.prev' and 'tail_sentinel.next' (both always NULL).
>
> NAK.  The datastructure is correct as-is.  It has been in common use
> since at least 1985.  See the references in the header file.
>
>
> I understand the data structure and how it is supposed to work; the issue is
> that the trick it employs cannot be implemented in C without breaking the
> strict aliasing rules (or at least, the current implementation in Mesa
> breaks the strict aliasing rules). The current code works correctly only
> with the -fno-strict-aliasing compiler option. The issue is that a pair of
> 'exec_node *' do not constitute an exec_node in the eyes of the language
> spec, even though exec_node is declared as holding two such pointers.
> Consider (from src/glsl/list.h):
>
> static inline void
> exec_list_make_empty(struct exec_list *list)
> {
>    list->head = (struct exec_node *) & list->tail;
>    list->tail = NULL;
>    list->tail_pred = (struct exec_node *) & list->head;
> }
>
>
> 'list->head' is of type 'struct exec_node *', and so should point at a
> 'struct  exec_node'. In the code above it is instead co-erced to point at a
> 'struct exec_node *' (list->tail). That in itself doesn't break the alias
> rules, but then:
>
> static inline bool
> exec_node_is_tail_sentinel(const struct exec_node *n)
> {
>    return n->next == NULL;
> }
>
>
> In 'exec_node_is_tail_sentinel', the sentinel is not actually an exec_node -
> it is &list->tail. So, if the parameter n does refer to the sentinel, then
> it does not point to an actual exec_node structure. However, it is
> de-referenced (by 'n->next') which breaks the strict aliasing rules. This
> means that the method above can only ever return false, unless it violates
> the aliasing rules.
>
> (The above method could be fixed by casting n to an 'struct exec_node **'
> and then comparing '**n' against NULL. However, there are other similar
> examples throughout the code that I do not think would be so trivial).
>
> I can quote the relevant parts of the standard if necessary, but your tone
> somewhat implies that you wouldn't even consider this patch?
>
> Davin
>
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>

(I couldn't find your original message for some reason, so I'm
replying to this one...)

I think a better solution would be to use __attribute((__may_alias__))
for this case, which is supported at least on GCC and clang (I'm not
sure how recent though) and fall back to -fno-strict-aliasing if it
isn't available. It seems reasonable to require the use of an escape
hatch if we want to use a compiler that takes advantage of the strict
aliasing rules.

On the other hand, I share the concern of other people that it's going
to be very, very difficult to tell whether we're violating strict
aliasing rules. GCC does have -Wstrict-aliasing, but that won't catch
everything, and sometimes these sorts of issues can fester for a while
before a specific combination of compiler flags/code paths/phase of
the moon etc. causes it to break. And the fact that a lot of
developers don't even know about these rules makes things even worse.
We'd need to do some *serious* testing before enabling it both to
justify the perf gain and to root out any bugs, and even then there's
no guarantee that we won't get bitten by it years down the line.


More information about the mesa-dev mailing list