[Mesa-dev] [PATCH v2] glsls: Modify exec_list to avoid strict-aliasing violations
Davin McCall
davmac at davmac.org
Fri Jun 26 06:26:45 PDT 2015
On 26/06/15 13:18, Francisco Jerez wrote:
> Davin McCall <davmac at davmac.org> writes:
>
>> On 26/06/15 11:08, Erik Faye-Lund wrote:
>>> On Thu, Jun 25, 2015 at 1:48 AM, Davin McCall <davmac at davmac.org> wrote:
>>>> This is an alternative to my earlier patch [1] (and it is now constructed
>>>> properly using git format-patch).
>>>>
>>>> Quick background:
>>>> There is a problem in exec_list due to it directly including a trio
>>>> of 'struct exec_node *' members to implement two overlapping sentinel
>>>> nodes. The sentinel nodes do not exist as exec_node objects and so
>>>> should not be accessed as such, according to C99 6.5 paragraph 7.
>>>> When this strict aliasing rule is violated the compiler may re-order
>>>> reads and writes in unexpected ways, such as demonstrated in another
>>>> email [2].
>>>>
>>>> The problem only manifests if compiling without -fno-strict-aliasing.
>>>>
>>>> This patch addresses the issue by introducing some new methods for
>>>> setting the 'next' and 'prev' members of the exec_node structure, which
>>>> avoid the aliasing restrictions by way of casting the exec_node pointer
>>>> (to an exec_node-pointer-pointer) whenever it may actual point to a
>>>> sentinel node. Essentially an exec_node can be seen as an array of two
>>>> exec_node pointers, and this view is compatible with the sentinel
>>>> structure in exec_list.
>>>>
>>>> Compared to the previous patch, this patch is much less intrusive, and
>>>> does not increase the storage requirements of the exec_list structure.
>>>>
>>>> While I'm not proposing that -fno-strict-aliasing no longer be used for
>>>> Mesa builds, this patch represents a step in that direction. With this
>>>> patch applied, a working Mesa library can be built, although bugs may
>>>> be present (and could be triggered only when using particular compiler
>>>> versions and options). FWIW file sizes with and without strict aliasing:
>>>>
>>>> (gcc 4.8.4, -O3 -fomit-frame-pointer -march=corei7).
>>>>
>>>> -fno-strict-aliasing: with strict aliasing:
>>>> libGL.so 699188 699188 (no change)
>>>> *_dri.so 9575876 9563104 (-2772)
>>>>
>>>> (dri bundle includes r300, r600, kms_swrast and swrast).
>>>>
>>>> So, not a huge win, size-wise. Dave Airlie reports a 30K difference in
>>>> his r600_dri.so build however [3].
>>>>
>>>> In terms of performance:
>>>>
>>>> (export LIBGL_ALWAYS_SOFTWARE=1; time glmark2)
>>>>
>>>> -fno-strict-aliasing:
>>>>
>>>> glmark2 Score: 244
>>>> real 5m34.707s
>>>> user 11m36.192s
>>>> sys 0m29.596s
>>>>
>>>> with strict aliasing:
>>>>
>>>> glmark2 Score: 247
>>>> real 5m34.438s
>>>> user 11m29.904s
>>>> sys 0m29.556s
>>>>
>>>> Again, only a very small improvement when strict aliasing is enabled.
>>>>
>>>> With the above in mind it is reasonable to question whether this patch
>>>> is worthwhile. However, it's done, and it seems to work, so I offer it
>>>> for review.
>>>>
>>>>
>>>> [1] http://lists.freedesktop.org/archives/mesa-dev/2015-June/087179.html
>>>> [2] http://lists.freedesktop.org/archives/mesa-dev/2015-June/087246.html
>>>> [3] http://lists.freedesktop.org/archives/mesa-dev/2015-June/087206.html
>>>> ---
>>>> src/glsl/list.h | 123
>>>> ++++++++++++++++++++++++++++++++++++--------------------
>>>> 1 file changed, 80 insertions(+), 43 deletions(-)
>>>>
>>>> diff --git a/src/glsl/list.h b/src/glsl/list.h
>>>> index 15fcd4a..cfbe5a9 100644
>>>> --- a/src/glsl/list.h
>>>> +++ b/src/glsl/list.h
>>>> @@ -76,6 +76,12 @@
>>>> #include "util/ralloc.h"
>>>>
>>>> struct exec_node {
>>>> + /**
>>>> + * Accessing these fields directly may be ill-advised; if the
>>>> 'exec_node'
>>>> + * is actually a sentinel node embedded in the exec_list structure, it
>>>> may
>>>> + * be a strict-aliasing violation (C99 6.5 paragraph 7). Use the methods
>>>> + * provided instead.
>>>> + */
>>>> struct exec_node *next;
>>>> struct exec_node *prev;
>>>>
>>>> @@ -140,35 +146,55 @@ exec_node_init(struct exec_node *n)
>>>> n->prev = NULL;
>>>> }
>>>>
>>>> +/**
>>>> + * Strict-aliasing safe method for setting the next pointer for any
>>>> + * node, including sentinel nodes.
>>>> + */
>>>> +static inline void
>>>> +exec_node_set_next(struct exec_node *n, struct exec_node *next)
>>>> +{
>>>> + ((struct exec_node **)n)[0] = next;
>>>> +}
>>>> +
>>>> +/**
>>>> + * Strict-aliasing safe method for setting the next pointer for any
>>>> + * node, including sentinel nodes.
>>>> + */
>>>> +static inline void
>>>> +exec_node_set_prev(struct exec_node *n, struct exec_node *next)
>>>> +{
>>>> + ((struct exec_node **)n)[1] = next;
>>>> +}
>>>> +
>>>> static inline const struct exec_node *
>>>> exec_node_get_next_const(const struct exec_node *n)
>>>> {
>>>> - return n->next;
>>>> + return ((const struct exec_node **)n)[0];
>>>> }
>>> How exactly is this supposed to be strict-aliasing safe?
>>>
>>> Here's the wording from the C++14 spec:
>>>
>>> "If a program attempts to access the stored value of an object through
>>> a glvalue of other than one of the following types the behavior is
>>> undefined:
>>> * the dynamic type of the object,
>>> * a cv-qualified version of the dynamic type of the object,
>>> * a type similar (as defined in 4.4) to the dynamic type of the object,
>>> * a that is the signed or unsigned type corresponype tding to the
>>> dynamic type of the object,
>>> * a type that is the signed or unsigned type corresponding to a
>>> cv-qualified version of the dynamic type of the object,
>>> * an aggregate or union type that includes one of the aforementioned
>>> types among its elements or non-static data members (including,
>>> recursively, an element or non-static data member of a subaggregate or
>>> contained union),
>>> * a type that is a (possibly cv-qualified) base class type of the
>>> dynamic type of the object,
>>> * a char or unsigned char type."
>>>
>>> You cast an 'struct exec_node *' to 'struct exec_node **', and read
>>> it. Those are different types, and are not among the exceptions listed
>>> above...
>>
>> In for eg:
>>
>> return ((const struct exec_node **)n)[0]
>>
>> ... The stored value of 'n' is not accessed by any other type than the
>> type of n itself. This value is then cast to a different pointer type.
>> You are mistaken if you think that the cast accesses the stored value of
>> n. The other "stored value" access that it occurs in that expression is
>> to the object pointed at by the result of the cast. As the exec_node
>> structure is defined as:
>>
>> struct exec_node *next;
>> struct exec_node *prev;
>>
>> ... then the value being accessed is the 'next' member of the exec_node
>> struct, not the exec_node struct itself. I.e. store value of type
>> 'struct exec_node *' is being accessed via a glvalue of that same type.
>>
>> Putting it another way, casting some pointer p, of type 'struct
>> exec_node *', to 'struct exec node **', is effectively the same as '&
>> p->next', but without the de-reference of p (which is the potential an
>> aliasing violation that the patch avoids).
>>
>>
>> Although I don't think this is your concern, the cast is valid due to:
>> C99 6.7.2.1 p13:
>> "[...]. A pointer to a structure object, suitably converted, points to
>> its initial member (or if that member is a
>> bit-field, then to the unit in which it resides), and vice versa."
>>
>> I don't have C++14, so I've checked C++11. I'm less familiar with the
>> C++ standard text in general, so I'm not sure where exactly the
>> equivalent to the C99 is, but I am quite certain that the rules are
>> essentially the same.
>>
>> Perhaps this snippet is all that's really required:
>>
>> C++11 1.8: "Unless an object is a bit-field or a base class subobject of
>> zero size, the address of that object is the address
>> of the first byte it occupies."
>>
>> There's also 4.10p2:
>>
>> "A prvalue of type “pointer to cv T,” where T is an object type, can be
>> converted to a prvalue of type “pointer to cv void”. The result of
>> converting a “pointer to cv T” to a “pointer to cv void” points to the
>> start of the storage location where the object of type T resides [...]"
>>
>> ... which to some extent implies that maybe you should cast first to
>> "void *" then to the member type, but in practice this isn't necessary.
>>
> That may be right, but it still seems kind of difficult to enforce that
> other users of the linked list won't end up breaking the strict aliasing
> rule by accident unless we change its data structures completely
> (e.g. you still have exec_list objects pointed to by exec_node pointers
> what makes it really tempting for other users to bypass the wrappers you
> have introduced and break the rule again).
Yes. I believe there are such cases in the code already; the patch does
not set out to correct all the aliasing violations. Such a patch would
be quite large. This was an attempt to make some progress, not resolve
the problem completely.
> Another issue is that AFAICT this introduces a new kind of UB e.g. when
> you do "((const struct exec_node **)n)[1]". The layout of the exec_node
> (or exec_list) object "n" points to is unspecified, and the array
> subscript past the end of the pointed-to object is undefined because of
> section 6.5.6 of the C99 spec regarding the additive operators:
>
> | 7 For the purposes of these operators, a pointer to an object that is
> | not an element of an array behaves the same as a pointer to the
> | first element of an array of length one with the type of the object
> | as its element type.
> |
> | 8 [..] If the result points one past the last element of the array
> | object, it shall not be used as the operand of a unary * operator
> | that is evaluated.
>
> (Array subscripting is later on defined in terms of the '+' and '*'
> operators)
It's true that this is what the standard says, but this is one of those
cases where actual compiler implementations are more permissive than the
standard, as a matter of practicality.
That being said, the code could be made to work without the array
subscripting by casting first to 'char *' before doing the pointer
arithmetic, and using 'offsetof' to be certain of the correct address of
the next/prev members.
> That said, this problem is IMHO worth fixing regardless of the
> performance improvement, this linked list implementation relies on UB
> which we work around by passing a non-standard compiler flag to, AFAIK,
> only GCC and Clang, there's no guarantee that other compilers won't take
> advantage of the strict aliasing rule and miscompile this code.
I agree.
> We may
> just as well replace this linked list with another of the many
> implementations we have, as someone else suggested earlier.
The main problem is that the exec_node structure allows making a
determination of whether a node is at the head or the end of the list
without requiring a reference to the list structure itself. The list
implementation in src/util/list.h is on the other hand a simple
doubly-linked list where the list head is also the list tail and neither
can be distinguished from other list nodes (for exec_node, the next
pointer will be NULL in the tail sentinel, and the prev pointer will be
NULL in the head sentinel).
The amount of effort required to use the src/util/list.h implementation
wherever the exec_list is currently used would be large, and would
probably incur a memory overhead because in various places you would
need to store a reference to the list as well as the node where you
previously only needed a reference to the node.
The other alternative would be to make src/util/list.h use the exec_list
implementation, and hope that there's no code that requires on the list
being circular. You'd still have the aliasing violation to contend with,
too, unless you use my patch. So, even with this as an ultimate goal,
the patch I've provided makes progress. If there was concern about the
casting hoops necessary to avoid strict aliasing, the alternative would
be to use version 1 of the patch, which split the head sentinel and tail
sentinel into two nodes.
Davin
More information about the mesa-dev
mailing list