<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 24/06/15 03:35, Ian Romanick wrote:<br>
</div>
<blockquote cite="mid:558A1762.5000908@freedesktop.org" type="cite">I'd
also like to see assembly dumps with and without
-fno-strict-aliasing of some place where this goes wrong. If you,
Davin, or someone else can point out a specific function that
actually
does the wrong thing, I can generate assembly myself.
For that matter... how the heck is the ralloc (or any memory
allocator)
implementation valid?<br>
</blockquote>
<br>
Ok, got one. In src/glsl/link_varying.cpp, function
"canonicalize_shader_io". I compiled with gcc 4.8.4 on x86, with
CFLAGS="-O3 -fomit-frame-pointer -march=corei7". The method has a
loop that looks like this:<br>
<br>
<br>
<tt> /* Remove the variable from it's current location in the IR,
and put it at</tt><tt><br>
</tt><tt> * the front.</tt><tt><br>
</tt><tt> */</tt><tt><br>
</tt><tt> for (unsigned i = 0; i < num_variables; i++) {</tt><tt><br>
</tt><tt> var_table[i]->remove();</tt><tt><br>
</tt><tt> ir->push_head(var_table[i]);</tt><tt><br>
</tt><tt> }</tt><tt><br>
</tt><br>
<br>
With -fno-strict-aliasing (i.e. "correctly compiled") this becomes
(with my comments added):<br>
<br>
<blockquote><tt> .p2align 4,,10</tt><tt><br>
</tt><tt> .p2align 3<br>
# at this point: %ecx = var_table</tt><tt>, %eax = ir,<br>
# %edi = var_table + num_variables<br>
</tt><tt>.L59:</tt><tt><br>
</tt><tt> movl (%ecx), %edx </tt><tt># %edx =
var_table[i]</tt><tt><br>
# call to remove() is inlined beginning here:<br>
</tt><tt> movl 4(%edx), %ebp # %ebp =
(%edx)->next</tt><tt><br>
</tt><tt> movl 8(%edx), %esi # %esi =
(%edx)->prev</tt><tt><br>
</tt><tt> movl %esi, 4(%ebp) # (%edx)->next->prev
= (%edx)->prev</tt><tt><br>
</tt><tt> movl 4(%edx), %esi # %esi =
(%edx)->next</tt><tt> (reload)<br>
</tt><tt> movl 8(%edx), %ebp</tt><tt> # %ebp =
(%edx)->prev (reload)<br>
</tt><tt> movl %esi, 0(%ebp)</tt><tt> #
(%edx)->prev->next = (%edx)->next<br>
</tt><tt> movl $0, %esi</tt><tt><br>
</tt><tt> movl $0, 4(%edx)</tt><tt> #
(%edx)->next = NULL<br>
</tt><tt> movl $0, 8(%edx) # (%edx)->prev =
NULL</tt><tt><br>
# inlined call to remove() completes.<br>
</tt><tt> movl (%ecx), %ebp</tt><tt> # %ebp =
var_table[i]<br>
</tt><tt> leal 4(%ebp), %edx # %edx = (struct
exec_node *)(%ebp)</tt><tt><br>
</tt><tt> testl %ebp, %ebp</tt><tt> # NULL check(?!)<br>
</tt><tt> movl (%eax), %ebp </tt><tt># Note that
%eax = ir: %ebp=ir->head<br>
</tt><tt> cmove %esi, %edx</tt><tt> # %edx = NULL
if var_table[i]==NULL<br>
</tt><tt> addl $4, %ecx # (i++)</tt><tt><br>
</tt><tt> cmpl %edi, %ecx # (i <
num_variables?</tt><tt>)<br>
# ir->push_head(var_table[i])<br>
</tt><tt> movl %eax, 4(%edx) # </tt><tt>var_table[i]->prev
= & ir->head<br>
</tt><tt> movl %ebp, (%edx) # var_table[i]->next
= ir->head</tt><tt><br>
</tt><tt> movl %edx, 4(%ebp) # ir->head->prev
= var_table[i]</tt><tt><br>
</tt><tt> movl %edx, (%eax) </tt><tt># ir->head
= var_table[i]<br>
</tt><tt> jne .L59</tt><tt><br>
</tt></blockquote>
<br>
This more-or-less makes sense (there is an unnecessary NULL check
generated by the implicit type conversion of ir_variable* to
exec_node*). Now, without -fno-strict-aliasing it's:<br>
<br>
<blockquote><tt> jmp .L59 </tt><tt># note this jump<br>
</tt><tt> .p2align 4,,10</tt><tt><br>
</tt><tt> .p2align 3</tt><tt><br>
</tt><tt>.L74:</tt><tt><br>
</tt><tt> movl %esi, %edi</tt><tt><br>
# %ecx = var_table, %eax = ir, %edi = ir->head<br>
</tt><tt>.L59:</tt><tt><br>
</tt><tt> # inlined remove():<br>
movl (%ecx), %edx </tt><tt># %edx = var_table[i]<br>
</tt><tt> movl 4(%edx), %esi # %esi = (%edx)->next</tt><tt><br>
</tt><tt> movl 8(%edx), %ebp # %ebp = (%edx)->prev</tt><tt><br>
</tt><tt> movl %ebp, 4(%esi) #
(%esi)->next->prev = </tt><tt>(%edx)->prev<br>
</tt><tt> movl 8(%edx), %ebp</tt><tt> # re-load
(%edx)->prev<br>
# in this case reload of ->next is omitted<br>
</tt><tt> movl %esi, 0(%ebp) #</tt><tt>
(%edx)->prev->next = (%edx)->next<br>
</tt><tt> movl $0, 4(%edx) </tt><tt># (%edx)->next
= NULL<br>
</tt><tt> movl $0, 8(%edx) # (%edx)->prev = NULL</tt><tt><br>
<br>
</tt><tt> movl (%ecx), %edx # %edx = </tt><tt>var_table[i]<br>
</tt><tt> leal 4(%edx), %esi</tt><tt> # %esi = (struct
exec node *)%edx<br>
</tt><tt> testl %edx, %edx # NULL check</tt><tt><br>
</tt><tt> movl $0, %edx </tt><tt><br>
</tt><tt> cmove %edx, %esi</tt><tt><br>
</tt><tt> addl $4, %ecx</tt><tt> # (i++)<br>
</tt><tt> cmpl 28(%esp), %ecx</tt><tt> # (i < num_variables?)<br>
</tt><tt> movl %edi, (%esi) # var_table[i]->next =
ir->head</tt><tt><br>
</tt><tt> movl %eax, 4(%esi)</tt><tt> #
var_table[i]->prev = & ir->head<br>
</tt><tt> movl %esi, 4(%edi) #</tt><tt>
ir->head->prev = var_table[i]<br>
</tt><tt> jne .L74</tt><tt><br>
</tt><tt> movl %esi, (%eax) # ir->head =
var_table[i]</tt><br>
</blockquote>
<br>
Note that it doesn't update ir->head in memory until right at the
end of the loop (i.e. it collapses all writes to ir->head into a
single write). The current value of ir->head is kept in %edi, and
is neither stored in nor re-loaded from memory during the loop. This
is the problem: the line that sets "(%edx)->prev->next =
(%edx)->next" can in fact store to ir->head (which is an
aliasing violation), so it does need to be reloaded after that line
(and stored before it).<br>
<br>
Davin<br>
<br>
<br>
</body>
</html>