[Nouveau] [PATCH 2/2] nouveau: Do not add most bo's to the global bo list.
Patrick Baggett
baggett.patrick at gmail.com
Wed Feb 25 08:28:22 PST 2015
So you're saying a compiler can optimize:
>
> - statement with memory access
> - read memory barrier
> - statement with memory access
>
> To drop the second statement? I would worry about your definition of
> memory barrier then..
>
This is tricky, but I think you're mixing up the general case with the
actual code you have. In general, you are pointing to this example:
- Read memory location 1 (asm level)
- RMB
- Read memory location 2 (asm level)
In this case, you are precisely correct, the compiler will not reorder the
reads, and the CPU won't either.
What you actually have is this:
- Access memory location 1 (C code)
- RMB
- Acesss memory location 1 (C code)
I say "access" here because looking at the C code, not assembly. You have
this:
read nvbo->head.next
RMB
read nvbo->head.next
This only works if you can guarantee that the compiler will actually
generate a MOV instruction for the second read. If the variable isn't
marked as "volatile", the compiler has no requirement to. Barriers control
ordering and when a variable can be loaded, but you've already loaded it.
You need to reload the variable, but you haven't shown the compiler a
reason why it should. "Volatile" is a reason why.
I tested this code on gcc:
if(globalFlag) {
math...
if(globalFlag) {
more math
}
}
It only tests once. If you replace "math" with printf(), it retests the
value of globalFlag. Printf(), which has absolutely no barrier semantics at
all. In other words, GCC conservatively will assume any function call will
require a reload because it could modify anything. The code will work as-is.
Obviously, pthread_mutex_lock() does not modify the value of
`nvbo->head.next`. Once GCC knows that (and maybe other compilers already
do), perhaps with LTO, it doesn't have to generate a second load. If you
ever switch to an inlined lock or different compiler that understands that,
your code will break. I suggest you simply do the right thing the first
time and cast the pointer to volatile since you're treating the value as if
it is volatile (i.e. changing outside of the scope of the compiler's
knowledge).
Patrick
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/nouveau/attachments/20150225/83c32418/attachment.html>
More information about the Nouveau
mailing list