[Mesa-dev] [PATCH] gallium/os: add os_wait_until_zero
Ilia Mirkin
imirkin at alum.mit.edu
Fri Jun 26 09:45:00 PDT 2015
On Fri, Jun 26, 2015 at 12:40 PM, Marek Olšák <maraeo at gmail.com> wrote:
> If p_atomic_read is fine, then this patch is fine too. So you're
> telling that this should work:
>
> while (p_atomic_read(var));
No, this shouldn't work. I don't believe that anyone ever claimed it
ought to. But perhaps we have different ideas of what "p_atomic_read"
should do. I see it as "atomically read a word from memory in a way
that you can never get partial results". The compiler can still
happily optimize this to
x = p_atomic_read(var) (assuming p_atomic_read isn't an out-of-line function)
while (x);
>
> I wouldn't be concerned about a memory barrier. This is only 1 int, so
> it should make its way into the shared cache eventually.
The barrier is for the compiler to know what's going on, not for any
hardware reasons. I *believe* this is the barrier() macro in the
kernel (mb() is an actual memory barrier issued on the CPU).
So something like
#define barrier() __asm__ __volatile__("": : :"memory")
while (*var) {
barrier();
}
Should have the desired effect. [Or while (p_atomic_read(var)) ... whatever]
>
> Marek
>
>
> On Fri, Jun 26, 2015 at 6:25 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
>> p_atomic_read is fine as-is. The read is guaranteed to be atomic up to
>> a word size on x86 processors. I suspect other cpu's have similar
>> guarantees, and if not, then hopefully have other ops to perform the
>> read atomically.
>>
>> On Fri, Jun 26, 2015 at 12:18 PM, Marek Olšák <maraeo at gmail.com> wrote:
>>> My question was how to fix p_atomic_read? Would the volatile read that
>>> I proposed work?
>>>
>>> Marek
>>>
>>>
>>> On Fri, Jun 26, 2015 at 5:59 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
>>>> The compiler doesn't know that there's another thread. And it won't
>>>> start to assume that there might always be another thread because then
>>>> it could never optimize pointer derefs.
>>>>
>>>> On Fri, Jun 26, 2015 at 11:55 AM, Marek Olšák <maraeo at gmail.com> wrote:
>>>>> I assumed the atomic operation in another thread would act as a
>>>>> barrier in this case. Is that right?
>>>>>
>>>>> Marek
>>>>>
>>>>> On Fri, Jun 26, 2015 at 5:47 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
>>>>>> On Fri, Jun 26, 2015 at 11:33 AM, Marek Olšák <maraeo at gmail.com> wrote:
>>>>>>> I expect the variable will be changed using an atomic operation by the
>>>>>>> CPU, or using a coherent store instruction by the GPU.
>>>>>>>
>>>>>>> If this is wrong and volatile is really required here, then
>>>>>>> p_atomic_read is wrong too. Should we fix it? For example:
>>>>>>>
>>>>>>> #define p_atomic_read(_v) (*(volatile int*)(_v))
>>>>>>>
>>>>>>> Then, os_wait_until_zero can use p_atomic_read.
>>>>>>
>>>>>> The problem is the lack of a memory barrier. A function call to
>>>>>> sched_yield implies a barrier (since the function could well have
>>>>>> changed the memory that var points to), and so it's all fine. But on a
>>>>>> non-PIPE_OS_UNIX os, this becomes
>>>>>>
>>>>>> while (*var);
>>>>>>
>>>>>> Which the compiler would be well within its right to rewrite to
>>>>>>
>>>>>> x = *var;
>>>>>> while (x);
>>>>>>
>>>>>> which wouldn't end well. You need a barrier that tells the compiler
>>>>>> that memory may be invalidated... I believe this maps to either mb()
>>>>>> or barrier() in the linux kernel, and they have some fancy assembly
>>>>>> syntax to tell the compiler "oops, memory may have changed". However
>>>>>> in this case, it may be easier to just mark the var volatile.
>>>>>>
>>>>>>>
>>>>>>> Marek
>>>>>>>
>>>>>>> On Fri, Jun 26, 2015 at 4:48 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
>>>>>>>> On Fri, Jun 26, 2015 at 7:05 AM, Marek Olšák <maraeo at gmail.com> wrote:
>>>>>>>>> From: Marek Olšák <marek.olsak at amd.com>
>>>>>>>>>
>>>>>>>>> This will be used by radeon and amdgpu winsyses.
>>>>>>>>> Copied from the amdgpu winsys.
>>>>>>>>> ---
>>>>>>>>> src/gallium/auxiliary/os/os_time.c | 36 +++++++++++++++++++++++++++++++++++-
>>>>>>>>> src/gallium/auxiliary/os/os_time.h | 10 ++++++++++
>>>>>>>>> 2 files changed, 45 insertions(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/src/gallium/auxiliary/os/os_time.c b/src/gallium/auxiliary/os/os_time.c
>>>>>>>>> index f7e4ca4..63b6879 100644
>>>>>>>>> --- a/src/gallium/auxiliary/os/os_time.c
>>>>>>>>> +++ b/src/gallium/auxiliary/os/os_time.c
>>>>>>>>> @@ -33,11 +33,12 @@
>>>>>>>>> */
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -#include "pipe/p_config.h"
>>>>>>>>> +#include "pipe/p_defines.h"
>>>>>>>>>
>>>>>>>>> #if defined(PIPE_OS_UNIX)
>>>>>>>>> # include <time.h> /* timeval */
>>>>>>>>> # include <sys/time.h> /* timeval */
>>>>>>>>> +# include <sched.h> /* sched_yield */
>>>>>>>>> #elif defined(PIPE_SUBSYSTEM_WINDOWS_USER)
>>>>>>>>> # include <windows.h>
>>>>>>>>> #else
>>>>>>>>> @@ -92,3 +93,36 @@ os_time_sleep(int64_t usecs)
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> #endif
>>>>>>>>> +
>>>>>>>>> +
>>>>>>>>> +bool os_wait_until_zero(int *var, uint64_t timeout)
>>>>>>>>
>>>>>>>> Does this need to be volatile?
>>>>>>>>
>>>>>>>>> +{
>>>>>>>>> + if (!*var)
>>>>>>>>> + return true;
>>>>>>>>> +
>>>>>>>>> + if (!timeout)
>>>>>>>>> + return false;
>>>>>>>>> +
>>>>>>>>> + if (timeout == PIPE_TIMEOUT_INFINITE) {
>>>>>>>>> + while (*var) {
>>>>>>>>> +#if defined(PIPE_OS_UNIX)
>>>>>>>>> + sched_yield();
>>>>>>>>> +#endif
>>>>>>>>> + }
>>>>>>>>> + return true;
>>>>>>>>> + }
>>>>>>>>> + else {
>>>>>>>>> + int64_t start_time = os_time_get_nano();
>>>>>>>>> + int64_t end_time = start_time + timeout;
>>>>>>>>> +
>>>>>>>>> + while (*var) {
>>>>>>>>> + if (os_time_timeout(start_time, end_time, os_time_get_nano()))
>>>>>>>>> + return false;
>>>>>>>>> +
>>>>>>>>> +#if defined(PIPE_OS_UNIX)
>>>>>>>>> + sched_yield();
>>>>>>>>> +#endif
>>>>>>>>> + }
>>>>>>>>> + return true;
>>>>>>>>> + }
>>>>>>>>> +}
>>>>>>>>> diff --git a/src/gallium/auxiliary/os/os_time.h b/src/gallium/auxiliary/os/os_time.h
>>>>>>>>> index 4fab03c..fdc8040 100644
>>>>>>>>> --- a/src/gallium/auxiliary/os/os_time.h
>>>>>>>>> +++ b/src/gallium/auxiliary/os/os_time.h
>>>>>>>>> @@ -94,6 +94,16 @@ os_time_timeout(int64_t start,
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> +/**
>>>>>>>>> + * Wait until the variable at the given memory location is zero.
>>>>>>>>> + *
>>>>>>>>> + * \param var variable
>>>>>>>>> + * \param timeout timeout, can be anything from 0 (no wait) to
>>>>>>>>> + * PIPE_TIME_INFINITE (wait forever)
>>>>>>>>> + * \return true if the variable is zero
>>>>>>>>> + */
>>>>>>>>> +bool os_wait_until_zero(int *var, uint64_t timeout);
>>>>>>>>> +
>>>>>>>>> #ifdef __cplusplus
>>>>>>>>> }
>>>>>>>>> #endif
>>>>>>>>> --
>>>>>>>>> 2.1.0
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> mesa-dev mailing list
>>>>>>>>> mesa-dev at lists.freedesktop.org
>>>>>>>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
More information about the mesa-dev
mailing list