[Mesa-dev] [Mesa-stable] [PATCH] util/queue: fix a race condition in the fence code

Gustaw Smolarczyk wielkiegie at gmail.com
Thu Sep 28 18:31:50 UTC 2017


2017-09-28 20:21 GMT+02:00 Nicolai Hähnle <nhaehnle at gmail.com>:
> On 28.09.2017 19:18, Gustaw Smolarczyk wrote:
>>
>> 2017-09-28 18:52 GMT+02:00 Marek Olšák <maraeo at gmail.com>:
>>>
>>> A clearer comment would be: "Don't destroy the fence when it's in the
>>> middle of util_queue_fence_signal (signalled but not unlocked yet
>>> because util_queue_fence_is_signalled doesn't lock). Instead, wait
>>> until util_queue_fence_signal returns and then destroy it."
>
>
> Sure, I can change that.
>
> [snip]
>
>> What about the ones that wait in util_queue_fence_wait? I think it's
>> possible for the following:
>>
>> -----------------------
>>
>>           util_queue_fence_wait (begin)
>>
>>           ...
>>
>>           cnd_wait(&fence->cond, &fence->mutex);
>> util_queue_fence_signal (begin)
>> mtx_lock(&fence->mutex);
>> fence->signalled = true;
>> cnd_broadcast(&fence->cond);
>> mtx_unlock(&fence->mutex);
>> util_queue_fence_signal (end)
>>                                          util_queue_fence_is_signalled
>>                                          util_queue_fence_destroy (begin)
>>                                          mtx_lock(&fence->mutex);
>>                                          mtx_unlock(&fence->mutex);
>>                                          cnd_destroy(&fence->cond);
>>                                          mtx_destroy(&fence->mutex);
>>                                          util_queue_fence_destroy (end)
>>
>>           mtx_unlock(&fence->mutex); <--- use after free
>>
>>           util_queue_fence_wait (end)
>> -----------------------
>>
>> Unless it is defined that the race between mtx_unlock and mtx_destroy
>> is automagically fixed (like mtx_destroy performs a lock operation
>> inside of itself; pthread_mutex_* don't do that IIUC).
>
>
> If I'm understanding you correctly, there are three threads in play here.
> The question is: In your example, what is the contract between the different
> threads?
>
> The thread which calls util_queue_fence_destroy is responsible for ensuring
> that no other thread accesses the fence anymore.
>
> The contract between the util_queue and its users is that after signaling
> the fence of a job, the queue will not touch that job anymore. So the user
> of util_queue can destroy the job (and hence its fence) after the fence is
> signaled.
>
> I'm pretty sure your example cannot possibly make sense. You have to have
> some synchronization external to u_queue.c between the thread running
> util_queue_fence_wait and the thread running util_queue_fence_destroy to
> ensure that this doesn't happen.

Right. I was just taking a look without considering the usage. If the
thread that destroys the fence takes responsibility for it to never be
waited upon ever again (for example by being the only thread that ever
waits for it and stopping doing that after the destroy) then I believe
no problematic schedule could happen that would still exercise the
use-after-free issue. I see now that only the race between
wait+destroy and signal should be considered and this patch fixes it.

FWIW,
Reviewed-by: Gustaw Smolarczyk <wielkiegie at gmail.com>

Regards,
Gustaw

> Cheers,
> Nicolai
>
>>
>> Regards,
>> Gustaw
>>
>
>
> --
> Lerne, wie die Welt wirklich ist,
> Aber vergiss niemals, wie sie sein sollte.


More information about the mesa-dev mailing list