[Bug 750397] CRITICAL: Race condition in GstBus

GStreamer (GNOME Bugzilla) bugzilla at gnome.org
Mon Jun 27 16:04:24 UTC 2016


https://bugzilla.gnome.org/show_bug.cgi?id=750397

--- Comment #45 from Sebastian Dröge (slomo) <slomo at coaxion.net> ---
One thing that this still does not solve without e.g. my GstBus patch is
explained above: control_pending==0, release_wakeup(), control_pending goes to
-1, we don't read/block but instead return TRUE as if reading was successful.
And next raise_wakeup() would also do nothing as control_pending==-1. For
consistent behaviour the release_wakeup() would've had to block though as it
does otherwise.

If you always write() if control_pending<=0 before incrementing it, and always
read() if control_pending<=0 after decrementing this should be solved for the
case of blocking sockets. For Windows this would still not work.


(In reply to Matt Gruenke from comment #32)

> Just as without the patch:
> 
> 0. set->control_pending is initially 0.
> 1. Writer increments set->control_pending
> 2. Reader decrements set->control_pending
> 3. Reader gets the mutex, finds that set->control_pending_locked is
>    still 0, and skips the read.
> 4. Writer gets the mutex, increments set->control_pending_locked, and
>    performs the write.

Not exactly:

3. Reader gets the mutex, --control_pending_locked == -1 (!) so reading is
skipped
4. Writer gets the mutex, control_pending_locked++ == -1 (!) so writing is
skipped

It's however exactly the wrong case I mentioned above for your patch.


Note however that this situation can't happen together with my GstBus patch in
both cases, or in your patch (that is, the combination of them... please attach
complete patches in the future) if you also check for negative numbers.

> > Nobody noticed that GstBus was broken like this for more than 7 years (if my
> > GIT archaeology was correct), and even now that we know the problem it only
> > happens for a few people. I don't think "it works for me" is a good argument
> > in this case :)
> 
> I don't know when it was introduced, but we didn't encounter the bug in
> gstreamer-0.10.25, which we’d been using until last year.  When we upgraded
> to 1.2.4, we noticed it quite easily.

In 0.10, GstBus did not use the lockless stuff.


If I'm not mistaken, the only thing that your patch brings additionally over
mine is that more mutexes are taken, and that the explicit gst_poll_wait()
(that can have a timeout) is replaced by an implicit blocking (that can't have
a timeout) in read_control(). I don't like neither btw, which is why I propose
to get away from the poll stuff here (which does not really bring us anything
anyway once we add the mutex).


What I'm mostly worried about here is that we end up in a situation where we
actually block without any way of interruption, which previously was just an
error and was reported as such (and ignored by GstBus).

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


More information about the gstreamer-bugs mailing list