[Bug 750397] CRITICAL: Race condition in GstBus

GStreamer (GNOME Bugzilla) bugzilla at gnome.org
Fri Jun 24 03:48:30 UTC 2016


https://bugzilla.gnome.org/show_bug.cgi?id=750397

--- Comment #29 from Matt Gruenke <mgruenke at tycoint.com> ---
(In reply to Sebastian Dröge (slomo) from comment #28)
> I think except for GstBufferPool (bug #767979), this is all fixed now.

I disagree.  I think your diff solves nothing, as you're acquiring the mutex
too late.  The entire problem comes down to an emergent discrepancy between
set->control_pending and the pipe occupancy.

If a mutex were to solve this problem, you'd need to lock it before
testing/modifying set->control_pending.  And then, what would be the point of
set->control_pending_locked?

But, I'm not sure even that would fix the problem, so long as gstpoll.c
contains the following:

    fcntl (control_sock[0], F_SETFL, O_NONBLOCK);
    fcntl (control_sock[1], F_SETFL, O_NONBLOCK);

Because, even if a mutex is locked before set->control_pending is
tested/modified, is the pipe guaranteed to be atomic?  In other words, if one
thread reads from set->control_read_fd.fd immediately after another thread's
write has returned, is the reader guaranteed to get the byte?  On all platforms
that support socketpair()?  If not, no mutex would completely avoid the need to
make the pipe blocking.

However, if you simply make the control pipe block, then it solves the
inconsistency between the pipe and the atomic counter, and you can even remove
the mutex locking you just added.

Consider the following sequence:

0. set->control_pending is currently 0
1. Writer increments set->control_pending
2. Reader decrements set->control_pending
3. Reader performs a read on set->control_read_fd.fd and blocks
4. Writer writes to set->control_write_fd.fd
5. Reader unblocks and returns

Consistency between set->control_pending and the pipe is maintained.  And
without any mutex locking.
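
The sequence above can be sketched roughly as follows.  This is a stand-alone
illustration of the counter-plus-blocking-socket idea, not the gstpoll.c code:
the Wakeup struct and the wakeup_*() names are invented here, and C11 atomics
stand in for whatever atomic operations the real code uses.  The socketpair is
left in its default blocking mode, which is the whole point.

```c
#include <errno.h>
#include <stdatomic.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stand-in for the relevant parts of a poll set. */
typedef struct {
  atomic_int pending;           /* plays the role of set->control_pending */
  int read_fd;                  /* plays the role of control_read_fd.fd */
  int write_fd;                 /* plays the role of control_write_fd.fd */
} Wakeup;

static int
wakeup_init (Wakeup * w)
{
  int sv[2];

  /* Blocking by default: O_NONBLOCK is deliberately never set. */
  if (socketpair (AF_UNIX, SOCK_STREAM, 0, sv) != 0)
    return -1;
  atomic_init (&w->pending, 0);
  w->read_fd = sv[0];
  w->write_fd = sv[1];
  return 0;
}

/* Writer side (steps 1 and 4): only the 0 -> 1 transition writes a byte. */
static int
wakeup_raise (Wakeup * w)
{
  char c = 'W';

  if (atomic_fetch_add (&w->pending, 1) == 0)
    return write (w->write_fd, &c, 1) == 1 ? 0 : -1;
  return 0;
}

/* Reader side (steps 2, 3 and 5): only the 1 -> 0 transition reads a byte.
 * If the writer has not reached its write() yet, read() waits for it. */
static int
wakeup_release (Wakeup * w)
{
  char c;
  ssize_t n = 0;

  if (atomic_fetch_sub (&w->pending, 1) == 1) {
    do {
      n = read (w->read_fd, &c, 1);
    } while (n == -1 && errno == EINTR);
    return n == 1 ? 0 : -1;
  }
  return 0;
}
```

Every 0 -> 1 transition is paired with exactly one write and every 1 -> 0
transition with exactly one read, so the byte count in the socket cannot drift
away from the counter, regardless of which side gets to the socket first.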

The reason I haven't posted a patch is that I didn't modify gstpoll.c.  For my
purposes, I just reached inside the GstBus, before we ever read from it, and
ensured that O_NONBLOCK has been cleared from its poll instance.

As I said, I've got probably thousands of pipeline years of runtime on this
fix, and probably hundreds of thousands of pipeline setups & teardowns.  Given
how quickly I encountered the original problem (within days and maybe a few
hundred pipeline setups/teardowns), any inadequacies of my fix should've become
apparent by now.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
