[pulseaudio-discuss] And now, something is wrong with the flist implementation...
tanuk at iki.fi
Fri Mar 2 07:45:01 PST 2012
On Fri, 2012-03-02 at 13:49 +0100, David Henningsson wrote:
> I've had my eyes on a couple of bugs filed in Ubuntu, where PulseAudio
> crashes with SIGABRT somewhere inside pa_memblock_* or pa_memblockq_*,
> often after several hours of playback. I now got a core dump from the
> bug reporter (thanks, Eric Casteleijn!) and I started analysing it.
> It turns out that the same memblockq list_items is used as part of two
> memblockq chains, in essence for one of the blocks q->next->prev != q.
> How could this be? Well, they seem to be allocated from a static flist
> pool. I've never looked at flist before, but it claims to be, "A
> multiple-reader multipler-write lock-free free list implementation".
> I had a look at tests/flist-test.c and decided to rewrite the test case
> to make it a bit tougher and to see if I could verify that it never
> handed out the same block twice. The test seemed to pass at first, but
> the ~10th time I ran it, the code hung. All 20 threads accessing the
> flist seemed to have stalled. I attached to the process with gdb, and
> found that l->stored, l->empty, and l->stored->next  were all equal,
> so clearly something is wrong.
> Instead of trying to verify the algorithm, I went to Google to look for
> a reference implementation to compare against, and quickly found .
> And indeed our flist looks like the one under the section "Naive
> lock-free stack which suffers from ABA problem." on that page. :-/
> What's worse, there does not seem to be an easy fix.
> At this point, it seems likely this is the root cause for the problem.
> But I could use some input on how to proceed.
> One trivial idea just to add a mutex around the flist as an immediate
> workaround. It won't make the implementation lock-free, but futexes
> should be pretty fast, and the lock won't be held for long anyway. So
> this might be good enough?
> I'm also noticing that in most cases where we use the flist, is as an
> optimisation to avoid using malloc, essentially pointer recycling. If
> this is the case, and the objects are not usually allocated by one
> thread and freed by another, one could have per-thread flists instead of
> one flist shared by all threads.
One of the most important uses for the flists is the mempool. Allocating
memory from the mempool must be real-time safe, which is why I really
wouldn't like to add locking there. At times I've been thinking how to
split the one big mempool into per-thread mempools, but since the
memblocks regularly are allocated in one thread and freed in another, it
seems very complicated. If a good per-thread mempool solution is found,
I'm all for it.
> Any other ideas?
We could revert back to the original flist implementation by Lennart.
The current implementation was made by Jyri Sarha (added to CC) for
Nokia. The flists were reimplemented because of performance problems;
see the commit message for details.
More information about the pulseaudio-discuss