[pulseaudio-discuss] And now, something is wrong with the flist implementation...

Jyri Sarha oku at iki.fi
Sun Mar 4 12:12:03 PST 2012

On Sun, 04 Mar 2012 06:05:46 +0100, David Henningsson
<david.henningsson at canonical.com> wrote:
> On 03/02/2012 04:45 PM, Tanu Kaskinen wrote:
>> On Fri, 2012-03-02 at 13:49 +0100, David Henningsson wrote:
>>> Instead of trying to verify the algorithm, I went to Google to look
>>> a reference implementation to compare against, and quickly found [1].
>>> And indeed our flist looks like the one under the section "Naive
>>> lock-free stack which suffers from ABA problem." on that page. :-/
>>> What's worse, there does not seem to be an easy fix.
> Have you ever woken up in the middle of the night with an idea of how to

> solve the problem almost clear in your mind? The wikipedia page said 
> "tag bits" was a common workaround, using the low bits of the pointer. I

> initially rejected the idea as I didn't think it would be portable 
> enough, but now I realise that the pointers are just indices to a table.
> So I've written a patch which I post separately, please review. I've 
> been running my own test case for a while and seems to have succeeded so

> far.
> @Eric Casteleijn, as you seem to be the only one that can reproduce this

> problem somewhat reliably, would you mind trying my patch for a few days

> and see if it resolves the problem? If so, I'll prepare a PPA for you to

> use for testing.

Hi David,
I wrote probably something similar (patch is attached). It would be
interesting to compare our patches, but I could not find your patch 
from the mailing list yet.

It is good to keep in mind that at least my version of the tag
approach is not guaranteed to avoid the ABA-problem, it only makes it
very unlikely. In 64bit system one may argue that it is impossible for
one thread to remain asleep while the one element it was trying to
manipulate is poped and pushed back to linked list exactly 4294967296
times and then, while the same element is still on top of the list,
the thread is scheduled again and it fall for the ABA-problem. But what
about 32bit system, can we draw the same conclusion there? I would say
the above scenario is still virtually impossible, even if you change the
number 4294967296 to 65536.

Sorry for the mess I caused. At the time of making the ABA-suffering
flist-implementation I was desperately looking for a fix to a notorious
Nokia N900 "diesel engine"-bug* and when I found the fix I was just happy
that the flist-test passed went on to the next problem.

* The sound resembling a diesel engine came when a memblock in a free list
was swapped out. While PA was waiting for the memblock to be accessible
again the alsa-device kept on playing the same 20ms DMA buffer content
and over again.

Best regards,

ps. I have not had time to test the patch on 32bit system yet.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-flist-Avoid-ABA-problem-by-using-transaction-tag-in-.patch
Type: text/x-c
Size: 7656 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/pulseaudio-discuss/attachments/20120304/33eedca5/attachment.bin>

More information about the pulseaudio-discuss mailing list