[pulseaudio-discuss] Semaphore lockup when using threaded mainloops excessively

Colin Guthrie gmane at colin.guthr.ie
Fri Apr 22 03:56:11 PDT 2011


'Twas brillig, and Daniel Mack at 22/04/11 10:20 did gyre and gimble:
> On Fri, Apr 22, 2011 at 10:58 AM, Colin Guthrie <gmane at colin.guthr.ie> wrote:
>> 'Twas brillig, and Colin Guthrie at 01/04/11 14:25 did gyre and gimble:
>>> I've pushed this to git master now so that more people can test.
>>
>> I've just been revisiting this one after your pa_poll changes.
>>
>> I tried running two at once and both bailed quite quickly with:
>>
>> Connection (23 of 1000) established.
>> Stream error: Too large
>> Aborted
>>
>>
>> Connection (15 of 1000) established.
>> Stream error: Too large
>> Aborted
> 
> That is likely a different issue.

Yeah, it is. I've got a commit in my tree now that reduces the number
of streams to:

#define NSTREAMS ((PA_MAX_INPUTS_PER_SINK/2) - 1)

This isn't totally safe, but it leaves just a little room for running
two tests at the same time and having a couple other streams sneak in
via regular usage. You OK with that?
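
To make the arithmetic concrete, here's a minimal sketch. It assumes
PA_MAX_INPUTS_PER_SINK is 32, which is the per-sink limit discussed in
the quoted text below; the real value lives in the PulseAudio headers and
may differ. spare_slots() is a made-up helper just for illustration, it
is not part of the test or of PulseAudio.

```c
#include <assert.h>

/* Assumption: 32 is the per-sink limit mentioned in this thread;
 * check the PulseAudio sources for the actual value. */
#define PA_MAX_INPUTS_PER_SINK 32

/* The define from the commit described above. */
#define NSTREAMS ((PA_MAX_INPUTS_PER_SINK/2) - 1)

/* Hypothetical helper: how many inputs remain free on one sink when
 * `tests` copies of the stress test each hold NSTREAMS streams. */
static int spare_slots(int tests) {
    assert(tests >= 0);
    return PA_MAX_INPUTS_PER_SINK - tests * NSTREAMS;
}
```

Under that assumption NSTREAMS works out to 15, so spare_slots(2) leaves
2 inputs free for anything else playing on the sink, which is the "little
room" I mean.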

>> Both tests aborted at the same moment. But I ran it again, and both
>> instances happily ran up to >500 connections each.
>>
>>
>> The only (relevant) place I can see this error occurring is in
>> sink-input.c in pa_sink_input_new()
>>
>> But this should only happen when > 32 streams are played on a sink...
>> As each test can use 16 streams, I guess there are times when >32 is
>> possible. Perhaps the test should limit it to 15 streams such that this
>> likelihood is reduced?
>>
>> Anyway, a run up to ~500 is pretty good. Is this more or less fixed now,
>> do you think?
> 
> It is indeed fixed by Lennart in commit 575ba65714 ("memblockq: decode
> unset chunks as NULL chunks again") from last night. We had a little
> session, were able to reproduce it and he fixed it within an hour or
> so :)

Yes, it was that commit that prompted me to reply to this thread
actually! That said, I do not have it applied right now and I still
wasn't able to reproduce the problem even running two connect-stress
tests to the end. Ahh well, I guess it needs more contention or a slower
machine or something (4yr old laptop here tho'!)

Anyway, thanks for confirming it's all fixed now :)

> Another fix for OSX (only) is in my tree now, I'll send a pull request soon.

Cool, no worries :)

Col

-- 

Colin Guthrie
gmane(at)colin.guthr.ie
http://colin.guthr.ie/

Day Job:
  Tribalogic Limited [http://www.tribalogic.net/]
Open Source:
  Mageia Contributor [http://www.mageia.org/]
  PulseAudio Hacker [http://www.pulseaudio.org/]
  Trac Hacker [http://trac.edgewall.org/]



