<html> <head> <base href="https://bugs.freedesktop.org/" /> </head> <body> <div> <a class="bz_bug_link bz_status_NEW " title="NEW - PulseAudio gets reliably killed upon a big number of client connections" href="https://bugs.freedesktop.org/show_bug.cgi?id=94629#c7">Comment # 7</a> on <a class="bz_bug_link bz_status_NEW " title="NEW - PulseAudio gets reliably killed upon a big number of client connections" href="https://bugs.freedesktop.org/show_bug.cgi?id=94629">bug 94629</a> from <a class="email" href="mailto:darwish.07@gmail.com" title="Ahmed S. Darwish <darwish.07@gmail.com>"> Ahmed S. Darwish</a>
<pre>After a lot of enlightening discussions with Alex, it seems this is a
well-known problem in Pulse. For completeness of this bug report, here are
the basic points:

1. Linux Audio Conference 2015, "Timing issues in desktop audio playback
   infrastructure", by Alexander
   slides: <a href="http://lac.linuxaudio.org/2015/download/rewind-slides.pdf">http://lac.linuxaudio.org/2015/download/rewind-slides.pdf</a>

   The issue of unsolicited kills is _clearly_ summarized in slide #13
   above:

   "to process (resample, mix, encode) 2000 ms of sound under the limited
   budget of 200ms of real-time. Not easy: on a weak CPU, a cpufreq-governed
   CPU, with software DTS encoder, under valgrined, etc ... Result: KILLED"

   Even more details are in the video conference and paper of the same
   topic here:
   <a href="http://lac.linuxaudio.org/2015/video.php?id=8">http://lac.linuxaudio.org/2015/video.php?id=8</a>

2. A second suggestion is to let PA appropriately program its realtime
   soft limit and install the appropriate SIGXCPU handlers in PA. This way,
   we can be almost sure that the kills are due to exceeding our budget.

   [ This is also the view favored by kernel developers as they don't want
     to pollute the kernel logs much.
     <a href="http://www.gossamer-threads.com/lists/linux/kernel/1513490#1513490">http://www.gossamer-threads.com/lists/linux/kernel/1513490#1513490</a> ]

3. 
A third and final suggestion is to write some abusive clients to
   demonstrate how common the issue is, and that it's not only related to
   the number of connected clients, but to the issue of excessive rewinds
   and abusive clients in general

   "You could write a client that does a lot of rewinds, calls
   pa_stream_write with bad timing (e.g. rewinds 990 ms and writes 1s every
   10 ms) and see whether it explodes :) .. I don't expect it to explode
   with one client, but two may be enough in your case"

==> Raw discussion log:

<patrakov> darwish: hello. the "realtime budget" problem that you reported is actually a known issue for my DTS encoder. There, even one stream is enough on typical hardware if PulseAudio is left with its default of mixing 2 seconds ahead
<darwish> patrakov, hi :-) .. oh, I see
<darwish> patrakov, seems it'll need some deep surgery to solve this while keeping interrupts low
<patrakov> indeed
<patrakov> and in fact I am on the fence whether to remove the low-interrupts feature, as it never worked correctly with processing such as resampling
<patrakov> i.e. 
it may be that we just have to accept the 0.7w hit
<darwish> hmm
<patrakov> please see <a href="http://lac.linuxaudio.org/2015/video.php?id=8">http://lac.linuxaudio.org/2015/video.php?id=8</a> (slides are enough)
<darwish> patrakov, slide #13 summarizes everything really nicely :D
<patrakov> I also encourage you to take a look at CRAS source code - it has some efficient client-to-server communication method, so that the overhead from going down to 28 ms latency is only 0.2w, which is IMHO very tolerable and makes rewinds (which, together with speculative mixing ahead, are responsible for eating the realtime budget in your case) unneeded
<darwish> hmm
<patrakov> basically the current 2000 ms default for the tsched buffer is based on the assumption that mixing is cheap, and that mixing 2000 ms of audio should eat no more than 200 ms anyway
<darwish> patrakov, unless a high amount of clients connect, leading to excessive rewinds ..
<patrakov> which is false if the CPU is slowed down by the cpufreq framework - it just doesn't see enough load to bump the frequency
<darwish> patrakov, btw thanks a lot! I finally understood the concept of rewinding from your slides :D
[...]
<darwish> hmmm .. "CRAS doesn’t have any of the discussed workarounds"
<patrakov> what was meant is: "CRAS doesn't have any of the discussed workarounds and still works fine on hardware found in Chromebooks"
<patrakov> no rewinds = no need to guess how much it is possible to rewind, no need to deal with non-rewindable ALSA plugins, no need to write a rewindable resampler, no correctness issues, at the cost of 0.2w of extra power consumed (and if we assume that Chrome is the only possible client, then that's 0.0w)
<patrakov> because Chrome never actually uses high latency
<darwish> just found some slides by the CRAS folks here .. they also compare themselves with PA: <a href="http://goo.gl/zdmNu4">http://goo.gl/zdmNu4</a>
<patrakov> they indeed share a lot of ideas
[...]
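
[ A rough sketch of the budget arithmetic from the discussion above, in
  Python. Only the 2000 ms tsched buffer, the 200 ms realtime budget and
  the 28 ms latency figure come from the discussion; the function names
  and the processing-speed figures (20x and 4x realtime) are hypothetical
  and chosen purely for illustration. ]

```python
# Back-of-the-envelope check of the realtime budget discussed above.
# Assumption: "speed" means ms of audio processed per ms of CPU time.
# The speeds used in the demo (20 and 4) are made-up illustrative values.

def min_speed_needed(buffer_ms, budget_ms):
    """Speed required to process buffer_ms of audio within budget_ms."""
    return buffer_ms / budget_ms

def cpu_ms_needed(buffer_ms, speed):
    """CPU time needed to process buffer_ms of audio at the given speed."""
    return buffer_ms / speed

def fits_budget(buffer_ms, budget_ms, speed):
    """True if the whole buffer can be processed within the budget."""
    return budget_ms >= cpu_ms_needed(buffer_ms, speed)

if __name__ == "__main__":
    print(min_speed_needed(2000, 200))  # 10.0
    print(fits_budget(2000, 200, 20))   # True
    print(fits_budget(2000, 200, 4))    # False
    print(fits_budget(28, 200, 4))      # True
```

[ Under these assumptions, a 2000 ms buffer only fits the 200 ms budget if
  processing runs at 10x realtime or better, while a 28 ms buffer fits
  even at the slower hypothetical speed. ]
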
<darwish> for completeness I'll add excerpts from the discussion above to the bug report + links to your slides and video conference
<patrakov> basically, I want you to actually write a client that does a lot of rewinds, calls pa_stream_write with bad timing (e.g. rewinds 990 ms and writes 1s every 10 ms) and see whether it explodes :)
<patrakov> I don't expect it to explode with one client, but two may be enough in your case
<darwish> that client would be a nice discussion entry point :-)
<darwish> I'm now working on some patches for the kernel to inform us when it kills PA.. will develop that client, and hopefully see how to fix this, afterwards
<patrakov> why do we need those patches?
<patrakov> doesn't the kernel already send SIGXCPU when the soft-limit is exceeded?
<patrakov> shouldn't we just set the soft limit correctly in PulseAudio?
<darwish> it does .. that was the argument too from tglx
<darwish> patrakov, <a href="http://www.serverphorums.com/read.php?12,450582">http://www.serverphorums.com/read.php?12,450582</a>
<patrakov> oh, ok
<darwish> patrakov, but yeah .. I've asked myself too if it's better to just appropriately handle SIGXCPU
<darwish> so I'm not sure if the kernel devs will accept the patch, honestly
<patrakov> on the other hand, can we handle SIGXCPU properly in the case when the CPU hog is a DTS encoder? It is not really "actionable" upon, other than logging a message.
<patrakov> you can set a flag that says "stop further mixing", but it is useless if we are DTS-encoding, not mixing
<darwish> I can at least log a message in PulseAudio .. so when a user submits a bug report with PA killed, and see that message, we are 99% sure we've just exceeded our limits
<patrakov> Fair enough
<darwish> and in that case we won't need the kernel patch I guess ..
[...]
<darwish> OK I'll go and have some lunch now (and watch the linux audio conference video in the process ;-)) .. 
thanks a lot for this discussion, I've learned a lot :-)</pre> </div> </body> </html>