[pulseaudio-discuss] Best Case Latency

Thu Oct 17 05:04:12 PDT 2013

Sorry for jumping right in without having read the complete thread, but
anyway...

On 10/07/2013 11:23 AM, Patrick Shirkey wrote:
> On Mon, October 7, 2013 7:41 pm, Tanu Kaskinen wrote:
>> Ok, so this happens when the native protocol tries to send a block of
>> audio from the jack source thread to the main thread. My guess is that
>> the main thread is not getting enough CPU time to deal with all the
>> incoming audio (it's not RT scheduled).
>>
>> This could very well cause growing latency. If the jack source pushes
>> audio blocks to the message queue faster than the main thread reads
>> those blocks, then that message queue becomes a significant (and
>> constantly growing?) audio buffer.

Constantly growing does not sound right. Over time, there should be
enough CPU to run all threads (RT and non-RT); if there is not, we are
short of CPU in total, which is not a scheduling priority problem.
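
To illustrate the distinction, here is a toy model (my own sketch, not
PulseAudio code; simulate_backlog and the per-tick numbers are made up
for illustration). A reader that is starved for a few ticks but has
enough capacity on average keeps the backlog bounded, while a reader
that is short of capacity in total lets it grow without limit.

```python
def simulate_backlog(produced_per_tick, consumed_per_tick, ticks):
    """Return the queue depth after each tick.

    Both rate lists are cycled, so [0, 0, 0, 4] means the reader only
    gets to run on every fourth tick and can then drain 4 blocks.
    """
    depth = 0
    history = []
    for tick in range(ticks):
        depth += produced_per_tick[tick % len(produced_per_tick)]
        capacity = consumed_per_tick[tick % len(consumed_per_tick)]
        depth -= min(depth, capacity)
        history.append(depth)
    return history


# Writer pushes one block per tick in both cases.
# Case 1: reader is delayed but catches up fully (4 blocks per 4 ticks).
bursty = simulate_backlog([1], [0, 0, 0, 4], ticks=12)
# Case 2: reader can only ever handle 3 blocks per 4 ticks.
starved = simulate_backlog([1], [0, 0, 0, 3], ticks=12)

print("bursty :", bursty)
print("starved:", starved)
```

In the first case the depth returns to zero every fourth tick; in the
second it gains one block per cycle, which is the "too short of CPU in
total" situation.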

>> What to do? I don't know. Currently there's no upper limit for how long
>> the message queue can grow (so this is effectively also a memory leak
>> problem), and no way to query the queue length. 

In practice, when you start a recording stream to an app and that app
hangs, you quite soon start to get either "Pool full" or "Failed to push
data into queue" errors. I don't remember which of the two it was, and
I haven't investigated why, but I don't think we leak memory without
bound.
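
As a rough picture of that kind of behaviour, here is a minimal sketch
of a queue with a hard cap that refuses pushes once it is full. This is
an illustration only: BoundedBlockQueue, max_blocks and the block size
are hypothetical names and numbers, and I am not claiming this is how
the queue or the memory pool is implemented in PA.

```python
from collections import deque


class BoundedBlockQueue:
    """Hypothetical fixed-capacity queue (not PulseAudio's implementation)."""

    def __init__(self, max_blocks):
        self.max_blocks = max_blocks
        self._blocks = deque()
        self.rejected = 0  # number of pushes refused because the queue was full

    def push(self, block):
        """Append a block; return False instead of growing past the cap."""
        if len(self._blocks) >= self.max_blocks:
            self.rejected += 1
            return False
        self._blocks.append(block)
        return True

    def pop(self):
        """Remove and return the oldest block, or None if the queue is empty."""
        return self._blocks.popleft() if self._blocks else None

    def __len__(self):
        return len(self._blocks)


# The reader has hung, so nothing is popped while the writer keeps pushing.
queue = BoundedBlockQueue(max_blocks=4)
results = [queue.push(b"\x00" * 256) for _ in range(6)]
print(results, len(queue), queue.rejected)
```

With a cap like this the failure shows up as an error on the writer
side rather than as unbounded memory growth.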

>> Even if there was a
>> maximum length for the queue, what should be done if that maximum length
>> is exceeded? Perhaps the native protocol should kill the stream that is
>> not able to keep up with the source.
>>
> 
> There are two options that JACK allows for
> 
> 1) Kick the offender from the graph
> 2) Allow dropouts in the stream (softmode)
> 
> I suppose in this case PA is trying to stop dropouts from happening which
> is a noble cause because most of the audio flowing through PA is probably
> fine with a variable buffer.
> 
> From the user perspective combining JACK and PA together to provide low
> latency high performance audio I'm not sure if anything should be done to
> change the existing system. However from the developer perspective this
> leans in favour of adding Bluez support to JACK directly so that systems
> that explicitly require low latency with hard time limits can keep the
> bluetooth stream running while JACK is also running. I'm not sure how that
> affects managing policy with Murphy.
> 
> There are a couple of things that have been identified in this test process.
> 
> - PA Stream Buffer adds approx 10ms latency to the stream at 64/48000/2
> - PA main loop handles the audio stream in a way that allows the buffer to
> grow causing variable latency on systems that cannot keep up.
> - I should try to find out why realtime is not working for PA+JACK on my
> test system.
> 
> It seems there are a number of issues to figure out in terms of supporting
> the combination of JACK + PA.
> 
> Given that the PA Stream buffer adds 10ms latency and there are cases
> where 20ms is the max time available to the entire audio graph is it
> viable that we should try to make PA + JACK more efficient?
> 
> For example enabling apps to bypass the Stream Buffer while JACK is running.
> 
> Is it a productive use of resources or is it better to recommend to
> developers to add direct support to their audio system for routing audio
> through JACK when JACK is running? That also means adding logic to JACK to
> deal with Bluez and Murphy which seems like a double up of effort when it
> is already being handled in PA.
> 
> If we go with the combination can PA manage the Bluetooth stream and
> Murphy requests while JACK is running?
> 
> I'm not advocating one way or another although I lean in favour of
> combining PA+JACK rather than extending JACK.

I would lean that way too, because optimising PA would have a nice
side effect for people using just PA as well.
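
To put the figures quoted above in perspective, here is the arithmetic
as a small sketch. I am assuming that "64/48000/2" means 64 frames per
period, 48000 Hz and 2 periods per buffer; the 10 ms and 20 ms values
are taken from Patrick's mail as they stand, not measured by me.

```python
def period_ms(frames, rate_hz):
    """Duration of one period in milliseconds."""
    return 1000.0 * frames / rate_hz


# Assumed reading of "64/48000/2": frames per period, sample rate, periods.
frames, rate_hz, periods = 64, 48000, 2

one_period = period_ms(frames, rate_hz)    # duration of a single period
jack_buffer = one_period * periods         # nominal buffer of all periods

pa_stream_ms = 10.0   # reported extra latency of the PA stream buffer
budget_ms = 20.0      # reported total budget for the audio graph

print("one period       : %.2f ms" % one_period)
print("all periods      : %.2f ms" % jack_buffer)
print("PA buffer/period : %.1f periods" % (pa_stream_ms / one_period))
print("share of budget  : %.0f%%" % (100.0 * pa_stream_ms / budget_ms))
```

Under that reading, the stream buffer alone is worth several periods
and takes half of a 20 ms budget.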

JACK is designed and optimised for low latency; PA is designed and
optimised for high latency. That's the basic problem in a nutshell.

But we want PA to work well in low latency scenarios too (e.g. VoIP and
gaming), without consuming too much CPU. Optimising PA for low latency
will help both the PA+JACK combination and PA-only scenarios.

So, I'm interested in reducing PA's CPU consumption under low latency
operation, which I see as the biggest problem right now, especially on
embedded/mobile platforms (which Ubuntu is indeed targeting).

When I ran some profiling on this a while ago, I came up with a few
patches (which are in 4.0) that help somewhat. E.g., there is now
support for setting maxlength as a way to say "I prefer underruns over
increased latency". (That code is quite new and I don't know how well
tested it is in the real world though, so feel free to test.)
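
If you want to try it: maxlength in the buffer attributes
(pa_buffer_attr) is given in bytes, so a latency cap has to be
converted first. Below is a minimal sketch of that conversion. The
helper latency_to_bytes is my own name for illustration, and the S16
stereo format and the 20 ms cap are just example assumptions; in a C
client, pa_usec_to_bytes() does the same job for a given sample spec.

```python
def latency_to_bytes(latency_ms, rate_hz, channels, bytes_per_sample):
    """Convert a latency cap in milliseconds to a buffer size in bytes.

    The result is rounded down to a whole number of frames.
    """
    frames = rate_hz * latency_ms // 1000
    return frames * channels * bytes_per_sample


# Example assumption: 20 ms cap, 48 kHz, stereo, 16-bit samples.
maxlength = latency_to_bytes(latency_ms=20, rate_hz=48000,
                             channels=2, bytes_per_sample=2)
print("maxlength for a 20 ms cap:", maxlength, "bytes")
```

The value would then go into the maxlength field when the stream is
created; how the server treats it for a given stream type is something
to check against the 4.0 code rather than take from this sketch.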

...and there is still a lot of work to do, including the
all-data-goes-through-the-main-thread issue...

-- 
David Henningsson, Canonical Ltd.
https://launchpad.net/~diwic