[pulseaudio-discuss] alsa sink latency - how to account for startup delay

Wed Mar 30 16:06:36 UTC 2016

On Tue, 2016-03-29 at 20:29 +0200, Georg Chini wrote:
> On 29.03.2016 05:56, Georg Chini wrote:
> > 
> > On 29.03.2016 02:13, Pierre-Louis Bossart wrote:
> > > 
> > > On 3/28/16 12:38 PM, Georg Chini wrote:
> > > > 
> > > > On 28.03.2016 17:18, Georg Chini wrote:
> > > > > 
> > > > > On 28.03.2016 16:16, Pierre-Louis Bossart wrote:
> > > > > > 
> > > > > > On 3/22/16 4:11 AM, Georg Chini wrote:
> > > > > > > 
> > > > > > > Hi,
> > > > > > > 
> > > > > > Ups, looks like I misread your sentence above. You are right, in
> > > > software you can't
> > > > see the output. But what you can see is the time between dispatching 
> > > > the
> > > > audio
> > > > to the USB bus and the time when the bus reports back that the audio 
> > > > was
> > > > played.
> > > > 
> > > > I am not sure if I read the code right, I don't know anything about 
> > > > USB,
> > > > but if the
> > > > URB's you are submitting to the bus in the prepare step are handled in
> > > > chronological
> > > > order, it means that
> > > > a) you have to wait for the next URB to retire before you can send any
> > > > real audio
> > > > b) after you submitted the first audio, it is at the end of the queue
> > > > and all the other
> > > > URB's containing silence will be processed first.
> > > The USB driver will submit N silence URBs on startup in the prepare 
> > > and you will have to wait for those URBs to retire before the samples 
> > > are queued. There is very little 'USB processing'. If you want to 
> > > reduce this delay you have to use smaller periods, it'll decrease the 
> > > size of the URBs. I guess it could be possible to change the URB size 
> > > after the start but that's not implemented atm.
> > > 
> > I don't want to shorten the latency. I only want the latency reported 
> > correctly. To me it still
> > looks like the real latency of the driver is not what it reports, 
> > because the time that the
> > audio spends in the URB's is not taken into account. What I am seeing 
> > is, that the real
> > latency is around 10ms longer than expected.
> OK, I thought about it another time and I believe I figured out what the 
> problem is.
> 
> (Sometimes I feel like I am talking Chinese to you - I seem not to be 
> able to express
> myself in a way that others can understand, sorry for that. I'll try my 
> best now.
> If we would sit in one room it would probably be sorted out in half an 
> hour.)
> 
> The starting point is the observation, that with module-loopback and an 
> USB sink
> the real end-to-end latency is about 10ms longer than configured. This 
> cannot be
> observed with a HDA sink. Let's go through the sequence of events when
> module-loopback is started:

I'm interested in what delay values alsa reports, so I'll write what I
expect the delay to be after each step. One unclear thing is that can
the driver report the delay at higher granularity than one URB (= 5 ms
in this example)? I'll assume here that it can report the latency with
one sample accuracy.

I'll also assume that audio is moved from the ring buffer to the URBs
atomically, so the ring buffer fill level jumps by 5 ms every time a
URB is filled.

> 1) The module sets up a buffer and fills it with silence up to the 
> configured loopback latency.
> Sink and source are configured to 1/3 of the loopback latency. In this 
> example, we will use
> 30ms as loopback latency, so that sink and source are configured to 10ms.

I suppose prepare() has not been called in the sink yet, so currently
the delay is undefined.

> 2) Source and sink are started. For simplicity we assume that the source 
> starts delivering
> samples immediately.

I guess the sink still hasn't called prepare()?

> 3) As long as sink_input_pop() has not been called, module-loopback 
> drops the audio
> coming in from the source.
> 
> 4) On the alsa side, the preparation has been done. Again, to simplify 
> we assume 2 URB's
> with 5ms each have been set up and submitted to the USB bus. The first 
> URB starts
> playing silence to the speakers now.

prepare() has been called. If we assume that the DAC has not yet
consumed anything from the first URB, the "USB bus" latency is 10 ms.
PulseAudio hasn't written anything yet to the ring buffer. Expected
delay: 10 ms.

> 5) The pulseaudio sink code takes the first 10ms of audio out of the 
> loopback buffer,
> writes it to the alsa buffer and calls snd_pcm_start().

If the sink takes something from the loopback buffer, this means that
the first pop() call has been made. Assuming no time has passed since
the previous step, the USB bus is still full, and so is the ring
buffer. Expected delay: 20 ms.

> 6) Now sink_input_pop() is called, and that is where things go wrong, 
> because the
> audio from the source is buffered from now on.

Is this already the second pop() call? That means that at least one URB
has been filled from the ring buffer, otherwise pop() wouldn't have
been called due to the ring buffer being full.

During the pop() call the ring buffer contains 0 ms of audio, if two
URBs have been filled from the ring buffer, or 5 ms, if only one has
been filled.

Expected delay during the pop() call: 0 or 5 ms + whatever the DAC
hasn't yet consumed from the USB bus. Expected delay after the sink has
finished rendering and writing to the ring buffer: 10 ms + whatever the
DAC hasn't yet consumed from the USB bus.

> 7) On the alsa side nothing has happened yet. The reported delay is what 
> is in the alsa
> buffer - 10ms. Alsa has to wait for the first URB to expire before any 
> sample can be
> submitted to the bus.

How can nothing have happened yet on the alsa side? pop() has been
called twice already, which means that the alsa driver has already
consumed audio from the ring buffer. I'd count that as something
happening.

If the first URB hasn't yet expired (ignoring the conflicting
information that pop() has been called twice already), it doesn't make
sense that alsa would report 10 ms delay. The two submitted URBs add 10
ms to the delay, minus everything that the DAC has consumed at this
point, which I assume to be very little. On top of that there's 10 ms
of audio written to the ring buffer. If alsa reports the full delay to
be 10 ms, that's wrong, don't you think? It should report 20 ms (or
something close to that to account for whatever the DAC has consumed).

> 8) Now the first URB has retired and can be filled with audio. After the 
> URB is submitted to
> the bus, the alsa delay is NOT updated and still shows 10ms. The second 
> URB starts playing
> silence for 5ms. Because the buffer seems filled to the correct level, 
> the pulseaudio sink
> code will wait until the fill level gets lower.

When the first URB is retired and filled with audio from the ring
buffer, that's when the ring buffer fill level jumps from 10 ms to 5
ms. You seem to be saying that after this has happened, the alsa sink
doesn't fill up the ring buffer. Why is that? 5 ms is half of the full
buffer, and that should be low enough to trigger a refill.

> 9) The second URB has retired, is filled with audio and submitted to the 
> bus. Still the
> delay is NOT updated, but from now on, it is correct, since a sample 
> that would now be
> submitted to the alsa buffer would take 10 ms to the speakers. The first 
> real audio starts
> playing.

Here you seem to be describing a situation where the ring buffer is
empty, and indeed, in that situation the reported delay should be 10
ms. I just don't understand why the ring buffer would be empty (that
is, why didn't pulseaudio refill the ring buffer after the first URB
was consumed from the ring buffer).

> 10) Meanwhile the source has been producing samples that have been queued in
> the loopback buffer and that make up the extra latency I have been 
> observing. In
> this example it would be 5ms from the second URB plus the time that was left
> from the first URB when snd_pcm_start() was called.
> 
> 11) The first URB-full of audio has been played and the URB retires. Now 
> the alsa
> delay is adjusted to reflect that the first 5ms have been played and the 
> audio flow
> is completely set up.

If the ring buffer was empty in the previous step, and we didn't write
anything after that, then the retirement of an URB now would cause an
underrun, since the retired URB can't be refilled from the ring buffer.

> If the first sink_input_pop() would be called at this point, everything 
> would be fine.
> So the problem seems to be that sink_input_pop() is called too early.

I don't know, your description contained many strange parts. I don't
know what is really happening. But if module-loopback heavily relies on
the configured latency to be the same as the real latency, then that's
going to cause trouble anyway, because they just aren't the same thing,
even if the alsa driver is reporting everything perfectly.

-- 
Tanu