[pulseaudio-discuss] alsa sink latency - how to account for startup delay

Wed Mar 23 13:59:13 UTC 2016

On 23.03.2016 14:00, Tanu Kaskinen wrote:
> On Wed, 2016-03-23 at 13:09 +0100, Georg Chini wrote:
>> On 23.03.2016 12:26, Tanu Kaskinen wrote:
>>> On Wed, 2016-03-23 at 11:34 +0100, Georg Chini wrote:
>>>> On 23.03.2016 11:04, Tanu Kaskinen wrote:
>>>>> On Tue, 2016-03-22 at 12:57 +0100, Georg Chini wrote:
>>>>>> Look at the code of alsa-sink. It never drops samples. The only way to
>>>>>> compensate for the startup delay would be to drop audio as long as the
>>>>>> sink is not yet playing, but that is not done. I could try to implement
>>>>>> that however and then you would be right, but with the current code at
>>>>>> least for the alsa-sink the startup delay will persist.
>>>>> The sink isn't responsible for dropping samples in any case. The
>>>>> connection between the start delay and the runtime latency just doesn't
>>>>> exist at the sink level.
>>>> Again, sorry, but you are wrong. The startup delay does not vanish
>>>> magically. I can only point you to the code. I have been working with it
>>>> for a couple of months now and I know what I am saying. What happens
>>>> for USB devices with timer based scheduling is the following:
>>>>
>>>> 1) First audio data is written to the card
>>> Ok, now the buffer is full.
>> It looks like it isn't, see below.
> Did you misunderstand what I meant with "buffer" here? I could have
> been more clear: I meant that at this point we have filled the alsa
> ring buffer up to the configured latency amount of data. If there are
> other buffers after the alsa ring buffer - and there certainly are,
> although 10 ms sounds excessive - then those buffers are not full when
> we have just written our first chunk of audio.

Yes, it looks like there are other buffers that need to be filled before
the card starts playing. And you are right, I did misunderstand: for me,
the "buffer" referred to all the buffers that need to be filled before
playback starts.

>
>>>> 2) snd_pcm_start() is called
>>>> 3) More data is written to the card
>>> How is this possible? If the hardware hasn't consumed any audio yet, we
>>> won't write anything, because the buffer is full. If the hardware has
>>> consumed something, but nothing has come out of the speakers yet, then
>>> the hardware has the consumed audio buffered somewhere.
>> I don't know how this is possible, but that is what happens. It looks
>> like the buffer isn't full after the first write. mmap_write() returns
>> true a couple of times before anything is actually played and data
>> _is_ added to the buffer.
>>
>>>> 4) The reported delay of the card goes up exactly by the amount that was
>>>> written
>>>> 5) This repeats a couple of times
>>> If nothing is still coming out of speakers, this indicates that the
>>> hardware reports the latency wrong, or our smoother is messing up the
>>> latency reports. AFAIK, if the hardware has a 10 ms buffer, that should
>>> be always included in what snd_pcm_delay reports. If it reports
>>> something less, it's a driver bug.
>> I don't think there are any (more than usual) wrong reports - these are
>> values that are clearly visible on the oscilloscope, and correcting the
>> latency by the amount (by checking when the condition explained below
>> is true) leads to the correct latency. You cannot deny something that is
>> measurable just because it should not be that way.
>>
>> I am talking about the delay reported by pa_alsa_safe_delay(). The delay
>> reported by this call does exactly match u->write_count during the first
>> milliseconds, which means that the alsa driver is buffering, but not
>> playing. Playback starts at the moment when u->write_count becomes larger
>> than the delay.
>>
>> Again - check the code. It does not care at all if the card is really
>> playing, it is just pushing audio to it. It probably would get stuck at
>> some point because alsa cannot buffer infinitely, but for the startup
>> delays that the devices have, it still works.
> I did read the code before sending the previous reply. mmap_write()
> limits the write amount so that we never make the ring buffer fill
> level exceed the configured latency. If the configured latency is 5 ms,
> it's impossible (assuming no bugs) that we would write another 5 ms, if
> the hardware hasn't consumed the previous data from the ring buffer
> yet.
It looks like the driver is fetching data from the ring buffer even
if the device is not yet playing and is buffering the audio elsewhere.
The delay is reported by pa_alsa_safe_delay(), so the audio _is_
already somewhere in the alsa driver and no longer in pulseaudio.

But from the perspective of pulseaudio it means that you have some
startup delay that is currently reported nowhere and which persists.
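
To make the "buffered elsewhere" part concrete, here is a rough sketch,
assuming the delay reported by the driver covers both the ring buffer and
whatever sits behind it. The helper name and its parameters are made up for
illustration; this is not PulseAudio or ALSA API, and all values are in
frames.

```c
#include <stdint.h>

/* Hypothetical helper, not part of PulseAudio or ALSA.
 * buffer_frames: size of the ALSA ring buffer
 * avail_frames:  free space in the ring buffer (what snd_pcm_avail()
 *                would report for a playback stream)
 * delay_frames:  total delay reported by the driver
 * Returns the number of frames the driver holds outside the ring buffer. */
static int64_t frames_beyond_ring_buffer(int64_t buffer_frames,
                                         int64_t avail_frames,
                                         int64_t delay_frames) {
    /* Frames still queued in the ring buffer. */
    int64_t fill_frames = buffer_frames - avail_frames;

    /* Anything reported on top of that must be buffered elsewhere. */
    return delay_frames - fill_frames;
}
```

If that value keeps growing while nothing is audible yet, the audio has left
the ring buffer but has not reached the output.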

> You confirmed that pa_alsa_safe_delay() returns values that increase
> whenever we write more data. That indicates that the smoother is not to
> blame. If you subtract the current ring buffer fill level from the
> delay value, the result should be constant. It sounds like it isn't,
> and that's a driver bug.
Not sure if I would call it a bug since the delay is reported correctly
by pa_alsa_safe_delay(). I think pulseaudio should be aware of such
behavior and be able to handle it.
Maybe the USB alsa driver is not the only one which implements some
kind of double buffering.
So what do you think is the best way to work around the problem?
As already said, the moment the card really starts playing can be
detected by checking whether
    u->write_count - delay * frame_size > 0
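
A minimal sketch of that check as a standalone function. I am assuming here
that write_count is counted in bytes and the delay in frames, as the
multiplication by frame_size suggests; the function name is hypothetical.
The comparison is written as "written > delay" instead of a subtraction,
because with an unsigned write_count the difference would wrap around
instead of going negative.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of PulseAudio.
 * write_count:  total number of bytes written to the device so far
 * delay_frames: delay reported by the driver, in frames
 * frame_size:   bytes per frame
 * Returns true once more data has been written than the driver still
 * reports as pending, i.e. once some audio must have been played. */
static bool playback_has_started(uint64_t write_count,
                                 int64_t delay_frames,
                                 size_t frame_size) {
    /* Assumption: a negative delay report is treated as zero. */
    if (delay_frames < 0)
        delay_frames = 0;

    uint64_t delay_bytes = (uint64_t) delay_frames * frame_size;

    /* Equivalent to write_count - delay * frame_size > 0 without the
     * risk of unsigned underflow. */
    return write_count > delay_bytes;
}
```

While the device is only buffering, write_count and the delay in bytes are
equal and the function returns false.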

BTW, the USB alsa driver has another very weird bug. The reported delay
is corrected some time (between 1 s and 2 s) after the device has started up.
This means that at the beginning the driver reports a delay that is a couple
of ms too small, and then the delay suddenly jumps up to the correct value.
The real (measured) delay does not change, just the reported numbers.
This is not something that only one device or machine is doing; the effect
is visible on all devices and machines I have access to and also on
Alexander Patrakov's system. The effect is less pronounced when the USB
device is run in batch mode, where the correction is somewhere between 1 ms
and 2 ms, but for timer-based scheduling it is more than 2 ms.
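
One way such a jump in the reported numbers could be spotted is sketched
below. This is only an illustration under my own assumptions: the input is
the reported delay in microseconds with the ring buffer fill level already
subtracted, so that it should stay roughly constant in steady state. The
type, the function name and the threshold are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state for watching the reported delay for sudden steps. */
typedef struct {
    bool have_last;     /* false until the first sample has been seen */
    int64_t last_usec;  /* previous sample, in microseconds */
} jump_detector;

/* Feed one sample. Returns the size of the step in microseconds if the
 * value changed by at least threshold_usec since the previous sample,
 * and 0 otherwise (including for the very first sample). */
static int64_t jump_detector_update(jump_detector *d,
                                    int64_t reported_usec,
                                    int64_t threshold_usec) {
    int64_t step = 0;

    if (d->have_last) {
        int64_t diff = reported_usec - d->last_usec;
        int64_t magnitude = diff < 0 ? -diff : diff;

        if (magnitude >= threshold_usec)
            step = diff;
    }

    d->have_last = true;
    d->last_usec = reported_usec;
    return step;
}
```

A positive return value would correspond to the delay jumping up to the
corrected value; small fluctuations below the threshold are ignored.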