[pulseaudio-discuss] [PATCH v4] Make module loopback honor requested latency

Sun Feb 8 02:21:52 PST 2015

08.02.2015 13:21, Alexander E. Patrakov wrote:
> 08.02.2015 02:14, Georg Chini wrote:
>> Sorry, but I do not think the smoother is the problem here. I do get
>> quite reliable latency results.
>> The problem is really (if there is a problem at all) the execution time
>> of the code. These are not
>> asynchronously called functions, they wait until they are finished. And
>> that exactly is the problem,
>> sometimes the queue is quite full and so it takes a lot of time until
>> the pa_asyncmsgq_send() returns.
>> The smoother was one of the first things I suspected to be responsible
>> for false reports, but I could
>> not verify it. Since I measure the time for the second call to
>> pa_asyncmsgq_send() the numbers
>> reported look ok.
>> I think there really is some "latency jitter" that cannot be avoided -
>> interrupts that cannot be handled
>> immediately, USB bus in use when the sound card wants to deliver data
>> and you can probably come
>> up with a lot more situations where it is possible that data cannot be
>> delivered on time. You'll never get
>> perfect results when you measure something and so you cannot apply a
>> perfect regulator even
>> if it would be nice in theory.

OK, but I cannot reproduce anything of this kind at home, so cannot 
really argue. Anyway, we could implement some kind of median filter in 
order to suppress this jitter - but this would be an incremental 
improvement, let's first review what we have :)

>>
>>> Unfortunately, this is easy to recommend, but I can't really see how
>>> this can be done.
>>>
>>> The smoother is updated _after_ a successful write to the alsa device
>>> (via traditional UNIX write or via mmap), while the pop callbacks are
>>> executed just before that. So, they are called at the moment when the
>>> influence from a bad rate estimation via the smoother is the greatest.
>>>
>>> Now the suggestions.
>>>
>>> I think that, ideally, for such use cases, it is important to have
>>> timestamped latency snapshots for both sinks and sources in PulseAudio
>>> core. This would mean introducing a new message that gets the latest
>>> reliable latency snapshot (i.e., timestamp according to the wall
>>> clock, send/receive counter, input/output buffer size, and the latency
>>> itself), without any interpolation. If the sink does not implement
>>> this, just fallback (in the generic sink code) to getting the current
>>> latency. Also, because such snapshots for the sink and the source will
>>> not happen at the same moment, you have to deal with it.
>>
>> You can actually try and get both snapshots at the same time. I did this
>> and was quite astonished to
>> find that the results were less reliable this way. I could not figure
>> out why. (You can call
>> get_latency_in_thread() for source and sink from both snapshot functions
>> without crashing pulse,
>> at least when you make sure they are ONLY called from the timer
>> function. Something else seems to
>> call one of the snapshots for whatever reason).

I have read this, but have no ideas. Anyway, latency reports obtained 
this way also rely on the smoother.

>>
>>>
>>> Also, I have a very heretic thought. Namely, that the smoother in the
>>> alsa sink and source may actually be a bad idea and is better removed.
>>> I have not tested this. But it is used only in two places: for
>>> reporting latency (where it confuses your module) and for calculating
>>> the amount of time to sleep
>>
>> As I said, I think the latency deviations I see are real and not
>> artifacts, so there is no confusion.
>>
>>> in the case of timer-based scheduling (where even module-alsa-sink
>>> does not trust the result, i.e. discards it if it is greater than the
>>> non-transformed time interval). And, if I recollect correctly, there
>>> were complaints about it being fooled by batch cards, and they were
>>> cited as one of the reasons not to enable timer-based scheduling on
>>> batch cards. So - maybe, for the purposes of timer based-scheduling we
>>> should just assume the worst case, i.e. the card that is, say, 0.75%
>>> faster than nominal, and use the nominal rate together with the latest
>>> snapshot time in {source,sink}_get_latency()? Basically, the fear is
>>> that the smoother makes a greater mistake in the estimated rate than
>>> just assuming the nominal one. Maybe you can try this suggestion?
>>>
>>
>> For timer based scheduling the regulator works perfect, you would not
>> even need a stop criterion,
>> so why bother?

I think there is some misunderstanding. Let me repeat in a different way.

The smoother works perfectly (both for timer-based scheduling and for 
the needs of your module) on non-batch cards.

But, even for batch cards, where timer-based scheduling is disabled, the 
smoother is active and is actually used for reporting the latency to 
your module. An attempt to use the smoother for timer-based scheduling 
on batch cards has failed. That's why I suspect that it, on batch cards, 
also tells lies to your module.

>>
>>> For Tanu's patch status page: please leave the status of this patch as
>>> unreviewed. The general idea of the patch does not look brilliant, but
>>> it's the best known-working idea that we currently have on the topic,
>>> and I have not reviewed all the fine details.
>>>
>>
>> Well from a practical point of view it does a pretty good job although
>> the idea may not be brilliant.
>> I'm willing to implement your better idea when you come up with it.
>>   Did you ever test it? And compare it to what the current
>> module-loopback does?
>
> I did not test it, will do it now and add some logging in order to
> verify what you said above. And hopefully will try to implement an
> alternative latency-snapshotting implementation, just to compare.
>

I can confirm (based on a reimplementation attempt) that the code after 
patching deals with the capture and playback timestamp difference 100% 
correctly - so it cannot be the problem. Just a minor nitpick: I moved 
saving of the timestamp to the message handlers. For me, this makes no 
difference, though. The patch (to be applied on top of yours) is 
attached. Could you please confirm or disprove that it makes no 
difference in your setup, either?

So, the current status of the patch, from my viewpoint, is:

1. The patch adds a perfectly correct (assuming no xruns) way to account 
for latency snapshots being made not simultaneously for playback and 
capture. I think that this is the main improvement, and it needs to be 
merged even if we disagree on the rest.

2. The result has an optimal coefficient that relates the observed 
latency difference and the resulting rate correction, assuming the 
currently-implemented way to snapshot the latency and assuming no 
interference from the smoother - which still has to be verified 
independently, possibly after merging.

3. The patch adds buffer_latency_msec, which seems to be an unrelated 
improvement, and I think it should be split out. I have no opinion on 
whether this change should be merged.

4. The patch has a criterion when to stop adjusting rates, and it is a 
source of disagreements. But I could not suggest anything constructive. 
So I think that a good approach would be to split it out and let others 
comment. Also, it would be a good idea to add a debugging message so 
that we can see when it happens.

If you want, I can do the splitting for you.

-- 
Alexander E. Patrakov
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Moved-timestamp-saving-closer-to-latency-saving.patch
Type: text/x-patch
Size: 4227 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/pulseaudio-discuss/attachments/20150208/ce49d7bd/attachment.bin>