[pulseaudio-discuss] [PATCH 02/21 v2] loopback: Initialize latency at startup and during source/sink changes

Georg Chini georg at chini.tk
Sat Feb 25 13:01:42 UTC 2017

On 25.02.2017 13:51, Tanu Kaskinen wrote:
> On Sat, 2017-02-25 at 12:13 +0100, Georg Chini wrote:
>> On 24.02.2017 19:19, Tanu Kaskinen wrote:
>>> On Thu, 2017-02-23 at 17:55 +0100, Georg Chini wrote:
>>>> On 23.02.2017 12:17, Tanu Kaskinen wrote:
>>>>> On Sun, 2017-02-19 at 17:15 +0100, Georg Chini wrote:
>>>>>> +static void set_source_output_latency(struct userdata *u, pa_source *source) {
>>>>>> +    pa_usec_t latency, requested_latency;
>>>>>> +
>>>>>> +    requested_latency = u->latency / 3;
>>>>>> +
>>>>>> +    latency = PA_CLAMP(requested_latency , u->min_source_latency, u->max_source_latency);
>>>>>> +    u->configured_source_latency = pa_source_output_set_requested_latency(u->source_output, latency);
>>>>>> +    if (u->configured_source_latency != requested_latency)
>>>>>> +        pa_log_warn("Cannot set requested source latency of %0.2f ms, adjusting to %0.2f ms", (double)requested_latency / PA_USEC_PER_MSEC, (double)u->configured_source_latency / PA_USEC_PER_MSEC);
>>>>>> +}
>>>>>> +
>>>>>>     /* Called from input thread context */
>>>>>>     static void source_output_attach_cb(pa_source_output *o) {
>>>>>>         struct userdata *u;
>>>>>> @@ -435,12 +563,26 @@ static void source_output_moving_cb(pa_source_output *o, pa_source *dest) {
>>>>>>         if ((n = pa_proplist_gets(dest->proplist, PA_PROP_DEVICE_ICON_NAME)))
>>>>>>             pa_sink_input_set_property(u->sink_input, PA_PROP_DEVICE_ICON_NAME, n);
>>>>>> -    if (pa_source_get_state(dest) == PA_SOURCE_SUSPENDED)
>>>>>> -        pa_sink_input_cork(u->sink_input, true);
>>>>>> -    else
>>>>>> +    /* Set latency and calculate latency limits */
>>>>>> +    update_latency_boundaries(u, dest, NULL);
>>>>>> +    set_source_output_latency(u, dest);
>>>>>> +    get_effective_source_latency(u, dest, u->sink_input->sink);
>>>>>> +
>>>>>> +    if (pa_source_get_state(dest) == PA_SOURCE_SUSPENDED) {
>>>>>> +        if (dest->suspend_cause != PA_SUSPEND_IDLE)
>>>>>> +            pa_sink_input_cork(u->sink_input, true);
>>>>>> +    } else
>>>>>>             pa_sink_input_cork(u->sink_input, false);
>>>>> Is the intention to cork the sink input if the source is suspended for
>>>>> a non-idle reason, and uncork it otherwise? This code doesn't do
>>>>> anything if the source is suspended due to being idle, so if the sink
>>>>> input was corked before, it will stay corked, which doesn't make much
>>>>> sense to me, but maybe I'm missing something, because the old code
>>>>> didn't uncork the sink input in this situation either.
>>>> If the source is suspended for idle reason, we know it will start
>>>> when it is needed by the module. So there is no need to cork the
>>>> sink input.
>>> Ok, makes sense. The old code didn't have this check, however. Did you
>>> observe a problem with that?
>> Yes, it makes the calculation of the latency during a switch very
>> difficult because the stream gets interrupted unnecessarily. Otherwise
>> I would not have changed it.
> Ok. As I said earlier, the problem description should be in the commit
> message.
>>>> If the sink input has been corked previously for some other reason,
>>>> we should not uncork it here (for example module-role-cork might
>>>> have done so).
>>> module-role-cork doesn't actually cork the stream, it just sends a cork
>>> request (which module-loopback ignores). The stream can only be corked
>>> if module-loopback corked it by itself earlier. Corking is currently
>>> only allowed for the stream owner, because we don't have the means to
>>> track multiple simultaneous cork requests. If multiple things tried to
>>> control the cork state, there would be conflicts and confusion.
>>> If the loopback was connected to a suspended source before, and
>>> therefore corked the sink input, shouldn't we uncork the sink input if
>>> the loopback moves to a source that is idle-suspended?
>> I can do that as well if you like.
> Doing that seems like the right thing to do. You can also try what
> happens when moving from a user-suspended source to an idle-suspended
> source, and if it somehow happens to work, then add a comment that
> explains why it's not necessary to uncork the sink input.
>>>>>> +        u->output_thread_info.pop_called = true;
>>>>>> +    }
>>>>>> +    u->output_thread_info.first_pop_done = true;
>>>>>>         if (pa_memblockq_peek(u->memblockq, chunk) < 0) {
>>>>>>             pa_log_info("Could not peek into queue");
>>>>>> @@ -478,6 +644,12 @@ static int sink_input_pop_cb(pa_sink_input *i, size_t nbytes, pa_memchunk *chunk
>>>>>>         chunk->length = PA_MIN(chunk->length, nbytes);
>>>>>>         pa_memblockq_drop(u->memblockq, chunk->length);
>>>>>> +    /* If push has not been called yet, assume that the source will deliver one full latency
>>>>>> +     * when it starts pushing. Adjust the memblockq accordingly and ensure that there is
>>>>>> +     * enough data in the queue to avoid underruns. */
>>>>>> +    if (!u->output_thread_info.push_called)
>>>>>> +        memblockq_adjust(u, u->output_thread_info.effective_source_latency, true);
>>>>> This seems unnecessary. Why can't we just let the pop calls drain the
>>>>> memblockq, and add the necessary amount of silence to the memblockq
>>>>> only when we receive the first chunk from the source?
>>>> The whole point is to keep on delivering silence even in the case that
>>>> the source takes a long time to start. So we avoid draining the
>>>> memblockq completely, which would lead to underruns.
>>> Why do you want to avoid underruns during the startup? Just to avoid
>>> unnecessary noise in the log? (That would be a good enough reason for
>>> me.)
>>> If the memblockq becomes empty, silence can still be easily generated.
>>> The silence doesn't have to come from the memblockq.
>> The reason are the ugly underrun messages as already stated
>> above. And why should I generate silence any other way? The
>> memblockq is there and I just push some silence into it.
> Because we might get rid of the effective_source_latency variable, and
> the code would become a bit easier to understand when there's no need
> to think about whether the memblockq has the right amount of silence.
>>>>>> +                time_delta += pa_sink_get_latency_within_thread(u->sink_input->sink);
>>>>>> +
>>>>>> +                /* If the source has overrun, assume that the maximum it should have pushed is
>>>>>> +                 * one full source latency. It may still be possible that the next push also
>>>>>> +                 * contains too much data, then the resulting latency will be wrong. */
>>>>>> +                if (pa_bytes_to_usec(chunk->length, &u->sink_input->sample_spec) > u->output_thread_info.effective_source_latency)
>>>>>> +                    time_delta = PA_CLIP_SUB(time_delta, u->output_thread_info.effective_source_latency);
>>>>>> +                else
>>>>>> +                    time_delta = PA_CLIP_SUB(time_delta, pa_bytes_to_usec(chunk->length, &u->sink_input->sample_spec));
>>>>> It's unclear to me what's happening here. I guess you're substracting
>>>>> chunk->length from time_delta, because when push_cb was called, the
>>>>> chunk was still included in the source latency report (I'm a bit
>>>>> worried that some sources might not do this), but the chunk has also
>>>>> been pushed to the memblockq, so if we don't adjust time_delta, the
>>>>> chunk will be counted twice, and the memblockq adjustment will be
>>>>> wrong.
>>>> Yes, exactly.
>>>>> But I don't really get why you are concerned with "overruns", and how
>>>>> your code mitigates any problems. If you substract only a part of
>>>>> chunk->length, it means that a part of the chunk will be counted twice,
>>>>> resulting in wrong latency.
>>>> No, it means, that the memblockq will be adjusted to the correct length
>>>> because it is adjusted after the data has been pushed.
>>> The correct memblockq length is u->latency - sink latency - source
>>> latency. To me it seems that your adjustment makes us overestimate the
>>> source latency.
>>> The latency report that the source gives is (1) the current chunk
>>> length, plus (2) any buffered data that isn't included in the current
>>> chunk. In adjust_memblockq() we're only interested in (2), which is why
>>> we subtract the chunk length from the reported source latency. If the
>>> chunk was huge for some reason, adjust_memblockq() will in any case
>>> drop excess data.
>>>> If the source
>>>> has overrun and delivered much more data than expected, the excess
>>>> data will be dropped.
>>> If the source latency is overestimated, too much data will be dropped.
>> You are talking about two different things. The source latency is
>> very different from the first chunk of data that is pushed.
> Yes, they are very different things. But adjust_memblockq() needs to
> know the real source latency, it doesn't care about the size of the
> first chunk. It doesn't matter how big the first chunk is, because
> adjust_memblockq() will in any case adjust the memblockq length so that
> the end-to-end latency will match what is configured.
>> When switching quickly back and forth between two sources, it
>> happens, that a source configured to a few ms of latency
>> suddenly pushes 500 ms of data on the first push. I think this
>> is due to the fact that the source is re-configured to its maximum
>> latency when it goes to idle and when switching back to low latency
>> it somehow does not get rid of the excess data soon enough.
>> Remember, the adjustment is only done on the first push, so it is
>> a one-shot thing just to get the queue right. If I don't drop the
>> data, I end up with those 500 ms too much in the memblockq.
> The memblockq length should always be
>      u->latency - real sink latency - real source latency
> and if that is less than the 500 ms that was just pushed to the queue,
> the extra data will be dropped. I don't see how we can end up with too
> long memblockq if adjust_memblockq() gets accurate sink and source
> latency information.

Sorry, but take a look at the code. We CALCULATE how long the memblockq
should be and pass that as a parameter to adjust_memblockq() AND we have
to subtract the first chunk length to get things right. So if the first 
chunk is
too big, the memblockq will be too long, as simple as that.

>>> While reading the code again, I noticed some more issues:
>>>>            case SINK_INPUT_MESSAGE_POST:
>>>> -            pa_sink_input_assert_io_context(u->sink_input);
>>>> +            pa_memblockq_push_align(u->memblockq, chunk);
>>>> +
>>>> +            /* If push has not been called yet, latency adjustments in sink_input_pop_cb()
>>>> +             * are enabled. Disable them on first push and correct the memblockq. Do the
>>>> +             * same if the pop_cb() requested the adjustment */
>>>> +            if (!u->output_thread_info.push_called || u->output_thread_info.pop_adjust) {
>>>> +                pa_usec_t time_delta;
>>>> -            if (PA_SINK_IS_OPENED(u->sink_input->sink->thread_info.state))
>>>> -                pa_memblockq_push_align(u->memblockq, chunk);
>>>> -            else
>>>> -                pa_memblockq_flush_write(u->memblockq, true);
>>>> +                time_delta = PA_PTR_TO_UINT(data);
>>>> +                time_delta += pa_rtclock_now() - offset;
>>>> +                time_delta += pa_sink_get_latency_within_thread(u->sink_input->sink);
>>>> +
>>>> +                /* If the source has overrun, assume that the maximum it should have pushed is
>>>> +                 * one full source latency. It may still be possible that the next push also
>>>> +                 * contains too much data, then the resulting latency will be wrong. */
>>>> +                if (pa_bytes_to_usec(chunk->length, &u->sink_input->sample_spec) > u->output_thread_info.effective_source_latency)
>>>> +                    time_delta = PA_CLIP_SUB(time_delta, u->output_thread_info.effective_source_latency);
>>>> +                else
>>>> +                    time_delta = PA_CLIP_SUB(time_delta, pa_bytes_to_usec(chunk->length, &u->sink_input->sample_spec));
>>>> +
>>>> +                memblockq_adjust(u, time_delta, true);
>>>> +
>>>> +                u->output_thread_info.pop_adjust = false;
>>>> +                u->output_thread_info.push_called = true;
>>>> +            }
>>> This adjustment needs to be postponed until pop has been called. If the
>>> sink is not yet running, the sink latency is undefined.
>> No, it does not. "Undefined" means, that the pa_sink_get_latency_in_thread()
>> call just returns 0. Where is the problem with that?
> 0 doesn't correspond to the real world, that's the problem. It's
> pointless to try to make calculations based on incorrect input.

You are yourself saying I should just use 0 at another place, which does not
correspond to real world either.

>> And in fact, this adjustment is done twice if sink and source are
>> both starting, once for the source and once for the sink. Then the
>> second adjustment does the right thing regardless if sink or source
>> started first.
> The second adjustment does the right thing regardless of whether the
> first adjustment was done or not, so why not skip the first adjustment
> altogether?

Because it might be the case that we are only switching the source OR the
sink and if we are changing both or starting up, we don't know which will
come first.

>>> Also, if the memblockq is too short, memblockq_adjust() will add
>>> silence, but it's written after the valid data, so there will be a gap
>>> in playback. The silence should be added by using
>>> pa_memblockq_rewind(), but it has to be ensured that the memblockq
>>> doesn't still contain some old non-silence data that gets replayed. I'm
>>> not sure how to best do that.
>> This is the only situation where I allow pushing silence after having
>> valid data in the queue. Since this is a one (or two) time event, I think
>> it is acceptable to have a small gap in the stream to fix up the latency.
>> Because it is done in a switching situation, I believe nobody will ever
>> notice the gap.
> Ok, if it's a deliberate decision to have simpler code at the risk of
> glitches, then a FIXME comment would be appropriate.
>>>> +
>>>> +            /* If pop has not been called yet, make sure the latency does not grow too much.
>>>> +             * Don't push any silence here, because we already have new data in the queue */
>>>> +            if (!u->output_thread_info.pop_called)
>>>> +                 memblockq_adjust(u, pa_sink_get_latency_within_thread(->sink_input->sink), false);
>>> As mentioned above, the sink latency is undefined if the sink is not
>>> running yet. Luckily there's no need need to know the sink latency
>>> here: before pop has been called, the only reason why the memblockq
>>> needs to be adjusted in the POST handler is to avoid it becoming huge
>>> and thus consuming a lot of memory. memblockq_adjust() can be called
>>> here with simply zero as the offset parameter.
>> This will keep too much data in the queue if the sink is running.
>> This would not be a big problem, but why not adjust as good as possible in this
>> situation?
> Because there's no benefit in doing that. Using 0 is simpler and
> doesn't involve invoking kernel code to ask for information that might
> be bogus anyway.
Sorry, slowly I am thinking that you don't want the patches at all and 
are nitpicking
to discourage me. If you don't want the patches, just say so.

More information about the pulseaudio-discuss mailing list