[pulseaudio-discuss] Crackling audio with Pulseaudio 4.0 and the simple Pulse API.

Fri Jul 12 04:44:44 PDT 2013

On Thu, 2013-07-11 at 21:54 +0600, Alexander E. Patrakov wrote:
> 2013/7/11 Tanu Kaskinen <tanu.kaskinen at linux.intel.com>:
> > You don't understand me, and I don't understand you... What is the
> > result of the boundary being off? Does audio get skipped, duplicated, or
> > is there no error at all?
> 
> No duration-related error at all in my understanding, see below where
> it differs from yours.
> 
> > I'll try to clarify this (also to myself) with an example. Let's have a
> > sink, and a stream with a resampler in between. For simplicity, let's
> > assume that the resampler doesn't actually do any resampling, so when
> > the sink asks for 10 samples, the resampler reads from the stream 10
> > samples.
> >
> > Let's say that the write index of the sink is N, and the resampler has
> > one sample buffered. Due to the buffered sample, the read index of the
> > stream is N+1.
> 
> Let me rephrase this in order to check that I understand. You are
> talking about a resampler that has 1:1 input:output sample rate ratio.
> The resampler needs to look one sample ahead (or behind, depending
> on how you look at it) in order to function. A simple example of such
> "resampler" would be something that averages each incoming sample with
> the previous one.
> 
> You have pushed 11 samples into the resampler. You say that the
> resampler has consumed one sample for its internal buffer, consumed 10
> more samples "for good output" and produced 10 output samples. And
> this is the point I don't quite agree with.
> 
> What you describe is one possible behaviour (and I'd say a buggy one,
> but it does exist), but we need to consider one more possibility. The
> other case is that the resampler produces 11 samples when fed 11
> samples. The first sample in my example is the average of zero and the
> first input sample, and it's technically wrong to throw it out,
> because this would mean a change in the zero-extended output from
> prepending an all-zero sequence to the input. I.e. resamplers should
> by default, at the beginning of the stream, treat internal buffers not
> as empty, but as full, pre-filled with zeros. See also the note at the
> end of this mail.
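
The zero-prefilled behaviour described above can be sketched as follows. This is a minimal illustration only: `averaging_filter` and its `history` argument are hypothetical names, not part of pa_resampler or of any resampler backend.

```python
def averaging_filter(samples, history=0.0):
    """Average each sample with the previous one.

    `history` plays the role of the internal buffer. It starts out as
    zero, i.e. "full, pre-filled with zeros", rather than empty.
    """
    out = []
    prev = history
    for x in samples:
        out.append((prev + x) / 2)
        prev = x
    return out, prev  # prev is the state to carry into the next call


inp = [float(i) for i in range(1, 12)]   # 11 input samples
out, state = averaging_filter(inp)       # 11 output samples, none dropped

# Prepending zeros to the input only prepends zeros to the output.
padded, _ = averaging_filter([0.0, 0.0] + inp)
```

With this convention 11 input samples give 11 output samples, and the first output is the average of zero and the first input sample.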
> 
> But let's say that, depending on the implementation, the read index
> might be either N or N+1.
> 
> > Now the sink is rewound by 10 samples. This means that the sink will
> > want the next written sample to be from index N-10. The resampler drops
> > the buffered sample, and the read index of the stream moves back by 10
> > samples to N-9. The sample at N-10 got lost, the user hears audio
> > skipping by one sample.
> 
> "the read index of the stream moves back by 10 samples to N-9" is of
> course wrong, as you point out below.
> 
> > The amount of dropped audio in the resampler buffer should have been
> > added to the amount by which the stream read index was moved back.
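
The index arithmetic of this example can be written out as a small sketch. It only covers the 1:1 case with a fixed leftover buffer; `rewind_read_index` is a hypothetical helper, not an existing PulseAudio function, and the value of N is arbitrary.

```python
def rewind_read_index(read_index, rewind_samples, buffered_samples):
    """Stream read index after a sink rewind (1:1 resampler only).

    The buffered samples are dropped on rewind, so they have to be
    read again: move back by the rewind amount plus the buffered amount.
    """
    return read_index - (rewind_samples + buffered_samples)


N = 1000                    # arbitrary sink write index for illustration
read_index = N + 1          # one sample is buffered in the resampler
wanted = N - 10             # the sink wants this sample next after the rewind

naive = read_index - 10                       # N-9: the sample at N-10 is lost
fixed = rewind_read_index(read_index, 10, 1)  # N-10: nothing is skipped
```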
> 
> Sure. The confusion actually comes from your attempt to guess, behind
> the resampler's back, which input samples it wants after a rewind.
> For non-1:1 resamplers, the amount of buffering actually varies over
> time. E.g. consider a 3:2 downsampler that works by linear
> interpolation:
> 
> Y[0] = X[0]
> Y[1] = (X[1] + X[2]) / 2
> Y[2] = X[3]
> Y[3] = (X[4] + X[5]) / 2
> Y[4] = X[6]
> Y[5] = (X[7] + X[8]) / 2
> and so on
> 
> Sometimes, when fed a sample, it can copy the sample to the output,
> and sometimes it will average two neighbouring samples. Sure, you can
> second-guess this particular resampling pattern, but now consider
> another valid case of a linear-interpolation 3:2 downsampler:
> 
> Y[0] = (3 * X[0] + X[1]) / 4
> Y[1] = (X[1] + 3 * X[2]) / 4
> Y[2] = (3 * X[3] + X[4]) / 4
> Y[3] = (X[4] + 3 * X[5]) / 4
> Y[4] = (3 * X[6] + X[7]) / 4
> Y[5] = (X[7] + 3 * X[8]) / 4
> and so on
> 
> which always has to look ahead. Same maximum "buffer" length, same
> resample ratio, different input needs. E.g., in the first case, you
> have to know the input up to X[3] to determine Y[2], while in the
> second case you need to know one more input sample.
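
Both patterns are easy to try out. The sketch below implements the two sets of formulas literally; the function names are made up for illustration, and input that does not fill a whole group of three samples is simply ignored.

```python
def downsample_a(x):
    """First pattern: Y[2k] = X[3k], Y[2k+1] = (X[3k+1] + X[3k+2]) / 2."""
    y = []
    for k in range(len(x) // 3):
        y.append(x[3 * k])
        y.append((x[3 * k + 1] + x[3 * k + 2]) / 2)
    return y


def downsample_b(x):
    """Second pattern: Y[2k] = (3*X[3k] + X[3k+1]) / 4,
    Y[2k+1] = (X[3k+1] + 3*X[3k+2]) / 4."""
    y = []
    for k in range(len(x) // 3):
        y.append((3 * x[3 * k] + x[3 * k + 1]) / 4)
        y.append((x[3 * k + 1] + 3 * x[3 * k + 2]) / 4)
    return y


def last_input_needed_a(n):
    """Highest input index that Y[n] depends on in the first pattern."""
    k, odd = divmod(n, 2)
    return 3 * k + 2 if odd else 3 * k


def last_input_needed_b(n):
    """Highest input index that Y[n] depends on in the second pattern."""
    k, odd = divmod(n, 2)
    return 3 * k + 2 if odd else 3 * k + 1


ramp = [float(i) for i in range(9)]   # X[i] = i
ya = downsample_a(ramp)
yb = downsample_b(ramp)
```

On the ramp input every output of the second pattern is 0.25 larger than the corresponding output of the first, and Y[2] needs input up to X[3] in the first pattern but up to X[4] in the second.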
> 
> So let me repeat - don't attempt to guess which input samples the
> resampler will need after a sink rewind: you will always be wrong. Let
> the resampler implementation decide (i.e. you just have to implement a
> "pull" model instead of "push" if you allow arbitrary sink-based
> rewinds that need a rerun of the resampler). This naturally leads to
> the need to forward all rewind requests to the particular
> implementation. But see below for an alternative that you have
> correctly suggested, based on snapshots.

I think we are talking about different things. I was talking
specifically about the handling of the leftover buffer that we currently
have in pa_resampler. If the resampler backend does buffering, which you
seem to be talking about, that's a different (although in some sense
very similar) problem to solve, and I think we have an agreement about
the way to deal with it, using state snapshots. (Nobody has volunteered
to implement the solution yet, but it's still good to have a plan.)

> > "The initial phase" means the last output sample relative to the new
> > position, right?
> 
> Yes. In the above 3:2 examples, it means whether an even or an odd
> output sample is produced. And if you look carefully, you will notice
> that the two examples above differ only by a shift of 1/4 of an input
> sample, so that can also be counted as a phase difference.
> 
> > It might be feasible to add the required functionality to the stock
> > resamplers. When we discussed the filter rewinding, you mentioned the
> > idea of maintaining a history of filter state snapshots. Taking a
> > snapshot only requires a function for copying the filter state, and I
> > would guess that adding such a function to the stock resamplers could very
> > well be done.
> 
> True. It would also be convenient to store the read index of the
> stream and the write index of the sink along with the snapshots. This
> way, you just search for the latest snapshot that happened before the
> piece of the history that you want to amend, and continue from there,
> and this works for arbitrary buffering requirements of the resampler.
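
A rough sketch of such a snapshot history is below. It is not an implementation plan for pa_resampler: the class, its methods and the dictionary used as filter state are all hypothetical, and it assumes that the filter state can be copied and that snapshots are taken in increasing write index order.

```python
import bisect
import copy


class SnapshotHistory:
    """History of (sink write index, stream read index, filter state)."""

    def __init__(self):
        self._write_indices = []  # sink write index per snapshot, ascending
        self._entries = []        # (stream read index, filter state copy)

    def take(self, write_index, read_index, state):
        """Record a snapshot; the state is copied so later changes don't leak in."""
        self._write_indices.append(write_index)
        self._entries.append((read_index, copy.deepcopy(state)))

    def restore_before(self, target_write_index):
        """Return the latest snapshot taken at or before the target write
        index, and forget all snapshots taken after it."""
        pos = bisect.bisect_right(self._write_indices, target_write_index)
        if pos == 0:
            raise LookupError("no snapshot old enough for this rewind")
        del self._write_indices[pos:]
        del self._entries[pos:]
        read_index, state = self._entries[-1]
        return self._write_indices[-1], read_index, copy.deepcopy(state)


history = SnapshotHistory()
history.take(0, 0, {"prev": 0.0})
history.take(10, 11, {"prev": 0.3})
history.take(20, 21, {"prev": -0.1})

# The sink is rewound so that the next written sample is at index 15.
restored = history.restore_before(15)
```

After restoring, processing continues from the returned write index, read index and state, so the resampler itself decides which input it needs from there on.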

Ack.

-- 
Tanu