[pulseaudio-discuss] VLC, PulseAudio and large tlengths

Sat Aug 20 22:13:42 PDT 2011

On Sat, 2011-08-20 at 23:31 +0300, Rémi Denis-Courmont wrote:
> > (...) I think the idea is that you can at any time query the current
> > playback latency (fixed hardware latency + currently buffered data)
> > and use this information to schedule the video frames.
> 
> That would arguably be the best way to implement a video file player.
> But the display vertical refresh is an alternative master clock. In the first 
> case, you may need to drop or duplicate frames. In the second case, you may 
> need to resample the audio signal.
> 
> Anyway VLC is built with live playback in mind (it started as a DVB-IP 
> receiver after all). VLC uses the input signal as the master clock (or the 
> CPU monotonic clock by default). I believe GStreamer uses a similar logic 
> though I have not checked. In fact, that is the only practical option if the 
> receiver does not control the input pace.

Right. Makes sense.

> So the audio can and does drift. This is compensated through resampling. 
> Normally VLC would do it internally. Now, PulseAudio is unique among VLC 
> audio outputs insofar as it resamples on VLC's behalf. David suggested 
> that a while ago.
> 
> > I'm not sure how downsampling is relevant here. Is the video being
> > synchronized to the wall clock instead of the audio clock and you need
> > to make the audio stream go faster to catch up with the video stream?
> 
> Currently, VLC tolerates 40 ms advance and 60 ms delay as per EBU 
> Recommendation 37. If a PulseAudio latency update indicates that playback does 
> not fall within that 100 ms sliding window, VLC changes the sample rate to try 
> to restore synchronization without a glitch.
> 
> It is thus essential that the stream gets triggered approximately on time, 
> whether that is initially, upon resuming from pause, or upon recovering from 
> underflow. Otherwise, resampling kicks in and you get to hear Doppler.

The resampling shouldn't be audible if it speeds up the stream by at most
2% (or at least there are comments to that effect in the PulseAudio source
code, where similar adaptive resampling is done). I'd guess slight
resampling would be good for small drop-outs. For longer gaps the catch-up
time might be too long and the initial difference between video and audio
too noticeable, so dropping some audio would be better to get it over
with quickly.
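
To put rough numbers on the catch-up time: if the stream plays faster by
a fraction r, every second of wall-clock time recovers r seconds of lag,
so a lag d takes about d / r to absorb. The sketch below is only that
arithmetic. The function name is made up for illustration, and the 2%
default is the figure mentioned above, not a value read from any API.

```python
def catch_up_seconds(lag_ms: float, max_speedup: float = 0.02) -> float:
    """Wall-clock seconds needed to absorb lag_ms of audio delay.

    max_speedup is the fractional rate increase (0.02 means 2% faster).
    Playing at (1 + max_speedup) times normal speed gains max_speedup
    seconds of stream time per second of wall-clock time.
    """
    if max_speedup <= 0:
        raise ValueError("max_speedup must be positive")
    return (lag_ms / 1000.0) / max_speedup


if __name__ == "__main__":
    # Hypothetical lags: a small drop-out versus a long gap.
    for lag in (100.0, 1000.0):
        print(f"{lag:.0f} ms lag -> {catch_up_seconds(lag):.1f} s at 2% speed-up")
```

Under these assumptions a 100 ms lag needs roughly 5 seconds of sped-up
playback, while a 1 second lag needs close to a minute, which is why
dropping audio looks better for long gaps.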

So, maybe the strategy would be just to monitor the timing reports from
PulseAudio and, if the audio starts to lag, depending on the delay either
resample slightly or drop audio.
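
A minimal sketch of that decision, assuming the delay has already been
derived from a timing report and is positive when audio is late. The
40 ms / 60 ms window is the one quoted earlier in this thread; the
function name and the 500 ms drop threshold are hypothetical and would
need tuning. This is not how VLC or PulseAudio actually implement it.

```python
# Tolerance window quoted in this thread (EBU R37): audio may lead the
# reference clock by up to 40 ms or lag it by up to 60 ms.
MAX_ADVANCE_MS = 40.0
MAX_DELAY_MS = 60.0


def choose_action(audio_delay_ms: float, drop_threshold_ms: float = 500.0) -> str:
    """Pick a recovery action for one timing report.

    audio_delay_ms > 0 means audio is late, < 0 means audio is early.
    drop_threshold_ms is a made-up cut-off: beyond it, catching up by
    resampling alone is assumed to take too long.
    """
    if -MAX_ADVANCE_MS <= audio_delay_ms <= MAX_DELAY_MS:
        return "none"        # inside the window, leave the stream alone
    if audio_delay_ms > drop_threshold_ms:
        return "drop"        # far behind: skip audio to resynchronize at once
    return "resample"        # moderately off: nudge the sample rate


if __name__ == "__main__":
    for delay in (0.0, 200.0, 2000.0, -50.0):
        print(delay, choose_action(delay))
```

Audio that is too early can only be handled by slowing down (or padding),
so the sketch never returns "drop" for negative delays.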

With this strategy I guess the problem is that if you drop audio at a
"random" time, a severe underrun produces two glitches: first the gap,
then, after a short period of audio continuing from where it was before
the gap, a skip as the synchronization gets fixed. If I've understood
correctly, you'd like to implement the underrun recovery so that audio is
dropped immediately after the gap in output, so there would be only one
glitch. This might or might not be somehow doable with stream underruns,
but it's too complex for me to try to come up with a solution. I don't
think it's a very bad bug if the recovery from a severe underrun isn't as
smooth as possible.

In case of a sink underrun (a scheduling problem at PulseAudio's end: we
don't fill the hardware buffer in time) you won't get any notification
about the gap anyway (beyond the timing info), so there's nothing you can
do but drop audio at a random time (or resample).

-- 
Tanu