[pulseaudio-discuss] [RAOP] [PATCH] Fix audio synchronisation

Thu Sep 7 11:54:45 UTC 2017

On Wed, 2017-09-06 at 19:17 +0200, Colin Leroy wrote:
> On 06 September 2017 at 14h59, Tanu Kaskinen wrote:
> 
> Hi Tanu, 
> 
> > > +    latency += 2000000; /* RAOP default latency */
> > > +    latency += pa_raop_client_get_rtp_diff(u->raop) * 1000; /*
> > > plus last packet length in usec */  
> > 
> > pa_raop_client_get_rtp_diff() returns the size of the last audio chunk
> > in bytes divided by four, but you treat that as a time value. That's a
> > clear bug.
> 
> You may well be right! I'm not quite used to the RTP protocol, but it
> seemed to me that rtptime is fed into the 'timestamp' part of the
> RTP header and that it represented a time - from what I gathered from
> the spec.
> 
> I derived rtpdiff from the previous way of incrementing rtptime:
> 
> > -    c->rtptime += length / 4;
> > +    c->rtpdiff = length / 4;
> > +    c->rtptime += c->rtpdiff;
> 
> it is indeed incremented in a way that I don't understand, though, so
> I'm not sure about that.

length is the amount of bytes in the memchunk that contains the audio
that needs to be sent. I don't know for sure why it's divided by four,
but I would guess it's because typically the audio will have 16-bit
samples and two channels, so 4 bytes per PCM frame. If that guess is
right, then there's a bug, because the RAOP sink can have different
settings: by default the sink is configured using
core->default_sample_spec, and module-raop-sink also supports the
"format" and "channels" module arguments. This configurability should
be removed, because write_ALAC_data() seems to be assuming stereo
audio, and probably also S16_LE sample format (although the code isn't
clear about this). You can test this by passing different "format" and
"channels" values to module-raop-sink.

I don't know what the timestamp field in RTP is actually supposed to
contain, and whether the receiving device even cares about the
timestamp.

> > Adding the size of the last packet doesn't make sense to me anyway,
> > though. The size of the last packet is already taken into account in
> > u->write_count. Now you're counting it twice.  
> 
> My reasoning was that the "last" (I should have commented "current",
> maybe) packet doesn't get played before the next one is received, in
> which case it makes sense that it adds an "packet duration" latency.

Why would the next packet have any effect on whether the last packet
gets played or not? If you've sent a packet, and the next packet never
comes, surely the last packet will still get played?

The last packet does contribute to the latency, but as I said, is
already counted in u->write_count that is used when calculating the
latency.

> I did find the audio to be better synchronised to the video with this
> patch, than with stock PulseAudio and shifting audio -2.3s in mplayer, 
> which is what made me think that calculation made sense.

If you can confirm that no matter what constant you use,
synchronization works better if you count the last packet twice, then
we can count the last packet twice and add a comment that this is some
black magic that isn't supposed to work.

> > If the 2 second constant results in too low latency, I think the
> > constant should be made bigger.
> 
> If this seems too much like guesswork to you, I can understand that and
> change the patch to set the latency empirically to a bigger value
> (around 2.3 seconds). I can also, if we go this route, set the default
> latency empirically and add a parameter to the module to allow changing
> it.

A parameter for the latency constant would be nice indeed.

-- 
Tanu

https://www.patreon.com/tanuk