problem with synchronization between multiple audio clients

Thu Nov 14 02:51:52 PST 2013

On 2013-11-13 15:35, Javier Domingo wrote:
> Hi Tim!,
>
> Thanks a lot for the info! I was just wondering whether the RTP[1]
> standard would be capable of that. I had already seen aurena, but I
> prefer the RTP solution. I will be doing experiments with it though,
> and report back if I encounter something interesting,
>
> Cheers,
>
> Javier Domingo Cansino
> Research & Development Junior Engineer
> Fon Labs Workgroup, Getxo - Spain
>
> [1] RTP Standard: http://tools.ietf.org/html/rfc3550

RTP is definitely capable of this. We use RTP/RTCP in our software stack
for this very purpose, and the phase shift between the receivers is
typically less than 2 ms over Wi-Fi. It could be even better, but that
would require a thorough analysis of the entire audio sink layer. One
tool that has been proposed for this, and that would be very useful, is
a test audio sink together with a test audio clock. With these, one
could write unit tests, try out the synchronization algorithms in the
sink, plot the results, etc.

If you want synchronized audio receivers with GStreamer, you need three 
components: the synchronized clock, the RTP/RTCP logic, and the 
synchronized sink.
The first one is easy: just use the GstNetClientClock on the receiver 
and the GstNetTimeProvider on the sender. The second is given to you by 
the rtpbin. The third is tricky though. I will elaborate on this a bit. 
Perhaps some information is redundant, I'm not sure:

The audio sink has to compensate for the difference between clock speeds
(pipeline clock vs. audio clock). The audio hardware often has its own
quartz crystal, and no two crystals oscillate at exactly the same
frequency. Plus, if your receiver's pipeline clock is synchronized to
the sender's, its speed will differ from the audio clock's anyway. If
this is not compensated, a drift between the clocks will build up,
since playback on one receiver may be a bit faster, or a bit slower,
than on another.
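
To get a feeling for how quickly such a drift builds up, here is a
small arithmetic sketch. The ppm values are assumptions picked for
illustration (they are not measurements from any particular hardware);
the 2 ms limit is simply the phase shift figure mentioned above.

```python
def accumulated_drift_us(rate_error_ppm, seconds):
    """Drift in microseconds after `seconds` of uncompensated playback."""
    # 1 ppm of rate error adds 1 microsecond of drift per second.
    return rate_error_ppm * seconds


def seconds_until_drift(rate_error_ppm, limit_us):
    """Time until an uncompensated rate error reaches `limit_us`."""
    if rate_error_ppm == 0:
        return float("inf")
    return limit_us / abs(rate_error_ppm)


if __name__ == "__main__":
    # Hypothetical rate errors between audio clock and pipeline clock.
    for ppm in (10, 50, 100):
        print(ppm, "ppm:",
              accumulated_drift_us(ppm, 60), "us after one minute,",
              seconds_until_drift(ppm, 2000), "s until 2 ms is reached")
```

So even a modest rate error eats up a 2 ms budget within a minute or
so if nothing compensates for it.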

The sink calls the compensation modes "slave-methods", as in "slaving 
the audio clock to the pipeline clock". The default is to skew the 
playout pointer; the audio sink lets the drift accumulate, until a 
certain threshold is reached, at which point the sink moves the current 
playback position inside its ring buffer, effectively "cutting out" or 
"skipping" samples. While this works, it introduces artifacts, which are 
sometimes audible. Also, it is not accurate. Overall, the measured 
"skew" (= the current drift between the clocks) is *very* noisy. The 
audio sink internally tries to compensate for that by using a running 
average, but it often is not enough. So the method will skew the pointer 
more often than it should, based on incorrect data. The likelihood of 
this happening depends directly on the drift-tolerance property value, 
which is the aforementioned threshold. Try setting it to 500
microseconds, for example, and run a pipeline like:

    GST_DEBUG=*audiobasesink*:2 gst-launch audiotestsrc ! alsasink provide-clock=false drift-tolerance=500

You will get lots of warnings about the sink skewing the pointer.
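
The effect of a noisy skew measurement on a threshold-based correction
can be reproduced with a toy simulation. This is only a simplified
model and not the audio sink's actual code: the windowed average, the
uniform noise, and all numbers (50 ppm, 2000 us of noise, a window of
16) are assumptions chosen for illustration.

```python
import random
from collections import deque


def count_skew_corrections(drift_ppm, tolerance_us, noise_us,
                           seconds=60.0, step_s=0.01, window=16, seed=0):
    """Count how often a threshold-based skew correction fires."""
    rng = random.Random(seed)          # fixed seed: repeatable runs
    true_skew_us = 0.0
    history = deque(maxlen=window)
    corrections = 0
    for _ in range(int(round(seconds / step_s))):
        # 1 ppm of rate error adds 1 microsecond of skew per second.
        true_skew_us += drift_ppm * step_s
        measured = true_skew_us + rng.uniform(-noise_us, noise_us)
        history.append(measured)
        if len(history) < window:
            continue                   # wait until the window is full
        average = sum(history) / window
        if abs(average) > tolerance_us:
            # "Skew the pointer": jump by the averaged estimate, which is
            # only as good as the noisy measurements behind it.
            true_skew_us -= average
            history.clear()
            corrections += 1
    return corrections


if __name__ == "__main__":
    print("no noise:  ", count_skew_corrections(50, 500, 0.0))
    print("with noise:", count_skew_corrections(50, 500, 2000.0))
```

Without noise, the correction fires only when the real drift has
reached the tolerance. With noise, the average crosses the threshold
by chance, and each false correction introduces a real offset that has
to be corrected again later.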

If you want very low average drift, and no "cuts", you'll need to 
compensate the drift yourself. I submitted a patch that lets you install
a custom callback in which you can run your own compensation
algorithm. We use a PLL on our hardware to increase/decrease the audio
clock's frequency. This combats the drift in a much more fine-grained 
way. Another option would be to use an asynchronous resampler, which 
allows fluctuations of the input or output sample rate by a few ppm.
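
As a rough sketch of what such fine-grained control can look like, the
following simulates a simple PI control loop that nudges the audio
clock rate by a few ppm based on the measured skew. It is not the code
we run on our hardware; the gains, the step size, and the noise-free
measurement are assumptions made to keep the example short.

```python
def simulate_rate_control(drift_ppm, seconds=60.0, step_s=0.1,
                          kp=1.0, ki=0.25):
    """Return the skew (in microseconds) after each control step."""
    skew_us = 0.0
    integral = 0.0
    trace = []
    for _ in range(int(round(seconds / step_s))):
        # PI controller: the proportional term reacts to the current
        # skew, the integral term learns the constant rate error.
        integral += skew_us * step_s
        correction_ppm = kp * skew_us + ki * integral
        # Whatever rate error is left after the correction keeps
        # accumulating as skew (1 ppm = 1 microsecond per second).
        skew_us += (drift_ppm - correction_ppm) * step_s
        trace.append(skew_us)
    return trace


if __name__ == "__main__":
    trace = simulate_rate_control(drift_ppm=50.0)
    print("peak skew (us): ", max(abs(s) for s in trace))
    print("final skew (us):", trace[-1])
```

In this model the skew rises briefly, then settles back towards zero
as the integral term absorbs the constant rate error, and no samples
are cut out at any point. In a real system, correction_ppm would be
handed to whatever mechanism adjusts the clock frequency or the
resampling ratio.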

One strange problem that remains unexplained is the presence of an
initial drift. The skew value is not close to zero in the beginning, as I
expected it to be. Instead, it is quite large, so I had to come up with
an initial phase where I cut down this drift quickly by skewing the
playout pointer, and only then switch to fine-grained PLL control. Ideally,
this initial phase would be unnecessary. This is precisely where the 
test audio sink & clock would be useful.
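
The two-phase approach described above could be structured roughly
like this. The class name, the threshold, and the gain are
hypothetical; this only sketches the switching logic and is not the
actual implementation.

```python
class TwoPhaseCompensator:
    """Coarse pointer skewing first, fine-grained rate control after."""

    def __init__(self, coarse_threshold_us=1000.0, kp=1.0):
        self.coarse_threshold_us = coarse_threshold_us  # hypothetical
        self.kp = kp                                    # hypothetical
        self.locked = False

    def update(self, skew_us):
        """Return (pointer_shift_us, rate_correction_ppm)."""
        if not self.locked:
            if abs(skew_us) > self.coarse_threshold_us:
                # Initial phase: remove the large skew in one jump.
                return (-skew_us, 0.0)
            # Skew is small enough; stay in fine mode from now on.
            self.locked = True
        # Fine phase: never touch the pointer, only adjust the rate.
        return (0.0, self.kp * skew_us)


if __name__ == "__main__":
    comp = TwoPhaseCompensator()
    for skew in (20000.0, 30.0, 12.0):
        print(skew, "->", comp.update(skew))
```

Once the fine phase has been entered, this sketch never goes back to
skewing the pointer. Whether that is the right policy depends on how
large the disturbances can get later on.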

Also, be aware that the net client clock is susceptible to some effects 
that are rather common on Wi-Fi, like heavily delayed time sync packets 
(which arrive far too late and therefore mess up the synchronization), 
or a significant asymmetry between provider->client and client->provider 
transmission speeds. Patches for that are currently being considered.
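
To illustrate why these two effects hurt, here is the generic two-way
time transfer calculation (NTP-style) in a few lines. This is not
necessarily what the net client clock does internally, and the
round-trip filter shown is just one common countermeasure, not the
patches mentioned above. All names and numbers are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Exchange:
    t_send: float    # client clock when the request was sent
    t_server: float  # provider clock when the request arrived
    t_recv: float    # client clock when the reply arrived


def simulate_exchange(t_send, true_offset, d_up, d_down):
    """Build an exchange from known delays (client->provider = d_up)."""
    return Exchange(t_send,
                    t_send + d_up + true_offset,
                    t_send + d_up + d_down)


def estimate(ex):
    """Return (offset, rtt), assuming both directions are equally fast."""
    rtt = ex.t_recv - ex.t_send
    offset = ex.t_server - (ex.t_send + ex.t_recv) / 2
    return offset, rtt


def filtered_offset(exchanges, max_rtt_factor=2.0):
    """Average the offsets, ignoring exchanges with a suspicious RTT."""
    results = [estimate(ex) for ex in exchanges]
    min_rtt = min(rtt for _, rtt in results)
    kept = [off for off, rtt in results if rtt <= max_rtt_factor * min_rtt]
    return sum(kept) / len(kept)


if __name__ == "__main__":
    # Times in milliseconds; the true offset is 1000 ms in all cases.
    print(estimate(simulate_exchange(0.0, 1000.0, 2.0, 2.0)))   # symmetric
    print(estimate(simulate_exchange(0.0, 1000.0, 2.0, 6.0)))   # asymmetric
    print(estimate(simulate_exchange(0.0, 1000.0, 2.0, 50.0)))  # delayed
```

The estimate is off by half the difference between the two one-way
delays, so a single heavily delayed reply can shift it by many
milliseconds. Dropping exchanges with an unusually long round trip
helps with late packets, but it cannot detect a constant asymmetry.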

cheers