what is the gstreamer audio synchronization resolution?

Nicolas Dufresne nicolas at ndufresne.ca
Sun Jul 28 20:34:22 UTC 2019

On Sun, Jul 28, 2019 at 14:55, <virtually_me at claub.net> wrote:

> I have some questions about the time resolution of audiointerleave.
> I have been working with gstreamer pipelines for a couple of years to
> implement loudspeaker crossovers via LADSPA plugins. This in general
> entails a number of steps from source to sink, including de-interleaving
> the incoming audio, teeing into N mono channels that are processed with
> one or more LADSPA plugins, and (re)interleaving the channels into an
> N-channel "output stream" that is directed to a sink. Since the
> wall-clock processing time may be longer or shorter on each channel, the
> element audiointerleave is used to correct for the various latencies of
> each LADSPA-processed stream automatically.
> I am concerned that the resolution that audiointerleave can achieve is too
> low. My assumption is that the code looks for an optimum time-alignment
> point on a sample-by-sample basis. Is that correct? In that case the
> resolution would be about one sample in time, e.g. for 48kHz there is one
> sample every 0.0208 milliseconds.
> Let me explain how this would negatively impact my particular application.
> One period of a 3kHz tone is 0.33 milliseconds. Considering the phase
> within each period, there are 360 degrees. If the time resolution is
> 0.021 milliseconds then the phase resolution is
> 360deg * 0.021 msec / 0.333 msec = about 23 degrees.
> A resolution of 23 degrees is not sufficient for my needs. This is because
> delay is often used to align the wavefronts that are launched by each
> driver in the loudspeaker, and the phase angle between one driver and the
> next needs to be maintained regardless of any processing latencies to a
> resolution of several degrees. In my example I chose 3kHz, however, the
> resolution in terms of phase will get worse and worse as frequency
> increases. For example at 6kHz the resolution coarsens to about 45
> degrees. The resulting phase angle would depend on the exact latency
> experienced by each stream before interleaving, and modifying the number
> of LADSPA plugins (or any other pipeline element) could have a very large
> and negative impact on the phase angle and resulting audio performance
> from the loudspeaker.
> Related to this issue, I would like to implement some type of delay for
> time-alignment as part of the loudspeaker crossover. I can do this using
> e.g. audioecho or by modifying timestamps, however, one-sample resolution
> will be insufficient. I need much better resolution.
> I would like to know what approaches might overcome this problem. If I
> increase the sample rate by N times I could improve the resolution by N
> times, however, I need an improvement by about an order of magnitude (10
> times) and such high sample rates are unachievable. Are there any other
> techniques that can be used within gstreamer to get a more fine-grained
> time resolution for synchronization purposes when interleaving streams?
> The only approach to get better time alignment (that I can think of) prior
> to interleaving the streams would be to resample each mono stream to the
> pipeline sample rate plus a time offset that has a time resolution of ten
> microseconds or better. This would work, but would be rather
> computationally expensive. Is there a better or more efficient way that
> already exists within gstreamer?

That is an interesting project. Indeed, audiointerleave only supports
per-sample alignment. It also has a configurable tolerance to clock drift,
which by default is likely multiple samples.
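
To put per-sample alignment in phase terms, here is a minimal Python sketch
(not GStreamer code; the function names are made up for illustration). A
one-sample step corresponds to a phase step of 360 * f / fs degrees for a
tone of frequency f at sample rate fs, which at 48 kHz works out to 22.5
degrees at 3 kHz and 45 degrees at 6 kHz.

```python
def sample_period_ms(sample_rate_hz):
    """Duration of one sample in milliseconds."""
    return 1000.0 / sample_rate_hz


def phase_step_degrees(tone_hz, sample_rate_hz):
    """Phase shift of a tone_hz sinusoid caused by a one-sample delay."""
    return 360.0 * tone_hz / sample_rate_hz


if __name__ == "__main__":
    rate = 48000.0
    print("sample period: %.4f ms" % sample_period_ms(rate))
    for tone in (3000.0, 6000.0):
        print("%5.0f Hz: %.1f degrees per sample"
              % (tone, phase_step_degrees(tone, rate)))
```

The step scales linearly with frequency, so halving it requires either
doubling the sample rate or some form of sub-sample delay.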

I'm not aware of such a thing as sub-sample interleaving in GStreamer. This
discussion reminded me of some aspects of Arun's beamforming blog, which may
or may not be of interest here.

Of course, adding such precision to audiointerleave would require a very
close look at how we perform the initial alignment, as any overclip could
be disastrous to your use case. An extra per-stream offset would also
need to be maintained. Should this be in nanoseconds, and what is the best
algorithm for this? I don't know, and I'm not an expert, but I'm sure there
is a slightly more efficient way than going through massive upsampling,
which, on top of using more CPU, will also increase the memory
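
As one possible illustration of a sub-sample delay that avoids massive
upsampling, here is a rough Python sketch of a windowed-sinc fractional-delay
FIR filter. This is a generic DSP technique, not something audiointerleave
or any other GStreamer element is known to implement, and the names
fractional_delay_taps and apply_fir are hypothetical. It assumes a Hamming
window and 31 taps, both chosen arbitrarily; the filter adds a fixed delay
of (num_taps - 1) / 2 samples on top of the requested fraction, which would
have to be applied equally to every channel.

```python
import math


def fractional_delay_taps(delay_samples, num_taps=31):
    """Hamming-windowed sinc taps approximating a delay of
    (num_taps - 1) / 2 + delay_samples samples (assumed design)."""
    center = (num_taps - 1) / 2.0 + delay_samples
    taps = []
    for n in range(num_taps):
        x = n - center
        # Ideal band-limited interpolation kernel, shifted by the delay.
        sinc = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
        # Window centered on the tap array (not on the shifted center).
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    gain = sum(taps)
    # Normalize to unity gain at DC.
    return [t / gain for t in taps]


def apply_fir(samples, taps):
    """Direct-form FIR filtering, assuming silence before the first sample."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * samples[n - k]
        out.append(acc)
    return out


if __name__ == "__main__":
    rate = 48000.0
    # A 10 microsecond offset expressed in samples at 48 kHz.
    delay = 10e-6 * rate
    taps = fractional_delay_taps(delay)
    tone = [math.sin(2.0 * math.pi * 1000.0 * n / rate) for n in range(200)]
    delayed = apply_fir(tone, taps)
    print("fractional delay: %.2f samples, %d taps" % (delay, len(taps)))
```

The cost is one short FIR per channel at the original sample rate, rather
than running the whole chain at ten times the rate. Accuracy degrades
toward the Nyquist frequency, so the tap count and window would need to be
checked against the crossover frequencies actually in use.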
