pitch element breaks lip sync in Chrome
Juan Navarro
juan.navarro at gmx.es
Thu Sep 6 17:11:58 UTC 2018
Hi,
It seems that the 'pitch' filter (in the soundtouch plugin,
gst-plugins-bad) introduces some kind of wrong timestamp, clock skew,
bad sequence, a combination of those, or maybe something different (but
probably related).
I'm seeing an accumulating delay between video and audio in Chrome
during a WebRTC call (_not_ using GStreamer's new WebRTC element, yet)
where the source is sending a video+audio stream, and the audio is
filtered with the pitch element. Chrome is unable to perform the lip
sync successfully, and for some reason deduces that the audio is lagging
behind the video (which it actually is not), so it keeps increasing the
video delay until it reaches Chrome's maximum, 10 seconds.
The net effect of this issue is practically the same as what happened in
this Chrome bug:
https://bugs.chromium.org/p/webrtc/issues/detail?id=5456 (just check the
screenshots)
with 'googCurrentDelayMs' and 'googMinPlayoutDelayMs' growing linearly.
At that time it happened to be Chrome wrongly using the webcam's
timestamp, which had a different clock rate than the system's timestamp.
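To illustrate why a clock-rate mismatch shows up as a linearly growing
delay, here is a small Python sketch of the arithmetic. The rates are
made-up numbers, not measurements from that bug or from my pipeline, and
the function names are hypothetical; it only assumes that timestamps
advance at a rate slightly different from the nominal one.

```python
def drift_seconds(elapsed_s, nominal_rate_hz, actual_rate_hz):
    """Offset accumulated after elapsed_s seconds when timestamps advance
    at actual_rate_hz but are interpreted at nominal_rate_hz."""
    return elapsed_s * (actual_rate_hz - nominal_rate_hz) / nominal_rate_hz


def seconds_until(limit_s, nominal_rate_hz, actual_rate_hz):
    """Elapsed time needed for the accumulated offset to reach limit_s."""
    per_second = abs(drift_seconds(1.0, nominal_rate_hz, actual_rate_hz))
    if per_second == 0:
        return float("inf")  # matching rates never drift
    return limit_s / per_second


# Hypothetical example: a nominal 48000 Hz clock that actually runs 1% fast.
one_minute = drift_seconds(60.0, 48000, 48480)
to_ten_seconds = seconds_until(10.0, 48000, 48480)
```

With these invented rates the offset grows by a constant amount every
second, which is the kind of straight line seen in the screenshots.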
But in this case I don't think it's a Chrome bug; I have verified that
this is caused by GStreamer's 'pitch' filter, by feeding this simple
test pipeline into my custom WebRTC source element:
... -> (raw audio)
-> audioconvert -> audioresample -> pitch ->
-> audioconvert -> audioresample -> WebRTC
(Probably most, if not all, of those audioconvert/audioresample elements
are not needed; I added them just to be on the safe side.)
This generates the mentioned delay in the video presentation handled by
Chrome.
However, none of this happens if the 'pitch' element is removed and
any other is used, e.g. a 'scaletempo' element:
... -> (raw audio)
-> audioconvert -> audioresample -> scaletempo ->
-> audioconvert -> audioresample -> WebRTC
This produces a normal lip sync result in Chrome. Delay (latency) stays
at around 100 to 150 ms.
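For anyone who wants to set up the same A/B comparison, a minimal Python
sketch that builds both branches as gst-launch style descriptions could
look like this. The helper name is hypothetical, and 'audiotestsrc' and
'fakesink' are only stand-ins for my raw audio source and my custom
WebRTC element, so this shows the pipeline layout and does not by itself
reproduce the behaviour in Chrome.

```python
def build_audio_branch(filter_element, source="audiotestsrc", sink="fakesink"):
    """Return a gst-launch style description of the audio test branch.

    Only filter_element changes between the two test cases; source and
    sink are placeholders for the real endpoints.
    """
    elements = [
        source,
        "audioconvert", "audioresample",
        filter_element,
        "audioconvert", "audioresample",
        sink,
    ]
    return " ! ".join(elements)


# The two variants compared in this report differ by a single element.
pitch_branch = build_audio_branch("pitch")
scaletempo_branch = build_audio_branch("scaletempo")
```

Either string can be passed to gst-launch-1.0 or to a parse-launch call
once the placeholders are swapped for the real source and sink.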
I've been reading Google's WebRTC code, trying to find out exactly which
value is to blame:
https://cs.chromium.org/chromium/src/third_party/webrtc/video/stream_synchronization.cc
but working out the correct function call chain is difficult, and I'm
still not sure exactly *what* is confusing Chrome into assuming that the
audio is behind the video, when it's not.
I have cherry-picked and applied all commits that touched the file
'./ext/soundtouch/gstpitch.cc' into a custom-built version of
gst-plugins-bad, but the issue persists, so it's not a matter of trying
the latest code (after a long time without changes, the pitch filter
received some patches in June, so I wanted to test whether those helped).
The only idea I have is that the pitch element is missing some sequence
number handling, or mishandles something related to the pipeline's clock
rate... beyond that, I'm out of ideas.
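A minimal sketch of how the timestamp theory could be checked, in
Python: compare each buffer's PTS against the PTS implied by the number
of samples seen so far. It assumes (pts_ns, n_samples) pairs have
already been collected from the buffers leaving the pitch element (for
example with a pad probe, not shown here); the function name and the
sample values are invented for illustration.

```python
NS_PER_SECOND = 1_000_000_000


def timestamp_drift_ns(buffers, sample_rate_hz):
    """Return, per buffer, actual PTS minus sample-count-derived PTS (ns).

    buffers: iterable of (pts_ns, n_samples) tuples in stream order.
    The expected PTS is the first PTS plus the duration of all samples
    that came before the current buffer.
    """
    drifts = []
    first_pts = None
    samples_seen = 0
    for pts_ns, n_samples in buffers:
        if first_pts is None:
            first_pts = pts_ns
        expected = first_pts + samples_seen * NS_PER_SECOND // sample_rate_hz
        drifts.append(pts_ns - expected)
        samples_seen += n_samples
    return drifts


# Invented data: 480-sample buffers at 48000 Hz should be 10 ms apart,
# but here each PTS advances by 10.1 ms.
skewed = [(0, 480), (10_100_000, 480), (20_200_000, 480)]
drifts = timestamp_drift_ns(skewed, 48000)
```

A drift list that stays near zero would rule out this theory, while one
that keeps growing would point at the timestamps the element produces.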
Please help :)