Solving RTP audio / video drift

David Jaggard davywj at gmail.com
Wed Nov 18 04:39:03 PST 2015


I am using gstreamer to demux a live mpegts stream containing h264 video
and aac audio and transcode both the audio and video before passing both
out as individual multicast streams. My pipeline looks something like:

udpsrc > rtpmp2tdepay > tsdemux >
  video > queue2 > decodebin > deinterlace > x264enc > queue2 > rtph264pay > multiudpsink
  audio > queue2 > decodebin > audioconvert > audioresample > voaacenc > queue2 > rtpmp4gpay > multiudpsink

The transcoding steps are necessary to get the streams into the correct
format for the client.
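
For reference, this is roughly how the pipeline could be written as a
gst_parse_launch() string for testing. It is only a sketch: the caps,
encoder settings, multicast addresses and ports below are placeholders,
not my actual values, and my real application builds the pipeline with
the C API.

#include <gst/gst.h>

int
main (int argc, char *argv[])
{
  GstElement *pipeline;
  GMainLoop *loop;
  GError *error = NULL;

  gst_init (&argc, &argv);

  /* Rough equivalent of the pipeline described above.  Caps, encoder
   * settings and the multicast addresses/ports are placeholders. */
  pipeline = gst_parse_launch (
      "udpsrc port=5000 caps=\"application/x-rtp, media=(string)video, "
      "clock-rate=(int)90000, encoding-name=(string)MP2T\" ! "
      "rtpmp2tdepay ! tsdemux name=ts "
      "ts. ! queue2 ! decodebin ! deinterlace ! x264enc ! queue2 ! "
      "rtph264pay ! multiudpsink clients=239.0.0.1:5002 "
      "ts. ! queue2 ! decodebin ! audioconvert ! audioresample ! "
      "voaacenc ! queue2 ! rtpmp4gpay ! multiudpsink clients=239.0.0.1:5004",
      &error);
  if (pipeline == NULL) {
    g_printerr ("Parse error: %s\n", error->message);
    return 1;
  }

  gst_element_set_state (pipeline, GST_STATE_PLAYING);

  loop = g_main_loop_new (NULL, FALSE);
  g_main_loop_run (loop);
  return 0;
}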

I understand that the RTP timestamps in the video and audio streams will
have no relation to each other and that synchronisation between the streams
relies entirely on when the packets are received. That said, the packets
coming out of both streams should be flowing at the same rate as the live
input stream, so even if the two streams were not in sync they should at
least be out of sync by a consistent offset.

What I am actually seeing is that both audio and video start in perfect
sync. Then over time the video starts to lag behind the audio, i.e. the
video stream is running slower than the audio stream. I don't know how this
can happen, as they should both be running at the same rate as the input
stream. This suggests to me that one of the queues is gradually buffering
up data.
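
One way I could try to confirm this from the application (an untested
sketch; "pipeline" here stands for whatever GstPipeline my code already
holds) is to periodically log the current-level-time property of every
queue/queue2 and see which one is filling up:

static gboolean
log_queue_levels (gpointer user_data)
{
  GstIterator *it = gst_bin_iterate_recurse (GST_BIN (user_data));
  GValue item = G_VALUE_INIT;

  while (gst_iterator_next (it, &item) == GST_ITERATOR_OK) {
    GstElement *elem = GST_ELEMENT (g_value_get_object (&item));

    /* queue and queue2 both expose current-level-time (in nanoseconds) */
    if (g_object_class_find_property (G_OBJECT_GET_CLASS (elem),
            "current-level-time") != NULL) {
      guint64 level = 0;

      g_object_get (elem, "current-level-time", &level, NULL);
      g_print ("%s: %" GST_TIME_FORMAT "\n",
          GST_OBJECT_NAME (elem), GST_TIME_ARGS (level));
    }
    g_value_reset (&item);
  }
  gst_iterator_free (it);

  return G_SOURCE_CONTINUE;     /* keep polling */
}

/* after building the pipeline:
 *   g_timeout_add_seconds (5, log_queue_levels, pipeline);
 */

If one of the levels keeps growing over time, that branch would be where
the extra latency is accumulating.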

My inelegant solution to this issue is to modify the pipeline to take the
encoded video and audio, mux them back together, then demux them again and
output them as RTP streams. So the end of the pipeline looks like this:

  x264enc > queue2 > matroskamux > queue2 > matroskademux name=mux > queue2 > rtph264pay > multiudpsink
  voaacenc > queue2 > matroskamux > queue2 > mux > queue2 > rtpmp4gpay > multiudpsink
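
Written out with explicit names for the shared matroska muxer and demuxer
(again just a sketch: this is the kind of launch string the gst_parse_launch()
skeleton above would take, and the addresses/ports are placeholders):

  /* launch string for the experimental remux pipeline, with the shared
   * matroskamux/matroskademux spelled out explicitly */
  static const gchar *launch =
      "udpsrc port=5000 caps=\"application/x-rtp, media=(string)video, "
      "clock-rate=(int)90000, encoding-name=(string)MP2T\" ! "
      "rtpmp2tdepay ! tsdemux name=ts "
      "ts. ! queue2 ! decodebin ! deinterlace ! x264enc ! queue2 ! mkvmux. "
      "ts. ! queue2 ! decodebin ! audioconvert ! audioresample ! "
      "voaacenc ! queue2 ! mkvmux. "
      "matroskamux name=mkvmux ! queue2 ! matroskademux name=mkvdemux "
      "mkvdemux. ! queue2 ! rtph264pay ! multiudpsink clients=239.0.0.1:5002 "
      "mkvdemux. ! queue2 ! rtpmp4gpay ! multiudpsink clients=239.0.0.1:5004";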

My theory is that the matroskamux will check the presentation timestamps on
the audio and video buffers and synchronise them. Then it simply demuxes
the stream again and feeds the two RTP branches, with no further transcoding
steps in between that could introduce lag. (I may have gone overboard with
the queues.)

Experimental evidence suggests this works and the video no longer lags the
audio. However, it seems like a terrible hack and RAM usage has shot up
from 190MB to 750MB.

Is there a more elegant solution for ensuring that the video and audio
buffers are synchronised just before pushing them to the udpsinks? Is there
any way to inspect the queues and find out where the lag is coming from?

The pipeline at the top is built in code, whereas my experimental solution
uses gst-launch.

This is running on Windows.