Webrtcbin: perfectly timestamped transmission of (non-live) A/V file source

Tue Mar 15 02:22:27 UTC 2022

You might want to take a look at https://github.com/centricular/webrtcsink :)
On Mar 12 2022, at 3:27 pm, Philipp B via gstreamer-devel <gstreamer-devel at lists.freedesktop.org> wrote:
> Hi all,
>
> I am working on a somewhat experimental project to implement a "WebRTC
> media player" with gstreamer. I am aiming to stream basically any A/V
> media file over WebRTC at a quality as near as possible to protocols
> like MPD or HLS. Obviously, latency does not play a role here, but
> there will be focus on trying to keep the playout in sync as much as
> possible between the different peers being served in parallel.
>
> I know, this is not exactly the target application for WebRTC, but
> that's also what makes it interesting as an experimental project. I
> already read about the playout delay WebRTC extension, which will be
> worth looking at at some point, but I am far from being there.
>
> At the moment, I am just trying to get the single peer stream with a
> stable LAN connection as smooth as possible. I started with video-only
> transmission, not caring too much about jitter, which was pretty
> straightforward. Now I am looking into audio, and it becomes far more
> complicated.
>
> For now, I am testing an audio-only transmission, but of course I
> need to add synced video back later.
>
> I learned that filesrc does not fulfill the "live/sync" requirement to
> play out audio through rtpopuspay. There are some places on the web
> mentioning that multifilesrc combined with "identity sync=true" is
> suited to create a "fake live" media source.
>
> I played with various variants, unsure where to put the identity
> element, and the "do-timestamp" (and whether it's needed). As a reference,
> this is my current pipeline, which may contain some clutter, but it's
> about the best I got judging by audio quality:
>
> multifilesrc location=... index=0 do-timestamp=1 ! queue ! decodebin !
> queue ! identity sync=true ! queue ! audioconvert ! audioresample !
> audio/x-raw,channels=2,rate=48000 ! opusenc bitrate=128000
> frame-size=10 ! rtpopuspay pt=97 min-ptime=10000000 max-ptime=10000000
> ! webrtcbin
>
> I chose frame-size=10 because the playback artifacts are worse with
> smaller frame sizes. So, choosing a rather small frame size, it's easier
> to spot them. Once I get the issues resolved, I plan to use
> frame-size=60. min-ptime and max-ptime are something I found mentioned
> somewhere. I thought they might help to reduce jitter in RTP timestamps,
> but I'm unsure about the effect.
>
> Playing this in the browser (using music input), I can hear some
> artifacts that are annoying but would be acceptable for voice as far
> as I can tell. My current focus is to get this audio as good as
> possible.
>
> chrome://webrtc-internals tells me that samples are added and removed
> to adjust timing, which explains the perceived artifacts pretty
> accurately. Looking at the decoded RTP, I can see that RTP timestamps
> are indeed slightly jittered. Ideally, I would expect the RTP
> timestamp increments to be steadily at 480 (48 kHz @ 10 ms frame
> size).
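
The expected increment mentioned above follows from the 48 kHz RTP clock that Opus uses. As a minimal sketch of how the jitter could be quantified from a packet capture (the function names and the `captured` values are hypothetical, not taken from the original pipeline):

```python
CLOCK_RATE = 48_000  # RTP clock rate for Opus, in Hz


def expected_increment(frame_ms, clock_rate=CLOCK_RATE):
    """RTP timestamp ticks covered by one frame of frame_ms milliseconds."""
    return clock_rate * frame_ms // 1000


def increment_errors(rtp_timestamps, frame_ms):
    """Deviation of each consecutive increment from the ideal step.

    The difference is masked to 32 bits so that a wrap-around of the
    RTP timestamp field does not show up as a huge negative jump.
    """
    step = expected_increment(frame_ms)
    pairs = zip(rtp_timestamps, rtp_timestamps[1:])
    return [((b - a) & 0xFFFFFFFF) - step for a, b in pairs]


# Hypothetical RTP timestamps, made up for illustration only.
captured = [1000, 1480, 1962, 2440, 2920]

print(expected_increment(10))          # 480
print(expected_increment(60))          # 2880
print(increment_errors(captured, 10))  # [0, 2, -2, 0]
```

A perfectly paced stream would give a list of zeros; any non-zero entry is a packet whose RTP timestamp is off the 10 ms grid.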
>
> This is where I could use some help.
> What I expect to work (I haven't tested it yet) is to use the datarate option
> of the identity element on either raw or strictly CBR encoded audio,
> to rewrite timestamps. However, then I will probably lose the original
> timing information, and with it my option to keep the video in sync.
> My guess is, in the end I need a typical media player logic to sync
> everything on audio. But obviously I am too lost to find the best way
> to get there.
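
To illustrate the idea of rewriting timestamps from the amount of data, here is a minimal sketch in plain Python. It is not GStreamer code and has not been tried against webrtcbin. Keeping the first original timestamp as an offset (`first_pts_ns`) is an assumption about how the audio could stay anchored to the source timeline, so that video timestamps remain comparable; all names are hypothetical.

```python
NS_PER_SECOND = 1_000_000_000


def restamp(first_pts_ns, frame_samples, rate, count):
    """Timestamps in ns for `count` frames, derived only from the sample count.

    Each timestamp is computed from the total number of samples so far,
    so rounding errors do not accumulate from frame to frame.
    """
    return [
        first_pts_ns + (n * frame_samples * NS_PER_SECOND) // rate
        for n in range(count)
    ]


def to_rtp(pts_ns, clock_rate=48_000):
    """Convert a nanosecond timestamp to RTP clock ticks."""
    return pts_ns * clock_rate // NS_PER_SECOND


# Hypothetical stream: first audio buffer at 0.5 s, 10 ms frames at 48 kHz.
pts = restamp(first_pts_ns=500_000_000, frame_samples=480, rate=48_000, count=4)
rtp = [to_rtp(p) for p in pts]

print(pts)  # [500000000, 510000000, 520000000, 530000000]
print(rtp)  # [24000, 24480, 24960, 25440]
```

Because the timestamps come from a running sample count rather than from arrival time, the RTP increments in this model are exactly 480. Whether the original offset is enough to keep video in sync in a real pipeline is left open here.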
>
> Thanks for reading so far, I hope I could make my issue clear. Any
> pointers or comments would be useful. Some specific questions I could
> think of:
>
> - What's the easiest way to get this sorted? I mean, transmitting an A/V
> media file's audio continuously, with no RTP timestamp jitter, and
> with video in sync to it?
> - In the above pipeline, where exactly is my RTP jitter coming from? I
> assume "identity sync=true" is timing the throughput based on the
> source's timestamps, and the jitter is basically a quantization
> artifact because the original frame size does not match the re-encoded
> frame size. Is this plausible? That would mean the "sync/live" nature
> of the source is not needed at all for timestamps, just for
> transmission timing (correct?)
> - Am I assuming correctly that rtpopuspay does not keep a context of
> the stream, and creates an RTP timestamp with no knowledge of
> the last packet's timestamp?
> - Are there any handy debug level filters to trace consecutive packets
> through the pipeline, regarding their timestamp?
> - Is there concise documentation on how many and which timestamps a
> frame can actually have in GStreamer, where they are derived from, and
> what they are generally used for (only in case there is anything else
> besides PTS and DTS - I am still in doubt whether there is some third one,
> like MPEG2TS's PCR, involved here...)
>
> Thanks for any help!
> Philipp
>
