How does a muxer deal with packet loss?

Mon Jun 21 08:57:48 UTC 2021

Hi Tim et al,

Thanks for your answers, even if they are mostly "it depends" ;-) Allow 
me to elaborate!

On 20-06-2021 12:16, Tim Müller wrote:
> Hi Michiel,
> 
>> Somewhat hypothetical, but it may be what I am currently running
>> into: how do muxers deal with missing data in their inputs?
> 
> Depends on the muxer/format and also the GStreamer version you're
> using, because a muxer may have been ported to the new GstAggregator
> base class in a more recent version of GStreamer.
> 
>   
>> Consider a pair of unreliable network sources, like RTP from udpsrc,
>> or in my case a webrtcbin, with an audio and video stream. Due to
>> poor network conditions, most or all audio gets through, but lost
>> video packets mean some key frames don't make it through and
>> subsequent delta frames can't be decoded, so a few seconds of video
>> never makes it into the pipeline, but there is audio.
>>
>> After some processing and encoding, the audio and video go into a
>> muxer (mpegtsmux in my case).
>>
>> - How does the muxer deal with the time periods in which there are
>> audio buffers, but no video buffers? Does it freeze, waiting for
>> video timestamps that will never come? Does it output only audio and
>> no video in that time?
> 
> It depends. If the muxer has been ported to GstAggregator, there's a
> chance it can handle live inputs (i.e. this case) properly, that is
> continue to output data even if one of the inputs has dropped out. In
> that case it will operate in "live" mode and will timeout and process
> the data it has if it doesn't have data on all pads within the
> configured latency.

In our case, it's mpegtsmux. That extends GstAggregator, so that's good, 
right? I don't see a property anywhere to tell it whether to operate in 
live mode. Does it determine this automatically?
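
To check that I understand the timeout behaviour described above, here is a toy Python model of a single aggregation cycle. It is only a sketch of my reading of the explanation, not GStreamer code: the function `aggregate`, its parameters and the use of plain seconds are all made up for illustration.

```python
def aggregate(arrivals, deadline=None):
    """Toy model of one aggregation cycle (not real GstAggregator code).

    arrivals: dict of pad name -> arrival time in seconds of the next
              buffer on that pad, or None if nothing ever arrives.
    deadline: the configured latency in seconds when in live mode,
              or None for non-live mode.
    Returns (output_time, pads_used), or None if the cycle never finishes.
    """
    times = list(arrivals.values())
    if all(t is not None for t in times):
        latest = max(times)
        if deadline is None or latest <= deadline:
            # Every pad delivered in time: output once the last one is in.
            return latest, sorted(arrivals)
    if deadline is None:
        # Non-live: keep waiting for data on all pads.
        return None
    # Live: time out and continue with whatever arrived before the deadline.
    on_time = sorted(
        pad for pad, t in arrivals.items() if t is not None and t <= deadline
    )
    return deadline, on_time


# Video dropped out: live mode times out and outputs audio only.
print(aggregate({"audio": 0.01, "video": None}, deadline=0.1))
# Same situation in non-live mode: the cycle never completes.
print(aggregate({"audio": 0.01, "video": None}))
```

If this model is right, the only thing that matters for my freeze is whether the muxer really ends up in the `deadline is not None` case.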

> However, I think that will only work if it has already received some
> input data (or rather: has received a CAPS event telling it the format
> of the input data), so the moving-on-without-data-on-all-pads might not
> work if there's data missing at the very beginning.
> 
> (I haven't tried it or double-checked the code, this is just from
> memory.)

That should be fine, I haven't had any issues at startup. Often the 
pipeline runs successfully for hours, but sometimes it freezes 
completely. There are no error messages, just no more data at the sinks.

>> - I *assume* it doesn't just freeze, but how does it know when to
>> stop waiting - maybe when a video buffer with a higher PTS arrives?
>> Does it then send out all the audio in between?
> 
> In live mode it will stop waiting after a while (configured upstream
> latency + muxer latency).
> 
> In non-live mode (e.g. file inputs, appsrc in non-live mode) it will
> need either data on all pads or a GAP event on pads where there's no
> data to make it tick along.
> 
> 
>> - Is it wise to always put a leaky queue in front of muxers to ensure
>> all pipeline branches can maintain flow?
> 
> A leaky queue is more of an "emergency valve" to make sure that an
> overflow (slow processing in one pipeline branch / downstream part of a
> pipeline) doesn't affect processing in the data producing upstream part
> of a pipeline, e.g. you might configure a leaky queue after splitting
> the pipeline with a tee element to make sure that if that branch isn't
> processing/consuming data quickly enough, that the branch pushing into
> the tee won't get blocked on that queue running full (thus starving any
> other branches that might feed off that same tee).

So in general, you would advise putting the leaky queues (if any) at the
start of a branch, right after a source or demuxer, rather than at the end
of a branch before an aggregator? It makes sense to drop buffers early,
rather than waste time processing them and then dropping them.

Since this is a live video mixing/streaming application, I really need 
it to keep going no matter what, so a few seconds of missing/corrupted 
output is quite acceptable if that's what it takes. I'm still trying to 
get to the bottom of *why* it sometimes freezes, of course.

In general, every branch in the pipeline should be capable of keeping 
up, because it quickly reaches a steady state in which everything works, 
so I've been focusing on things that would disrupt this steady state. I 
don't think any of the sinks can do funky stuff (filesinks, appsinks 
with drop=true, udpsinks) so my prime suspects are the network sources.

>> I fear that my pipeline is freezing because my video branch stalls
>> because a demuxer can't output buffers, because the audio branch is
>> blocked at the muxer downstream, because no corresponding video is
>> available, deadlocking the system. Can that happen, and if so, how to
>> prevent it? If not, good - but why not, what prevents it?
> 
> Depends. I'm not sure I fully follow. Ideally one would want to make
> sure the muxer doesn't block, but it maybe can't be avoided initially
> when it's waiting for the input caps event.

After writing this, I thought about it some more. I was thinking of this 
kind of setup:

                     ┌────────────────────────┐
                     │                        │
             ┌───────► tiny bit of processing ├────────┐
             │       │                        │        │
             │       └────────────────────────┘        │
 ┌───────┐   │                                         │   ┌─────┐
 │       ├───┘                                         └───►     │
 │ demux │                                                 │ mux │
 │       ├───┐                                         ┌───►     │
 └───────┘   │                                         │   └─────┘
             │   ┌─────┐   ┌──────┐  ┌─────┐  ┌─────┐  │
             │   │     └───►      └──►     └──►     ├──┘
             └───► a lot of processing happens here │
                 │     │   │      │  │     │  │     │
                 └─────┘   │      │  └─────┘  └─────┘
                           │      │
            ┌──────────────►      │
            │              │      │
            │ ┌────────────►      │
            │ │            │      │
              │ ┌──────────►      │
              │ │          │      │
              │ │          └──────┘
                │

So a bunch of audio and video would come out of the demuxer. The audio 
takes the fast upper path, sitting at the muxer pad. The video takes the 
slow bottom path, getting decoded, scaled, mixed with other sources, etc.

My hypothetical failure scenario was that at some point, the muxer 
doesn't accept more audio data, because it needs video data first, so 
the top path is blocked. Because of that, the demuxer can't output 
anything, because it needs to push out some audio first before getting 
to the next bit of video in its input stream. But the bottom path needs 
more encoded video before producing a video buffer for the muxer.

But as long as the input stream contains "equal amounts" (timewise) of 
audio and video, and each branch has sufficient queuing capacity, that 
can't happen, right? At some point, a video buffer will pop out, arrive 
at the muxer, and progress is made.
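
To convince myself, I wrote the scenario down as a toy simulation in Python. It is a sketch under simplifying assumptions, not a model of the real elements: the muxer is assumed to need one buffer on both pads before it makes progress (so no live-mode timeout), the slow branch is reduced to a fixed number of buffers it holds back, and `simulate`, `audio_capacity` and `video_latency` are names I made up.

```python
from collections import deque


def simulate(stream, audio_capacity, video_latency):
    """Return True if the whole stream gets through, False on deadlock.

    stream:         demuxer output order, e.g. "AVAVAV" (A = audio, V = video).
    audio_capacity: how many buffers the fast audio branch can queue.
    video_latency:  how many video buffers the slow branch holds back
                    before its first output reaches the muxer.
    """
    audio_q = deque()      # audio buffers waiting at the muxer pad
    video_in_flight = 0    # video buffers still inside the slow branch
    video_ready = 0        # video buffers waiting at the muxer pad
    for kind in stream:
        if kind == "A":
            if len(audio_q) >= audio_capacity:
                # The demuxer blocks on the full audio queue, so it can
                # never push the video the muxer is waiting for.
                return False
            audio_q.append(kind)
        else:
            video_in_flight += 1
            if video_in_flight > video_latency:
                video_in_flight -= 1
                video_ready += 1
        # The muxer only makes progress with data on both pads.
        while audio_q and video_ready:
            audio_q.popleft()
            video_ready -= 1
    return True


print(simulate("AV" * 20, audio_capacity=2, video_latency=5))  # False
print(simulate("AV" * 20, audio_capacity=6, video_latency=5))  # True
```

In this model the audio queue has to hold at least one buffer more than the slow branch holds back, which matches my intuition about "sufficient queuing capacity".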

And if for some reason no video arrives, the muxer should eventually 
just push out the audio?

> You can put a leaky queue into the branches feeding into the muxer if
> you want to make sure the muxer not processing data doesn't affect
> those upstream receiving/depayloading branches (esp. if you tee those
> off e.g. for display).
> 
> It might be possible to do something better here too.
> 
> Do you re-encode or save the audio/video as they come in over RTP?

Yes, both. It's complicated ;-)

As mentioned it's a live video mixing/streaming application. It takes 
input from three SRT sources (MPEG-TS with H265 video and AAC audio). 
That all works pretty well; I'm now adding a variable number of WebRTC 
inputs. When one of the network inputs goes offline, an input-selector 
switches its branch to an imagefreeze source. For SRT, we use its EOS 
event and for webrtcbin I switch after a timeout if the attached 
decodebin stops outputting buffers.

The inputs are demuxed and decoded (SRT mpegts also goes into a backup 
file). The raw audio and video streams are teed to two 
audiomixers/glvideomixers: one pair mixes the "real" output and one pair 
mixes the "preview" output (a mosaic of all input sources, basically).

The outputs are then teed to encoders, muxers and sinks: HLS at 3
bitrates, an appsink for upload to S3, and udpsinks for the preview
audio and video (to a secondary pipeline that handles webrtc output). I
don't think any of those sinks can block.

If the muxers aren't the culprit, could it be the mixers? Those are 
aggregators, too. How do those deal with missing or late inputs?

Small lightbulb moment: I have an 80 ms latency configured on the main 
glvideomixer after reading 
https://lists.freedesktop.org/archives/gstreamer-devel/2021-February/077519.html, 
but none on the preview mixer and the audio mixers. Could that make them 
wait indefinitely if one of their sinkpads gets no data?

Or what about old data? I've had cases where webrtc packets came out of 
the pipe several seconds after closing the browser window. I don't know 
the details of RTP and PTS timestamps :-/ I would naively expect that 
something would quietly drop buffers whose PTS is in the past if they 
suddenly show up in the pipeline, but I actually have no idea if that is 
how it works. I know sinks discard buffers that arrive too late to be 
processed, but what about aggregators?
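
To make my naive expectation concrete, this is roughly the check I have in mind, as a Python sketch. The names `is_too_late` and `drop_late` and the `(pts, duration)` tuples in seconds are hypothetical, and I do not know whether any aggregator actually does this.

```python
def is_too_late(pts, duration, running_time, max_lateness=0.0):
    """Hypothetical lateness check, all values in seconds.

    A buffer counts as late once the running time has passed the end
    of the buffer (pts + duration) by more than max_lateness.
    """
    return running_time > pts + duration + max_lateness


def drop_late(buffers, running_time, max_lateness=0.0):
    """Keep only the (pts, duration) buffers that are not too late."""
    return [
        (pts, duration)
        for pts, duration in buffers
        if not is_too_late(pts, duration, running_time, max_lateness)
    ]


# A buffer stamped 1.0 s shows up when the running time is already 5.02 s.
stale_and_fresh = [(1.0, 0.04), (5.0, 0.04)]
print(drop_late(stale_and_fresh, running_time=5.02))
```

With this rule the stale buffer from the closed browser window would be discarded and only the current one kept.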

As always, happy to receive any general or specific insights!

Kind regards,
Michiel