GStreamer pipeline udpsrc randomly stops receiving UDP packets

Mikael Nousiainen mikaelnousiainen at fastmail.com
Sun Feb 26 10:05:07 UTC 2023


I managed to get some additional info on where the pipeline MIGHT stop working.

I got this stack trace from gst-launch-1.0 (version 1.18.4) using GDB:

#0  futex_wait_cancelable (private=0, expected=0, futex_word=0x1fcc0dc) at ../sysdeps/nptl/futex-internal.h:186
#1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x13, cond=0x1fcc0b0) at pthread_cond_wait.c:508
#2  __pthread_cond_wait (cond=0x1fcc0b0, mutex=0x13) at pthread_cond_wait.c:638
#3  0xb5048864 in pa_threaded_mainloop_wait () at /lib/arm-linux-gnueabihf/libpulse.so.0
#4  0xb5071738 in gst_pulseringbuffer_commit (buf=0x80, sample=0x906e7b5c, data=0x220d938 "", in_samples=<optimized out>, out_samples=<optimized out>, accum=0xb00fdc08) at ../ext/pulse/pulsesink.c:1585
#5  0xb53578d4 in  () at /lib/arm-linux-gnueabihf/libgstaudio-1.0.so.0

Looks like the pipeline might be stuck in pulsesink.c:1585 -- I checked the stack trace multiple times over a couple of minutes, exiting GDB in between to give GStreamer time to proceed. I got the same result every time.
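The repeated attach/inspect/detach cycle described above can also be scripted so that GDB never sits at an interactive prompt. The following is only a sketch: the helper names are made up for illustration, and it assumes gdb is installed and that the caller is allowed to ptrace the target process.

```python
import subprocess


def gdb_backtrace_cmd(pid):
    """Build a non-interactive gdb command line that dumps every thread's backtrace."""
    return [
        "gdb",
        "-p", str(pid),                # attach to the running process
        "-batch",                      # run the commands below, then detach and exit
        "-ex", "thread apply all bt",  # backtrace of all threads
    ]


def dump_backtraces(pid, timeout=60):
    """Attach to pid, return gdb's stdout as text, and detach again."""
    result = subprocess.run(
        gdb_backtrace_cmd(pid),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout
```

Calling dump_backtraces() a few times, some seconds apart, and comparing the results gives the same "is it still in the same frame?" check without keeping the process stopped in between.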

Line 1585 is this one on GitHub (for version 1.18.4):

https://github.com/GStreamer/gst-plugins-good/blob/1.18.4/ext/pulse/pulsesink.c#L1585

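So it may be that it's pulsesink that gets stuck (instead of udpsrc). The blocked thread is named "udpsrc0:src" in the thread list below, and the pipeline has no queue element, so a stuck sink would also keep udpsrc from reading its socket, which would match the growing Recv-Q.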

Based on the code it looks like there is no space to write to the PulseAudio sink...?

    /* we can't write segsize bytes, wait a bit */
    GST_LOG_OBJECT (psink, "waiting for free space");

    pa_threaded_mainloop_wait (mainloop); // <- this is line 1585

What could this mean? The sound card is still there, PulseAudio is working (audio can be recorded all the time) and restarting the pipeline will fix playback too!
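One way to read that frame: the trace shows pa_threaded_mainloop_wait() sitting in pthread_cond_wait with abstime=0x0, i.e. a condition-variable wait with no timeout. As far as the PulseAudio API documentation describes it, such a wait only returns once some other thread calls pa_threaded_mainloop_signal(). If that signal never arrives, the writer stays blocked forever. The toy model below is an analogy in Python, not PulseAudio or GStreamer code, and all names in it are invented for illustration.

```python
import threading


class FakeRingBuffer:
    """Toy stand-in for a sink's write loop; not PulseAudio code."""

    def __init__(self):
        self.cond = threading.Condition()
        self.free_bytes = 0  # the server has not offered any space yet

    def commit(self, nbytes, timeout=None):
        """Block until nbytes are writable; return False if the wait timed out.

        With timeout=None this waits forever, like the frame in the trace.
        """
        with self.cond:
            ok = self.cond.wait_for(lambda: self.free_bytes >= nbytes, timeout)
            if ok:
                self.free_bytes -= nbytes
            return ok

    def on_write_request(self, nbytes):
        """Models the 'you may write now' callback that wakes the writer."""
        with self.cond:
            self.free_bytes += nbytes
            self.cond.notify_all()
```

In this model, commit() with no timeout hangs exactly as long as on_write_request() is never called, even though every other thread keeps running normally. That would be consistent with the other threads in the trace idling in poll().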

For reference, here are the complete stack traces from gst-launch-1.0:

Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
__GI___poll (timeout=-1, nfds=2, fds=0x2251ee0) at ../sysdeps/unix/sysv/linux/poll.c:29
29      ../sysdeps/unix/sysv/linux/poll.c: No such file or directory.
(gdb) info threads
  Id   Target Id                                     Frame 
* 1    Thread 0xb6f17e00 (LWP 1334) "gst-launch-1.0" __GI___poll (timeout=-1, nfds=2, fds=0x2251ee0) at ../sysdeps/unix/sysv/linux/poll.c:29
  2    Thread 0xb4a17440 (LWP 1389) "threaded-ml"    __GI___poll (timeout=1499, nfds=2, fds=0xb0103d40) at ../sysdeps/unix/sysv/linux/poll.c:29
  3    Thread 0xb00ff440 (LWP 1392) "udpsrc0:src"    futex_wait_cancelable (private=0, expected=0, futex_word=0x1fcc0dc) at ../sysdeps/nptl/futex-internal.h:186
  4    Thread 0xaf8fe440 (LWP 1393) "gmain"          __GI___poll (timeout=-1, nfds=1, fds=0x22480f0) at ../sysdeps/unix/sysv/linux/poll.c:29
(gdb) bt
#0  __GI___poll (timeout=-1, nfds=2, fds=0x2251ee0) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  __GI___poll (fds=0x2251ee0, nfds=2, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:26
#2  0xb6cba988 in  () at /lib/arm-linux-gnueabihf/libglib-2.0.so.0
(gdb) t 2
[Switching to thread 2 (Thread 0xb4a17440 (LWP 1389))]
#0  __GI___poll (timeout=1499, nfds=2, fds=0xb0103d40) at ../sysdeps/unix/sysv/linux/poll.c:29
29      in ../sysdeps/unix/sysv/linux/poll.c
(gdb) bt
#0  __GI___poll (timeout=1499, nfds=2, fds=0xb0103d40) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  __GI___poll (fds=0xb0103d40, nfds=2, timeout=1499) at ../sysdeps/unix/sysv/linux/poll.c:26
#2  0xb5047f90 in  () at /lib/arm-linux-gnueabihf/libpulse.so.0
(gdb) t 3
[Switching to thread 3 (Thread 0xb00ff440 (LWP 1392))]
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0x1fcc0dc) at ../sysdeps/nptl/futex-internal.h:186
186     ../sysdeps/nptl/futex-internal.h: No such file or directory.
(gdb) bt
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0x1fcc0dc) at ../sysdeps/nptl/futex-internal.h:186
#1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x13, cond=0x1fcc0b0) at pthread_cond_wait.c:508
#2  __pthread_cond_wait (cond=0x1fcc0b0, mutex=0x13) at pthread_cond_wait.c:638
#3  0xb5048864 in pa_threaded_mainloop_wait () at /lib/arm-linux-gnueabihf/libpulse.so.0
#4  0xb5071738 in gst_pulseringbuffer_commit (buf=0x80, sample=0x906e7b5c, data=0x220d938 "", in_samples=<optimized out>, out_samples=<optimized out>, accum=0xb00fdc08) at ../ext/pulse/pulsesink.c:1585
#5  0xb53578d4 in  () at /lib/arm-linux-gnueabihf/libgstaudio-1.0.so.0
(gdb) t 4
[Switching to thread 4 (Thread 0xaf8fe440 (LWP 1393))]
#0  __GI___poll (timeout=-1, nfds=1, fds=0x22480f0) at ../sysdeps/unix/sysv/linux/poll.c:29
29      ../sysdeps/unix/sysv/linux/poll.c: No such file or directory.
(gdb) bt
#0  __GI___poll (timeout=-1, nfds=1, fds=0x22480f0) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  __GI___poll (fds=0x22480f0, nfds=1, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:26
#2  0xb6cba988 in  () at /lib/arm-linux-gnueabihf/libglib-2.0.so.0

-Mikael

On Fri, Feb 24, 2023, at 11:28, Mikael Nousiainen via gstreamer-devel wrote:
>> 1.18 is fairly old, it would be great if you could try a newer version
>> such as 1.22.x.
>
> I'm using Debian Bullseye and so far haven't found anything newer.
> Are there any places with newer binaries for Debian distributions?
> Compiling all of GStreamer sounds like a big task... :)
>
> That said, the same bug has been present for at least 4 years,
> but for some reason this has become more prominent recently
> with the pipeline stopping more often.
>
>> 
>> One thing you could do after you noticed it stopped receiving packets
>> is to attach a debugger such as gdb to gst-launch-1.0 via gdb -p `pidof
>> gst-launch-1.0` and then get a stack trace of all threads to see where
>> they're stuck and what they're waiting on.
>
> I'll try this... once I get another failure!
>
>> You could write a little application in C or python that creates a
>> pipeline using gst_parse_launch() and starts it up.
>> 
>> With an application you can enable the ring buffer logger
>> (gst_debug_add_ring_buffer_logger) to continuously log into memory.
>> 
>> You could then use a watchdog element in your pipeline to detect when
>> data stops flowing, at which point you can then grab the last X MB or
>> seconds of debug log from the ringbuffer and write it somewhere.
>
> I'm afraid this gets a bit complicated.
>
>> 
>> Cheers
>> -Tim
>
> -Mikael
>
> On Wed, Feb 22, 2023, at 11:54, Mikael Nousiainen wrote:
>> I've got a working pipeline that streams RTP/Opus audio from Janus 
>> Gateway to a PulseAudio sink.
>>
>> When the pipeline works, everything is fine, even in terms of latency.
>>
>> The pipeline command is:
>>
>> gst-launch-1.0 udpsrc address=127.0.0.1 port=22101 reuse=FALSE 
>> caps="application/x-rtp" ! rtpopusdepay ! opusdec ! audio/x-raw, 
>> rate=48000, channels=1, format=S16LE ! audioconvert ! audioresample ! 
>> pulsesink 
>> device="alsa_output.usb-BurrBrown_from_Texas_Instruments_USB_AUDIO_CODEC-00.analog-stereo"
>>
>> However, the pipeline stops working randomly. The time frame could be 
>> 10 minutes or 2 weeks,
>> but eventually the gst-launch-1.0 process stops receiving UDP packets and I 
>> see them piling up
>> in the Recv-Q (receive queue) when checking netstat. I have also 
>> confirmed that the UDP packets
>> keep on coming from Janus Gateway even if the pipeline fails to receive 
>> them (that's why they end up in the queue).
>>
>> See netstat output when the pipeline is NOT working (see high Recv-Q value):
>>
>> Active Internet connections (servers and established)
>> Proto Recv-Q Send-Q Local Address           Foreign Address         State       User       Inode      PID/Program name
>> tcp        0      0 172.20.0.2:49376        172.20.0.1:4713         ESTABLISHED 1000       4026621    650/gst-launch-1.0
>> udp   173888      0 127.0.0.1:22101         0.0.0.0:*                           1000       4023945    650/gst-launch-1.0
>>
>> See netstat output below when the pipeline is working fine:
>>
>> Proto Recv-Q Send-Q Local Address           Foreign Address         State       User       Inode      PID/Program name
>> tcp        0      0 172.20.0.2:49436        172.20.0.1:4713         ESTABLISHED 1000       9268042    1710/gst-launch-1.0
>> udp        0      0 127.0.0.1:22101         0.0.0.0:*                           1000       9270388    1710/gst-launch-1.0
>>
>> The RTP/Opus UDP packet stream from Janus Gateway is not constant, 
>> meaning that it may stop if there is no audio to be played. However, 
>> I've noticed that the pipeline issue does not correspond with packets 
>> not being present, as sometimes the failure happens right after 
>> restarting the pipeline while audio is available.
>>
>> Restarting the pipeline process always fixes the issue.
>>
>> I've checked that the gst-launch-1.0 process is still somehow alive, 
>> because it does send some sort of "keep-alive" packets to PulseAudio 
>> (via UNIX socket) even after it has stopped processing UDP packets.
>>
>> I've also got a similar pipeline working in the opposite direction, 
>> streaming audio from PulseAudio source and sending it as an Opus/RTP 
>> audio stream to Janus Gateway and that pipeline NEVER has any issues.
>>
>> There is no output from the gst-launch-1.0 process when the UDP reception 
>> stops and I have not noticed any kernel messages at those times either. 
>> It is difficult for me to enable very verbose logging in 
>> gst-launch-1.0, as the issue might take weeks to show up and there's 
>> really no easy way to find space for the verbose logs.
>>
>> Would you have any ideas what could stop UDP packet reception in 
>> GStreamer udpsrc? I've attempted to alter the "reuse" parameter, but it 
>> doesn't seem to have any effect.
>>
>> Could it still be that some other process can "steal" the UDP stream 
>> from gst-launch-1.0? I'm out of clues here :)
>>
>> GStreamer version details:
>>
>> # gst-launch-1.0 --version
>> gst-launch-1.0 version 1.18.4
>> GStreamer 1.18.4
>> http://packages.qa.debian.org/gstreamer1.0
>>
>> Thanks,
>> Mikael Nousiainen

