CPU utilization by webrtcbin plugin

Pradeep Acharya pradeep.acharya1008 at gmail.com
Tue Aug 30 14:42:45 UTC 2022


Hi,

Thanks a lot for your suggestions. I ran perf top with 2 participants
connected; the output is below. I'm trying to understand what the tool
is showing. It seems the kernel takes almost 50% of the CPU, most of it
in finish_task_switch and in softirq handling. Other calls such as
malloc, free and gst_pad_push_data also consume a lot of CPU.

The tool reports its results as percentages. My guess is that 5.17%
means that, out of the 14% total CPU usage on the server, 5.17% is
consumed by finish_task_switch. Is that reading correct? If so, how
could we reduce that load? Even the other functions consuming CPU are
all related to glibc and the kernel.

Samples: 70K of event 'cpu-clock:pppH', 4000 Hz, Event count (approx.): 3675894121 lost: 0/0 drop: 0/0
Overhead  Shared Object                 Symbol
   5.17%  [kernel]                      [k] finish_task_switch
   5.15%  [kernel]                      [k] _raw_spin_unlock_irqrestore
   2.96%  libc-2.28.so                  [.] _int_malloc
   2.73%  libc-2.28.so                  [.] _int_free
   2.56%  [kernel]                      [k] __softirqentry_text_start
   2.25%  [kernel]                      [k] cpuidle_enter_state
   2.01%  libgobject-2.0.so.0.6600.3    [.] g_type_check_instance_is_fundamentally_a
   1.83%  perf                          [.] evsel__parse_sample
   1.68%  [kernel]                      [k] do_syscall_64
   1.17%  libglib-2.0.so.0.6600.3       [.] g_mutex_lock
   1.13%  [kernel]                      [k] vmxnet3_tq_xmit.isra.63
   1.05%  libc-2.28.so                  [.] malloc
   0.82%  perf                          [.] dso__find_symbol
   0.82%  libgstreamer-1.0.so.0.2100.0  [.] gst_pad_push_data
   0.77%  perf                          [.] perf_mmap__read_event
   0.74%  libc-2.28.so                  [.] cfree@GLIBC_2.2.5
   0.73%  libc-2.28.so                  [.] syscall
   0.68%  libglib-2.0.so.0.6600.3       [.] g_mutex_unlock
   0.64%  perf                          [.] 0x000000000027a1c7
   0.63%  [kernel]                      [k] nft_do_chain
   0.58%  libglib-2.0.so.0.6600.3       [.] g_slice_alloc
   0.55%  libcrypto.so.1.1.1g           [.] sha1_block_data_order_ssse3
   0.55%  [kernel]                      [k] menu_reflect
   0.49%  [kernel]                      [k] cpuidle_reflect
   0.49%  [kernel]                      [k] do_idle
   0.48%  libpthread-2.28.so            [.] __pthread_mutex_lock
   0.47%  libc-2.28.so                  [.] __memmove_sse2_unaligned_erms
   0.46%  libgobject-2.0.so.0.6600.3    [.] g_type_check_instance_cast
   0.44%  [kernel]                      [k] vmxnet3_poll_rx_only
   0.42%  libgstreamer-1.0.so.0.2100.0  [.] gst_mini_object_unref
   0.42%  [kernel]                      [k] __audit_syscall_entry
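
To get a rough picture of where the samples land, the rows above can be
summed per shared object. Below is a minimal Python sketch, assuming the
listing is pasted in as plain text. The function name overhead_by_object
is made up for illustration, and only the first eight rows are included.
As far as I understand, perf reports each row's Overhead as a share of
the samples it collected, so the sums are shares of sampled time.

```python
from collections import defaultdict

# First eight rows of the perf top listing above (the wrapped
# g_type_check_instance_is_fundamentally_a line is rejoined).
PERF_TOP = """\
   5.17%  [kernel]                      [k] finish_task_switch
   5.15%  [kernel]                      [k] _raw_spin_unlock_irqrestore
   2.96%  libc-2.28.so                  [.] _int_malloc
   2.73%  libc-2.28.so                  [.] _int_free
   2.56%  [kernel]                      [k] __softirqentry_text_start
   2.25%  [kernel]                      [k] cpuidle_enter_state
   2.01%  libgobject-2.0.so.0.6600.3    [.] g_type_check_instance_is_fundamentally_a
   1.83%  perf                          [.] evsel__parse_sample
"""


def overhead_by_object(text):
    """Sum the Overhead column per shared object (hypothetical helper)."""
    totals = defaultdict(float)
    for line in text.splitlines():
        # Columns: overhead, shared object, then the symbol.
        fields = line.split(None, 2)
        if len(fields) < 2 or not fields[0].endswith("%"):
            continue  # skip headers and blank lines
        totals[fields[1]] += float(fields[0].rstrip("%"))
    return dict(totals)


if __name__ == "__main__":
    totals = overhead_by_object(PERF_TOP)
    # Print the largest consumers first.
    for obj, pct in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{pct:6.2f}%  {obj}")
```

Pasting the full listing into PERF_TOP gives the same breakdown for all
rows, which makes it easier to compare kernel time against libc, GLib
and GStreamer time.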

On Fri, Aug 26, 2022 at 3:51 AM Olivier Crête <olivier.crete at collabora.com>
wrote:

> Hi,
>
> I'd start by running "perf top" to know exactly what is using CPU time.
> I'm surprised that the nicesrc threads are taking so much, all this
> does is receive the packets and feed them to a queue. The queue1 thread
> is probably the one that does DTLS, so it's less surprising that it would
> use CPU time. But it's hard to say anything definitive without doing
> profiling on your system.
>
> Olivier
>
>
> On Thu, 2022-08-25 at 23:51 +0530, Pradeep Acharya via gstreamer-devel
> wrote:
> > Hi,
> > Any suggestions on reducing CPU utilization would be helpful.
> > Regards
> > Pradeep
> >
> > On Thu, Aug 4, 2022 at 7:06 PM Pradeep Acharya
> > <pradeep.acharya1008 at gmail.com> wrote:
> > > Hi,
> > > This is a query about CPU utilization and whether any optimization
> > > can be done to reduce it. I have a media server that connects to
> > > web applications running in the browser. The media server uses the
> > > webrtcbin plugin.
> > > I use a single socket to transmit and receive audio/video. I've
> > > attached the PNG files extracted from the dot file. Below are the
> > > configuration details of the server used:
> > >
> > > Num of CPU cores: 8
> > > Model: Intel(R) Xeon(R) CPU  X5650  @ 2.67GHz
> > > Audio codec: opus
> > > Video codec: VP8
> > > Video encoding by clients: 640x480 @ ~500 kbps
> > >
> > > The server does not decode or encode the RTP; it just forwards it
> > > from one client to another. When I run the top command, I find that
> > > CPU usage is around 14 to 16% for just 2 clients. CPU utilization
> > > keeps climbing as the number of clients connected to the server
> > > increases. For 8 participants it goes beyond 400%, distributed
> > > among the cores. Output of the top command:
> > >
> > > 2329570 root     -11   0 4074924  73708  18920 S   2.0   1.3   0:01.80 queue1:src
> > > 2331060 root     -11   0 4074924  73708  18920 S   1.7   1.3   0:01.45 nicesrc1:src
> > > 2329568 root     -11   0 4074924  73708  18920 S   1.3   1.3   0:05.39 nicesrc0:src
> > > 2329569 root     -11   0 4074924  73708  18920 S   1.3   1.3   0:04.39 queue0:src
> > > 2331061 root     -11   0 4074924  73708  18920 S   1.3   1.3   0:01.38 queue2:src
> > > 2331062 root     -11   0 4074924  73708  18920 S   1.0   1.3   0:01.41 queue3:src
> > > 2329567 root     -11   0 4074924  73708  18920 S   0.7   1.3   0:01.64 rtpsession-rtcp
> > > 2329583 root     -11   0 4074924  73708  18920 S   0.7   1.3   0:00.94 rtpjitterbuffer
> > > 2331058 root     -11   0 4074924  73708  18920 S   0.7   1.3   0:00.39 appsrc_Audio_36
> > > 2331059 root     -11   0 4074924  73708  18920 S   0.7   1.3   0:00.64 rtpsession-rtcp
> > > 2331079 root     -11   0 4074924  73708  18920 S   0.7   1.3   0:00.42 rtpjitterbuffer
> > > 2331080 root     -11   0 4074924  73708  18920 S   0.7   1.3   0:00.38 rtpjitterbuffer
> > > 2329580 root     -11   0 4074924  73708  18920 S   0.3   1.3   0:00.08 timer
> > > 2329582 root     -11   0 4074924  73708  18920 S   0.3   1.3   0:01.42 rtpjitterbuffer
> > > 2329584 root     -11   0 4074924  73708  18920 S   0.3   1.3   0:00.51 rtpjitterbuffer
> > > 2331070 root     -11   0 4074924  73708  18920 S   0.3   1.3   0:00.40 appsrc_Audio_13
> > > 2331084 root     -11   0 4074924  73708  18920 S   0.3   1.3   0:00.20 rtpjitterbuffer
> > >
> > > From the above, I see that the nicesrc plugin takes around 2% of
> > > CPU for one client. I think this plugin receives packets and puts
> > > them into a queue. Should this thread take 2% CPU to receive RTP
> > > packets from a socket and put them into a queue?
> > > Other threads, such as nicesink, rtpjitterbuffer and appsrc, also
> > > consume CPU. Features like do-nack and TWCC are enabled for video.
> > >
> > > 1. Is there any benchmark for the CPU utilization that the
> > > webrtcbin plugin should need for two-way TX and RX of audio and
> > > video RTP packets @ 500 kbps?
> > > 2. Are there any element property changes that would reduce CPU
> > > utilization?
> > > 3. How should I proceed to reduce CPU utilization? Any suggestions
> > > that help me figure this out are welcome.
> > >
> > >
> > > Thanks
> > > Pradeep
>
> --
> Olivier Crête
> olivier.crete at collabora.com
>