Pipeline Optimization

Matt Clark mclark4386 at gmail.com
Wed Apr 20 08:38:41 UTC 2022


On Tue, Apr 19, 2022 at 9:45 AM Nicolas Dufresne <nicolas at ndufresne.ca>
wrote:

> On Tuesday, April 19, 2022 at 01:20 -0500, Matt Clark wrote:
> > Thanks for the tip! I was wondering about those, but like I said, bit of a
> > newbie. Here is my new graph:
> > debug_session(sans extra converts).png
> > Seems to be working still, which is always a good thing!
> > I'm all for more CPU optimizations (it's still spinning up almost 40 threads,
>
> I count 9 streaming threads (threads induced by the pipeline design):
>
> - 3 threads for the 3 appsrc elements
> - 3 leaky queues
> - 1 compositor
> - 1 queue before hlssink (misplaced, by the way; it should be right after the tee)
> - 1 queue inside hlssink
>
> GIO will of course add a couple more threads, and some stalled threads will
> appear since all of this uses thread pools. But threads that are never woken
> up are not a problem; each will just use a bit of RAM (~2M, but it depends
> on the OS). Overall, the thread situation does not seem dramatic. If the
> compositor could be leaky, that would save you 3 threads.
>

Thank you for the explanation! In this case, what would a leaky compositor
look like in action? Would it basically just drop frames and thus hold the
same image, or would segments drop out and blink? If it will hold the image
through a leak, that's completely fine. I just need it to be consistent until
the appsrc needs to generate a new frame, which at the moment is only the
initial frame; in the future we plan to have animations as well as still
frames. Not sure if that changes anything in your mind?
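(For context, a leaky queue drops buffers instead of blocking when full. A minimal sketch of what that looks like in gst-launch syntax is below; the source, sink, and sizes are illustrative placeholders, not taken from the actual pipeline in the graph.)

```shell
# Hypothetical sketch: a small, leaky queue feeding a compositor pad.
# leaky=downstream drops the oldest queued buffer when the queue is full,
# so the compositor simply keeps showing the last frame it received
# rather than stalling the upstream branch.
gst-launch-1.0 \
  videotestsrc is-live=true \
  ! queue leaky=downstream max-size-buffers=2 max-size-bytes=0 max-size-time=0 \
  ! compositor name=comp \
  ! videoconvert ! autovideosink
```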


>
> Memory-wise you can do better for sure. All the queues can be configured
> with a smaller maximum size; most of them are set to the defaults from what
> I see. You can also work on your encoder configuration. At the moment it
> will gather around 32 frames for observation and compression optimization.
> This likely gives great quality, but might be overkill. Be aware that
> appsrc also has an internal queue whose capacity can be configured.
> Configuring queue capacity greatly improves the memory usage.
>
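(A rough sketch of capping the queues and the appsrc internal queue. The property names are real GStreamer properties, but the values are illustrative guesses, not tuned recommendations for this pipeline.)

```shell
# Hypothetical sketch: cap each queue at 2 buffers and bound the appsrc
# internal queue by bytes. Setting max-size-bytes=0 and max-size-time=0
# disables the other two limits so only the buffer count applies.
gst-launch-1.0 \
  appsrc name=src max-bytes=2000000 block=true \
  ! queue max-size-buffers=2 max-size-bytes=0 max-size-time=0 \
  ! videoconvert ! autovideosink
```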

Here is the new graph with how I understand some of your tweaks (please
correct me if I'm off somewhere!).
[image: debug_session (new).png]
Unfortunately, the memory and CPU footprints seem to be the same, even after
taking all of the queues down to 2 buffers max. I need the final image
quality to be as crisp as possible so that it's easy for people with vision
issues to read, but I don't know enough about the encoder "knobs" to know
which would be best to tweak for this. I've been reading through the plugin
documentation for the encoder and the queues, and while I feel like I have an
OK grasp of the queues, I have very little grasp of the encoder, so any
pointers there would be SUPER helpful.
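(Assuming the encoder in the graph is x264enc, which the thread does not actually confirm, the ~32-frame gathering and the quality/memory trade-off map roughly to the properties below. All values here are illustrative starting points, not recommendations.)

```shell
# Hypothetical sketch, assuming x264enc. rc-lookahead controls how many
# frames the encoder buffers for analysis (the ~32-frame gathering);
# lowering it trades some compression efficiency for memory.
# tune=zerolatency disables lookahead and B-frames entirely for
# low-delay streaming.
gst-launch-1.0 \
  videotestsrc num-buffers=300 \
  ! videoconvert \
  ! x264enc rc-lookahead=10 bframes=0 key-int-max=30 \
      speed-preset=fast tune=zerolatency \
  ! h264parse ! mpegtsmux ! hlssink
```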

Thank you so, so much for all the help you are giving! While I'm not where
I want to be yet, I feel like I'm starting to understand the system much
better and in the long run that will probably be even more valuable!^^


>
>
>
> > not sure if that's a lot or normal for this, honestly), but I would also
> > love some memory optimizations as well! After those changes each stream
> > is taking up about 850M of RAM while running. Again, this may be normal
> > for the task, but a) it seems like a lot to me and b) I have no frame of
> > reference.
> > Thanks again, Nicolas!
> >
> > On Mon, Apr 18, 2022 at 8:14 AM Nicolas Dufresne <nicolas at ndufresne.ca> wrote:
> > > On Sunday, April 17, 2022 at 03:29 -0500, Matt Clark via gstreamer-devel wrote:
> > > > I've gotten my project mostly to the stable point of working how I
> > > > expect it, however I can't help but feel that it's nowhere near
> > > > optimal. I have made it work and now I wish to make it right. Any
> > > > insight, be it pointers or instructions, would be appreciated, as
> > > > this is my first service/application using gstreamer and I'm still
> > > > very green with it.
> > >
> > > Just noticed one low-hanging fruit from the graph. You have 3 color
> > > conversion points: one before imagefreeze, one after, and one inside
> > > the compositor. The output of the compositor is I420, so you can
> > > greatly optimize your pipeline by adding a caps filter to force
> > > conversion before the image freeze. This way, you convert the input to
> > > I420 only once. Other similar optimizations related to the usage of
> > > imagefreeze are possible.
> > >
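(A sketch of the caps-filter idea: forcing I420 once, before imagefreeze, so that later conversion elements become pass-through. The source elements and filename here are illustrative placeholders, not copied from the actual graph.)

```shell
# Hypothetical sketch: convert to I420 once, up front, via a caps filter.
# Downstream videoconvert elements then have nothing left to do and pass
# buffers through unchanged.
gst-launch-1.0 \
  filesrc location=frame.png ! pngdec \
  ! videoconvert ! video/x-raw,format=I420 \
  ! imagefreeze ! compositor \
  ! videoconvert ! autovideosink
```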
> > > > The basic explanation of the system is that it queries a variable
> > > > number of web endpoints for dynamically created pngs and then
> > > > composes those together into an HLS stream that's then used by a
> > > > single client.
> > > > Here is a PNG of the pipeline graph (I'll also attach the raw SVG as
> > > > well in case you want to dig into it):
> > > > debug_session.png
> > > >
> > > > TL;DR: Above is my pipeline, please help me make it the best it can
> > > > be! Thanks to any and all in advance!
> > > > -Matt
> > >
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: debug_session (new).png
Type: image/png
Size: 544140 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/gstreamer-devel/attachments/20220420/ea694367/attachment-0001.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: debug_session (new).svg
Type: image/svg+xml
Size: 120055 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/gstreamer-devel/attachments/20220420/ea694367/attachment-0001.svg>

