[gst-devel] How to decrease CPU consumation for audio recording?
Felipe Contreras
felipe.contreras at gmail.com
Thu Oct 7 17:56:50 CEST 2010
On Thu, Oct 7, 2010 at 1:15 PM, Wim Taymans <wim.taymans at gmail.com> wrote:
> On Thu, 2010-10-07 at 03:08 +0300, Felipe Contreras wrote:
>> On Tue, Jan 5, 2010 at 12:40 PM, Felipe Contreras
>> <felipe.contreras at gmail.com> wrote:
>> > On Thu, Dec 24, 2009 at 7:52 PM, Wim Taymans <wim.taymans at gmail.com> wrote:
>> >> On Thu, 2009-12-24 at 19:13 +0200, Felipe Contreras wrote:
>> >>>
>> >>> GStreamer is not good at handling very small buffers.
>> >>
>> >> What do you mean with this?
>> >
>> > I mean what I said: the smaller the buffers, the worse GStreamer
>> > handles them. My gut feeling is that performance would deteriorate in
>> > an exponential manner, and would be more noticeable on embedded
>> > platforms, especially with a single core.
>> >
>> >> What do you define as a small buffer? How is
>> >> it not good?
>> >
>> > Huh? I would need to write a test application that measures
>> > performance passing buffers of different sizes along multiple thread
>> > contexts and plot the result in order to define that.
>>
>> There you go:
>> http://felipec.wordpress.com/2010/10/07/gstreamer-embedded-and-low-latency-are-a-bad-combination/
>>
>> Is it clear now that GStreamer is bad at handling very small buffers?
>
> Not really. What you are trying to say is that when you push more
> buffers per second, CPU consumption is higher. That's expected but not
> necessarily as bad as those overly dramatic graphs suggest.

My claim was that GStreamer is bad at handling small buffers: the
smaller, the worse. That IMO is a fact. Now, how small, and how bad
GStreamer is, depends on your system; my guess was that ARM would be
especially bad compared to x86. I think the numbers show that.

My "overly dramatic" graphs show the raw data for the most minimal
example I could find, so no matter what you do, you'll get _at
least_ that performance hit. In real use-cases (in the graph after
2^7), IMO the performance loss is already bad, but you have to
multiply that by the number of different elements and thread contexts
that are used. However, the empirical experience is already there, ask
anyone at Nokia; I just wanted to show raw numbers.

> It sounds like when you mean size, you really mean duration and thus the
> amount of buffers per second.
>
> GStreamer is not designed to pass around 1 sample per buffer (that would
> be typically 48000 buffers per second), you can do it but it will incur
> a higher overhead that increases with the amount of elements in the
> pipeline.
>
> GStreamer is however designed for more realistic buffer durations of
> 10ms (that's 100 buffers per second). The overhead that GStreamer causes
> in these types of pipelines depends on a lot of things, but in well
> designed pipelines you typically see overhead values of around 1% or
> less (callgrind and kcachegrind are good tools to measure this).

On the Nokia N900 we saw that the performance hit from pushing 10ms
buffers from one thread context to another was around 5% of the CPU.
I think that's _bad_, you might disagree.

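To make the numbers in this exchange concrete, here is a small Python sketch of the arithmetic. It assumes 48 kHz audio (implied by "1 sample per buffer" giving 48000 buffers per second) and that the 5% figure applies to 10ms buffers, i.e. 100 buffers per second. The function names are made up for illustration and are not part of any GStreamer API.

```python
# Illustrative arithmetic only; function names are hypothetical.

def samples_per_buffer(sample_rate_hz, duration_ms):
    """Number of audio samples in a buffer of the given duration."""
    return sample_rate_hz * duration_ms / 1000.0

def buffers_per_second(sample_rate_hz, samples_in_buffer):
    """How many buffers per second the pipeline has to push."""
    return sample_rate_hz / samples_in_buffer

def cpu_time_per_buffer_us(cpu_fraction, buffer_rate):
    """CPU time spent per buffer, in microseconds, given the fraction
    of one CPU consumed and the buffer rate."""
    return cpu_fraction / buffer_rate * 1e6

RATE = 48000  # assumed sample rate in Hz

# One sample per buffer versus 10ms buffers.
one_sample_rate = buffers_per_second(RATE, 1)
ten_ms_rate = buffers_per_second(RATE, samples_per_buffer(RATE, 10))

print("1 sample/buffer:", one_sample_rate, "buffers/s")
print("10ms buffers:   ", ten_ms_rate, "buffers/s")

# 5% of the CPU at the 10ms buffer rate, expressed per buffer.
print("CPU per buffer: ", cpu_time_per_buffer_us(0.05, ten_ms_rate), "us")
```

Under those assumptions, 5% of the CPU at 100 buffers per second works out to roughly 500 microseconds of CPU time per buffer handed between threads.
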
> Your comments about queue are correct. Queue is really causing a lot of
> contention on mutexes (it is written as a simple FIFO with mutexes). If
> you use very small queue sizes, you practically force the scheduler to
> do a context switch for each buffer. Again, the more buffers per second,
> the more overhead it causes all over the place. For realistic use cases
> of a couple of 100 buffers per second and realistic buffer sizes, this
> should all perform with reasonably small overhead. That said, queue can
> be improved in many ways (add a batch mode, use a lockless queue, ...)
>
> As a datapoint: On my desktop I can push around 700000 buffers per
> second, and that's then using 100% CPU (and also 100% gstreamer
> overhead). (gst-launch fakesrc num-buffers=7000000 silent=1 ! fakesink
> silent=1 takes about 10 seconds).
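
The "batch mode" idea mentioned above can be illustrated with a toy Python sketch. This is not GStreamer's queue element or its API: `CountingLock`, `MutexFifo` and the two drain functions are hypothetical names, and the example only counts how often the consumer takes the mutex when it pops one item at a time versus several per lock acquisition.

```python
import threading
from collections import deque

class CountingLock:
    """Wraps threading.Lock and counts how often it is acquired."""
    def __init__(self):
        self._lock = threading.Lock()
        self.acquisitions = 0

    def __enter__(self):
        self._lock.acquire()
        self.acquisitions += 1
        return self

    def __exit__(self, *exc):
        self._lock.release()
        return False

class MutexFifo:
    """Toy FIFO protected by a single mutex (hypothetical, not GstQueue)."""
    def __init__(self):
        self.lock = CountingLock()
        self.items = deque()

    def push(self, item):
        with self.lock:
            self.items.append(item)

    def pop(self):
        # One lock acquisition per item.
        with self.lock:
            return self.items.popleft()

    def pop_batch(self, max_items):
        # One lock acquisition for up to max_items items.
        with self.lock:
            count = min(max_items, len(self.items))
            return [self.items.popleft() for _ in range(count)]

def drain_one_by_one(n):
    """Push n items, pop them singly; return items and consumer lock count."""
    fifo = MutexFifo()
    for i in range(n):
        fifo.push(i)
    before = fifo.lock.acquisitions
    out = [fifo.pop() for _ in range(n)]
    return out, fifo.lock.acquisitions - before

def drain_batched(n, batch):
    """Push n items, pop them in batches; return items and consumer lock count."""
    fifo = MutexFifo()
    for i in range(n):
        fifo.push(i)
    before = fifo.lock.acquisitions
    out = []
    while len(out) < n:
        out.extend(fifo.pop_batch(batch))
    return out, fifo.lock.acquisitions - before

print("single pops: ", drain_one_by_one(100)[1], "lock acquisitions")
print("batches of 8:", drain_batched(100, 8)[1], "lock acquisitions")
```

The sketch runs in a single thread, so it shows only the reduction in lock acquisitions, not the context-switch behaviour under real contention. Batching also trades latency for throughput, which matters for the low-latency case discussed in this thread.
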
On my laptop:
% gst-launch fakesrc num-buffers=7000000 silent=1 ! fakesink silent=1
22s
% gst-launch fakesrc num-buffers=7000000 silent=1 ! queue ! fakesink silent=1
45s
On my N900:
% gst-launch-0.10 fakesrc num-buffers=7000000 silent=1 ! fakesink silent=1
4m 26s
% gst-launch-0.10 fakesrc num-buffers=7000000 silent=1 ! queue ! fakesink silent=1
16m 11s
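
For easier comparison, the timings above can be converted into throughput and per-buffer cost. The Python sketch below uses only the wall-clock times quoted in this message; the dictionary layout and function names are illustrative, and the per-buffer figure assumes the whole run time is spent pushing buffers.

```python
# Wall-clock times in seconds for 7,000,000 buffers, as quoted above.
BUFFERS = 7_000_000

runs = {
    ("laptop", "no queue"): 22,
    ("laptop", "queue"): 45,
    ("N900", "no queue"): 4 * 60 + 26,
    ("N900", "queue"): 16 * 60 + 11,
}

def buffers_per_second(seconds):
    """Average throughput over the whole run."""
    return BUFFERS / seconds

def microseconds_per_buffer(seconds):
    """Average wall-clock cost of one buffer, in microseconds."""
    return seconds / BUFFERS * 1e6

def queue_slowdown(machine):
    """How many times longer the run takes with a queue in the pipeline."""
    return runs[(machine, "queue")] / runs[(machine, "no queue")]

for (machine, pipeline), seconds in runs.items():
    print(f"{machine:6} {pipeline:8} "
          f"{buffers_per_second(seconds):10.0f} buffers/s "
          f"{microseconds_per_buffer(seconds):7.1f} us/buffer")

for machine in ("laptop", "N900"):
    print(f"{machine}: queue slowdown {queue_slowdown(machine):.2f}x")
```

With these figures, adding a queue roughly doubles the run time on the laptop and makes it about 3.65 times longer on the N900.
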
Cheers.
--
Felipe Contreras