[gst-devel] possible optimisations in pad_push mutex handling

Stefan Kost ensonic at hora-obscura.de
Mon Oct 11 13:35:31 CEST 2010


On 11.10.2010 11:19, Wim Taymans wrote:
> On Sun, 2010-10-10 at 21:40 +0300, Marco Ballesio wrote:
>
> ...
>
>   
>> The first positive surprise is that, even after applying the patch,
>> the following command runs properly on both the architectures:
>>
>> time gst-launch --gst-disable-registry-update audiotestsrc
>> num-buffers=60000 blocksize=128 ! "audio/x-raw-int, rate=8000,
>> width=16"  ! audioconvert ! audioconvert ! audioconvert ! fakesink
>>     
> I'm sure it will go faster but it will cause weird refcounting problems
> and crashes when you start linking and unlinking pads during
> dataprocessing, which is not acceptable.
>
> <snip>
>
>   
We should find a way to move the extra complexity to unlinking. Linking
is less of an issue, as an unblocked streaming pad is linked.
Some of the complexity in pad_push/get_range is there to error out
properly if one unlinks without ensuring that dataflow is blocked or
stopped. The way this is currently implemented is safe, but a bit expensive.
>> Some months ago I was studying a way to detect thread boundaries from
>> within a pipeline and then, in case the application "promises" never
>> to interact with it, to choose an optimised path for gst_pad_push
>> (then I dropped the work because of other tasks). Would it be
>> interesting to resume such an approach?
>>     
> Yes, it would be very interesting. I've been looking at how to do this
> for a while now. Unfortunately, most ideas involve making the object
> flags and the signal counters atomic (which is something I don't think
> we can do safely for 0.10).
>
> Maybe the easiest idea is to make an atomic 1 entry cache in
> gst_pad_push() that contains the (reffed) peer pad. With the other flag
> checks and buffer signals atomic, you can avoid taking and releasing two
> locks and 2 atomic refcounts at the expense of 6 (hopefully more simple)
> atomic operations.
>
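To make the one-entry cache idea concrete, here is a rough sketch in plain C11 atomics. It is not GStreamer code: Peer, SrcPad, cache_fill(), pad_unlink() and pad_push_fast() are hypothetical names, the "chain function" is just a counter, and the locked slow path is left out. The cache slot owns one reference; the pusher borrows it by swapping the slot to NULL and puts it back afterwards, and unlink bumps a generation counter so that a stale entry is never left behind.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted peer pad (not the GstPad layout). */
typedef struct {
    atomic_int refcount;
    int buffers_received;        /* stand-in for the peer's chain function */
} Peer;

/* Hypothetical source pad: one cache slot that owns one peer reference. */
typedef struct {
    _Atomic(Peer *) cached_peer; /* NULL when empty or currently borrowed */
    atomic_uint link_generation; /* bumped on every unlink */
} SrcPad;

static void peer_init(Peer *p) {
    atomic_init(&p->refcount, 1);
    p->buffers_received = 0;
}

static void src_pad_init(SrcPad *pad) {
    atomic_init(&pad->cached_peer, (Peer *)NULL);
    atomic_init(&pad->link_generation, 0u);
}

static void peer_ref(Peer *p)   { atomic_fetch_add(&p->refcount, 1); }
/* A real unref would free the object when the count reaches zero. */
static void peer_unref(Peer *p) { atomic_fetch_sub(&p->refcount, 1); }

/* Empty the slot and release the reference it owned, if any. */
static void drop_cache(SrcPad *pad) {
    Peer *p = atomic_exchange(&pad->cached_peer, (Peer *)NULL);
    if (p) peer_unref(p);
}

/* Assumed to be called from the (omitted) locked slow path or on link. */
static void cache_fill(SrcPad *pad, Peer *peer) {
    peer_ref(peer);
    Peer *old = atomic_exchange(&pad->cached_peer, peer);
    if (old) peer_unref(old);
}

/* Unlink: bump the generation first, then empty the slot. */
static void pad_unlink(SrcPad *pad) {
    atomic_fetch_add(&pad->link_generation, 1u);
    drop_cache(pad);
}

/* Returns 1 if the buffer went through the fast path, 0 if the caller
 * has to fall back to the locked path. */
static int pad_push_fast(SrcPad *pad) {
    unsigned gen = atomic_load(&pad->link_generation);
    Peer *peer = atomic_exchange(&pad->cached_peer, (Peer *)NULL);
    if (!peer)
        return 0;                /* empty, or borrowed by another thread */

    peer->buffers_received++;    /* "push" while holding the borrowed ref */

    Peer *expected = NULL;
    if (!atomic_compare_exchange_strong(&pad->cached_peer, &expected, peer))
        peer_unref(peer);        /* slot was refilled meanwhile: drop ours */
    else if (atomic_load(&pad->link_generation) != gen)
        drop_cache(pad);         /* unlinked during the push: no stale entry */
    return 1;
}
```

In this sketch the fast path costs four atomic operations and no lock. Dropping the entry is always safe because an empty slot only means "take the slow path". Note that a buffer already in flight can still reach a peer that was unlinked during the push; the borrowed reference only guarantees that the peer stays alive for that call.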
> BTW, before you go down this route, I'm sure you can find a 2%
> performance increase elsewhere, like making the recursive locks in glib
> use native pthread recursive locks instead of the 7 method calls that it
> does now. Or maybe add less contention to the glib type lookups.. Or
> maybe improve some of the element processing functions.
>
>   
Yes, there are many places, but an improvement in pad_push helps all
over the place :)
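
For reference, this is roughly what a native recursive lock looks like with POSIX threads. It is only a sketch of the pthread side and does not reproduce what glib does internally; rec_mutex_init() and rec_mutex_nested_locks() are hypothetical helper names.

```c
/* Needed on some libcs so that PTHREAD_MUTEX_RECURSIVE is visible
 * under strict -std=c99/c11. */
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>

/* Hypothetical helper: initialise a natively recursive mutex.
 * Returns 0 on success, a pthread error code otherwise. */
static int rec_mutex_init(pthread_mutex_t *m) {
    pthread_mutexattr_t attr;
    int err = pthread_mutexattr_init(&attr);
    if (err)
        return err;
    err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (!err)
        err = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return err;
}

/* Hypothetical helper: lock the mutex `depth` times from the calling
 * thread, then unlock it as many times as it was locked.
 * Returns the number of lock calls that succeeded. */
static int rec_mutex_nested_locks(pthread_mutex_t *m, int depth) {
    int taken = 0;
    for (int i = 0; i < depth; i++)
        if (pthread_mutex_lock(m) == 0)
            taken++;
    for (int i = 0; i < taken; i++)
        pthread_mutex_unlock(m);
    return taken;
}
```

With a recursive mutex type the owner tracking and the lock depth are handled inside the pthread implementation, so each lock/unlock is a single library call from the caller's point of view.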

Stefan

>> P.S. it appears the penalty decreases when buffers have bigger sizes,
>> as already shown by Felipe.
>>     
> That's not what that test showed, it showed that the more buffers you
> push per second, the more CPU you consume, which is rather obvious.
>
>   
>> Regards




