[gst-devel] On the plugin cache

Simon Holm Thøgersen odie at cs.aau.dk
Tue Nov 11 16:13:31 CET 2008


[ CC Sebastian Dröge, who committed the crc code. Sebastian, please see
  the bottom of this mail. ]

On Mon, 2008-11-10 at 18:54 +0100, Behdad Esfahbod wrote:
> Jan Schmidt wrote:
> > On Sat, 2008-11-08 at 10:25 +0100, Simon Holm Thøgersen wrote:
> >> [ Resending as this one didn't seem to reach the list. ]
> >>
> >> On Wed, 2008-11-05 at 21:51 -0500, Behdad Esfahbod wrote:
> >>> So how should the cache work?  By comparing the timestamp of each plugin dir
> >>> to the recorded timestamp of that dir in the cache.  One must compare
> >>> timestamps for equality, not for being more recent, as that is prone to
> >>> false negatives from clock skew.
> >> I completely agree with you Behdad that excessive work is being done
> >> when statting files and not just dirs. However, even with the current
> >> design it is not where most of the time is spent.
> >>
> >> The following is a profile of my laptop (Intel Pentium-m @1.5GHz)
> >> running 'gst-launch-0.10 --gst-disable-registry-fork' with 157 plugins
> >> present:
> >>
> >> total 22 ms
> >>   linking libs 4.1 ms
> >>   gst_init 17.9 ms
> >>     misc 3.3 ms
> >>     loading registry.i686.bin 12.4 ms
> >>       crc 1.5 ms
> >>       creating elements 10.9 ms
> >>     statting 2.2 ms
> > 
> > That raises an interesting point - it wasn't clear from Behdad's email
> > if his system is using the (new) binary registry cache format, or the
> > (old and slower) xml registry.
> 
> I've been testing with Fedora Rawhide which has the binary cache.  Something
> around gstreamer-0.10.21-1.fc10.i386.  Maybe a bit older.
> 
> > I see something like those timings here on my machine, including the
> > 2.2-ish ms to stat things, on a machine with 174 plugins, 802 features.
> > (2.33 GHz Core 2 Duo)
> 
> Sure, it may not be the stats that are taking the time but the objects you
> build from them.  It is still true that if you avoid checking on plugins on
> each startup, the cost disappears.
> 
Well, I did the profiling now, and it is the building of objects that
takes more than 55% of the time in gst_registry_binary_read_cache. I'm
sorry to say that my remark about a patch with a 75% reduction was
completely wrong.

That is not to say that such a speedup isn't possible, though. The
problem is that in order for gst_element_factory_make etc. to work we
must know the names and types of all plugin features. Right now the
registry builds all the features up front and puts them on a list that
can be filtered, but there's really nothing to prevent using a simple
index of names and types and creating the objects lazily.
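
To make the lazy idea concrete, here is a minimal sketch in Python rather
than GStreamer's C. It is not the registry's actual code: FeatureEntry,
LazyRegistry, the build callback and the plugin file names are all
hypothetical, made up for illustration. The index holds only names and
types; an object is built the first time someone asks for it by name.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional


@dataclass(frozen=True)
class FeatureEntry:
    """Cheap index record: just enough to answer name/type queries."""
    name: str
    kind: str         # illustrative values such as "element" or "typefind"
    plugin_file: str  # hypothetical path of the plugin providing the feature


class LazyRegistry:
    """Hypothetical registry that defers object construction until lookup."""

    def __init__(self, entries: Iterable[FeatureEntry],
                 build: Callable[[FeatureEntry], object]) -> None:
        # Building the index is cheap: no feature objects are created here.
        self._index: Dict[str, FeatureEntry] = {e.name: e for e in entries}
        self._build = build
        self._objects: Dict[str, object] = {}

    def names_of_kind(self, kind: str) -> List[str]:
        """Filter by type using the index only; nothing is built."""
        return sorted(n for n, e in self._index.items() if e.kind == kind)

    def lookup(self, name: str) -> Optional[object]:
        """Return the feature object, building it on first use."""
        entry = self._index.get(name)
        if entry is None:
            return None
        if name not in self._objects:
            self._objects[name] = self._build(entry)
        return self._objects[name]

    def built_count(self) -> int:
        """Number of feature objects actually constructed so far."""
        return len(self._objects)
```

With this shape, startup cost scales with the size of the index, and the
object-building cost is paid only for the features an application uses.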

This issue is completely orthogonal to not statting anything but
directories on startup.
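
The directory-only check Behdad proposed could look roughly like the
following Python sketch. The function names are hypothetical and this is
not what the registry does today. Timestamps are compared for equality, so
a directory whose mtime moved backwards is also treated as changed. One
known limitation: a directory's mtime changes when entries are added,
removed or renamed, not when an existing file is rewritten in place.

```python
import os
from typing import Dict, Iterable, List


def snapshot_dirs(dirs: Iterable[str]) -> Dict[str, int]:
    """Record each plugin directory's mtime (in nanoseconds) for the cache."""
    return {d: os.stat(d).st_mtime_ns for d in dirs}


def changed_dirs(recorded: Dict[str, int]) -> List[str]:
    """Return directories whose mtime differs from the recorded value.

    The comparison is for equality, not "newer than", so clock skew in
    either direction still triggers a rescan. A directory that has
    disappeared also counts as changed.
    """
    changed = []
    for path, old_mtime in recorded.items():
        try:
            current = os.stat(path).st_mtime_ns
        except FileNotFoundError:
            changed.append(path)
            continue
        if current != old_mtime:
            changed.append(path)
    return changed
```

Only the directories returned by changed_dirs() would need their plugins
re-examined; everything else costs one stat per directory.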

I'll volunteer to file the bugs and write the patches for both unless
someone gives a convincing argument not to.

> >> The value of doing the crc check seems pretty dubious to me btw; if
> >> you've got disk corruptions there are plenty of other ways your system
> >> could malfunction. It should be pretty easy to make it optional at load
> >> time though.
> > 
> > I'm not sure what the rationale was for adding a CRC to the binary
> > registry format originally. It might have been done as a sanity check to
> > detect partially-written registry files, so they can be rebuilt without
> > crashing every GStreamer app.
> 
> Yeah, the CRC check is totally bogus.  The cache must be built in a temp file
> first, then moved to its final place.  Assuming a journaling filesystem,
> there's no way you end up with a partially-written file.  The only way the CRC
> can fail is disk failure or manual modification of the file.  Neither one is
> particularly interesting.  Imagine what would happen if we did the same to
> shared objects, fonts, icons, and any other kind of binary data :).
> 
Sebastian, can you tell us the rationale for the crc check or should we just
file a bug report for the removal? I'm volunteering for this as well.
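
For reference, the write-to-a-temp-file-then-rename scheme Behdad
describes can be sketched as follows, in Python for brevity. The function
name and the temp-file prefix are hypothetical, and whether the fsync is
needed depends on the filesystem guarantees one is willing to assume. On
POSIX the rename replaces the destination atomically, so readers see
either the old cache or the complete new one.

```python
import os
import tempfile


def write_cache_atomically(path: str, data: bytes) -> None:
    """Write data to path so readers never observe a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory (same filesystem) as
    # the destination, otherwise the final rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".registry-",
                                    suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the contents to disk before renaming
        os.replace(tmp_path, path)  # atomic replacement on POSIX
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```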


Simon Holm Thøgersen




