[gst-devel] On the plugin cache

Behdad Esfahbod behdad at behdad.org
Thu Nov 6 03:51:33 CET 2008


Hi,

So, I blamed the gst plugin cache in my blog entry [1] for taking some 35ms on
every gst-using app startup.  It's only fair that I follow up with how I think
this should be fixed.

From what I understand from a quick look through the code, here is how the
startup cache check currently works:

  1) Fork

  2) In the child, stat all plugin dirs and all plugins in them, recursively

  3) If any plugins are newer than the binary cache, rebuild the cache

  4) In the parent, wait for child to finish.  If child failed to finish
cleanly, repeat 2 and 3 in the parent.


Lemme make this clear: yes, I know there's an option to disable forking.  But
that's beside the point.  I'm interested in how this thing normally works, and
should work.  The rationale for rebuilding the cache in a forked child is legit:

  a) avoid crashing the parent,

  b) avoid polluting the parent process with lots of libraries.


Anyway, there are multiple problems with that scheme that I see.  In no
particular order:

  P1) Plain fork() is not clean.  Use g_spawn instead.  This has to do with
how SIGCHLD is interpreted, etc.  As I understand, gst may generate a
SIGCHLD that may then trigger a handler installed by the user, using
glib or directly.  To handle the direct case, saving and restoring the SIGCHLD
handler may be needed.  I think handling SIGPIPE may be needed too.  Easy fix.

  P2) Stat'ing plugins is safe in the parent and need not happen in a forked
child.  That obviates the need to fork in the common case.

  P3) *If* the child failed to update the registry (say, it crashed because of
a bad plugin), then the parent goes ahead and tries the same thing in the
parent process, expecting a different result!  That's plain wrong and almost
surely will crash the parent (heck, vuntz was facing this very same issue
today in gnome-settings-daemon).  If child fails, parent should simply print a
warning and proceed with using the old cache.

  P4) This is the main problem: the whole purpose of the cache should be to let
us avoid scanning all plugins on each startup.
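
To make P2 and P3 concrete, here is a rough sketch in Python rather than gst's C.  It is not the gst code: cache_is_stale, ensure_cache and rebuild_cmd are made-up names, and the "newer than the cache" test simply mirrors step 3 of the current scheme.  The parent only calls stat, spawns a helper only when something looks stale, and on failure warns and keeps the old cache instead of retrying in-process.

```python
import os
import subprocess
import sys


def _mtime(path):
    """Return the mtime of path, or None if it vanished or is a dangling link."""
    try:
        return os.stat(path).st_mtime
    except OSError:
        return None


def cache_is_stale(cache_path, plugin_dirs):
    """Stat-only check done in the parent; it loads no plugin code (P2)."""
    cache_mtime = _mtime(cache_path)
    if cache_mtime is None:
        return True  # no cache yet
    for top in plugin_dirs:
        for root, _dirs, files in os.walk(top):
            for path in [root] + [os.path.join(root, f) for f in files]:
                mtime = _mtime(path)
                # "Newer than" mirrors the current scheme, not the proposal.
                if mtime is not None and mtime > cache_mtime:
                    return True
    return False


def ensure_cache(cache_path, plugin_dirs, rebuild_cmd):
    """Return True if the cache is usable as-is or was rebuilt successfully."""
    if not cache_is_stale(cache_path, plugin_dirs):
        return True  # common case: no child process at all
    try:
        # rebuild_cmd is a hypothetical helper program that rewrites the cache.
        status = subprocess.run(rebuild_cmd).returncode
    except OSError:
        status = -1
    if status != 0:
        # P3: do not repeat the scan in the parent; warn and keep the old cache.
        print("warning: plugin cache rebuild failed; using old cache",
              file=sys.stderr)
        return False
    return True
```

A caller would proceed with whatever cache is on disk regardless of the return value; False only means that cache may be out of date.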

Let's look into why the scan is needed right now.  Let me also note that the
case at hand is *exactly* the same as the one we face in fontconfig with fonts.

So how should the cache work?  By comparing the timestamp of each plugin dir
to the recorded timestamp of that dir in the cache.  One must compare
timestamps for equality, not for being more recent, as the latter is prone to
false negatives under clock skew.
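
A minimal sketch of the equality check, again in Python for brevity.  The names dir_stamp and dir_changed are made up for illustration, and where the recorded stamp is stored is left open.

```python
import os


def dir_stamp(path):
    """Modification time of a directory, in integer nanoseconds."""
    return os.stat(path).st_mtime_ns


def dir_changed(path, recorded_stamp):
    """True if the directory's mtime differs at all from the recorded one.

    A "newer than" test would miss a directory whose mtime ends up earlier
    than the recorded value, for example after the clock was set back.
    """
    return dir_stamp(path) != recorded_stamp
```

With a recorded stamp of 2000s and a directory mtime of 1000s, dir_changed reports a change, whereas a "more recent than" comparison would report the cache as fresh.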

Also note that when I say all plugin dirs, that includes any recursively found
directories.  For the record, there are two schools of thought about how to
handle recursive directories:

  - Fully automatic: like fontconfig.  Record and check timestamp for all
directories found recursively.

  - Half automatic: like gtk-icon-cache.  Only record and check timestamp for
toplevel directories.  Requires every plugin install to also touch the
toplevel plugin directory.
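
The two approaches differ only in which directories get recorded.  A small sketch, with made-up function names and assuming the recorded stamps are kept as a simple path-to-mtime mapping:

```python
import os


def snapshot_recursive(top_dirs):
    """Fully automatic: record the mtime of every directory found recursively."""
    stamps = {}
    for top in top_dirs:
        for root, _dirs, _files in os.walk(top):
            stamps[root] = os.stat(root).st_mtime_ns
    return stamps


def snapshot_toplevel(top_dirs):
    """Half automatic: record only the toplevel directories."""
    return {top: os.stat(top).st_mtime_ns for top in top_dirs}


def is_up_to_date(recorded, current):
    """Equality on the whole mapping also catches added or removed directories."""
    return recorded == current
```

In the half-automatic variant, a plugin dropped into a subdirectory goes unnoticed until something touches the toplevel directory, which is exactly the extra install step mentioned above.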


So why does just checking the timestamp of all plugin dirs work?  Because:

  - File and directory additions, moves, and deletions change the timestamp of
their parent directory, hence are detected by our code.

  - In-place file copies and other modifications are NOT detected.  BUT, such
things are not allowed anyway:  If you modify a file mmapped by another
process, you are going to crash the other process that is using the file.
Installs should always be done by the (fortunately, atomic) move operation,
not copy.
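
For illustration, an install done as "write elsewhere, then move" could look like the sketch below.  It assumes a POSIX filesystem where rename within one directory is atomic; install_atomically and the ".install-" temp prefix are made up, and a real scanner would need to ignore such temp files.

```python
import os
import shutil
import tempfile


def install_atomically(src, dest_dir):
    """Copy src into dest_dir under a temp name, then rename it into place."""
    dest = os.path.join(dest_dir, os.path.basename(src))
    # The temp file lives in dest_dir so the final rename stays on one
    # filesystem, which is what makes it atomic.
    fd, tmp = tempfile.mkstemp(dir=dest_dir, prefix=".install-")
    try:
        with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
            shutil.copyfileobj(inp, out)
            out.flush()
            os.fsync(out.fileno())
        shutil.copymode(src, tmp)  # mkstemp creates the file with mode 0600
        # Readers see either the old file or the complete new one, never a
        # half-written file; the old inode is never modified in place.
        os.replace(tmp, dest)
    except BaseException:
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
    return dest
```

Both creating the temp file and the final rename update the directory's timestamp, so the install is picked up by the directory check.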

Also worth mentioning is that dumping the old cache and plugins is safe with
respect to other processes: A file is not deleted from disk as long as some
process holds an open mmap on it.
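
This is easy to check on a POSIX system (it does not hold on Windows, where deleting an open file fails).  A tiny demonstration, with a made-up function name:

```python
import mmap
import os


def read_after_unlink(path):
    """Map a non-empty file, delete its name, then read through the mapping."""
    with open(path, "rb") as f:
        mapping = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    os.unlink(path)  # the directory entry is gone...
    try:
        return bytes(mapping[:])  # ...but the mapped data is still readable
    finally:
        mapping.close()  # the disk space is released once the mapping is gone
```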

So, that's it.  It should all work by just stat'ing directories, not files.  In
the case of fontconfig, it actually keeps the cache for each directory's fonts
in a separate cache file.  That's a tradeoff: more cache files, but cheaper
regeneration.  gst may want to keep separate cache files for system and user
plugins too.  That means, distro packages installing plugins can update the
system-wide cache once and each user does not have to do that.  Users would
need to update cache only for plugins they install in their home dir, or if
the system-wide cache is outdated (which is a distro bug if it is).


Regards,

behdad


[1] http://mces.blogspot.com/2008/10/improving-login-time-part-1-gnome.html





