[Clipart] Clip Art Navigator 0.31

Jonadab the Unsightly One jonadab at bright.net
Tue Aug 23 06:46:27 PDT 2005

Greg Steffensen <greg.steffensen at gmail.com> writes:

> The index file currently uses Python's internal serialization format
> (which Python calls "pickling").  I did that mainly for performance;
> it's faster to load the relevant data structures from that frozen
> format than to recreate them by parsing XML.

Ah.  So this is similar to the Perl-centric keywords.idx, which is
written with Data::Dumper.  If there is a significant performance
difference, it may be worth doing it that way...
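
To make the trade-off concrete, here is a minimal Python sketch of the
same small index stored both ways and loaded back.  The index layout
(keyword mapped to a list of file paths), the XML element names, and
all function names are assumptions for illustration; this is not the
actual format of index.xml or of the navigator's index file.

```python
import pickle
import timeit
import xml.etree.ElementTree as ET

# Hypothetical index: keyword -> list of SVG paths (not the real OCAL layout).
index = {
    "animal": ["animals/cat.svg", "animals/dog.svg"],
    "tree": ["plants/oak.svg"],
}

def to_pickle(idx):
    """Serialize the index with Python's pickle format."""
    return pickle.dumps(idx, protocol=pickle.HIGHEST_PROTOCOL)

def from_pickle(data):
    """Load the index back from pickled bytes."""
    return pickle.loads(data)

def to_xml(idx):
    """Serialize the index as a small, made-up XML document."""
    root = ET.Element("index")
    for keyword, paths in idx.items():
        kw = ET.SubElement(root, "keyword", name=keyword)
        for path in paths:
            ET.SubElement(kw, "file").text = path
    return ET.tostring(root)

def from_xml(data):
    """Rebuild the index by parsing the XML produced by to_xml()."""
    root = ET.fromstring(data)
    return {
        kw.get("name"): [f.text for f in kw.findall("file")]
        for kw in root.findall("keyword")
    }

def compare_load_times(idx, repeats=1000):
    """Return (pickle_seconds, xml_seconds) for `repeats` loads of each form."""
    pickled, xml_bytes = to_pickle(idx), to_xml(idx)
    t_pickle = timeit.timeit(lambda: from_pickle(pickled), number=repeats)
    t_xml = timeit.timeit(lambda: from_xml(xml_bytes), number=repeats)
    return t_pickle, t_xml
```

Calling compare_load_times(index) gives a rough feel for the gap on a
given machine.  With an index this small the numbers say little, so a
meaningful comparison would need an index of realistic size.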

> I suspect that the speed difference here is irrelevant, though; both
> are plenty fast in practice.  I'll do some benchmarks tomorrow.


> I saw index.xml, but my original reason for not using it was that I
> wanted users to have the flexibility to add additional content to
> their local clip art store.  In retrospect, the best way to do that
> is to allow them to create their own indexes, but in ocal's XML
> format.  Is there a tool to do this already in the ocal tools
> package?

The code that generates index.xml is in the tools, in the file
clipart-authority-control.pl.  However, that tool also does a number
of other things and is fairly developer-oriented.  Also, it's slow.
Partly that's because it does a lot of work, and partly it's because
I never optimized it at all: we typically run it once or twice a
month, as part of the release cycle, and my workstation spends a lot
of time idle anyway.

Additionally, I _suspect_ (although I haven't profiled it) that a lot
of the slowness is inherent in the fact that it has to read and parse
every SVG file in the collection, which is getting to be rather a lot
of files.  I _suspect_ that it's I/O-bound for most of its runtime.
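
One rough way to test that suspicion is to time the file reads and the
XML parsing separately.  The sketch below is in Python rather than
Perl and is not derived from clipart-authority-control.pl; the
function name and the returned keys are made up for illustration.

```python
import time
import xml.etree.ElementTree as ET
from pathlib import Path

def time_read_vs_parse(root_dir):
    """Walk root_dir and total the time spent reading vs. parsing SVG files."""
    read_s = 0.0
    parse_s = 0.0
    count = 0
    for path in sorted(Path(root_dir).rglob("*.svg")):
        t0 = time.perf_counter()
        data = path.read_bytes()      # I/O portion
        t1 = time.perf_counter()
        try:
            ET.fromstring(data)       # parsing (CPU) portion
        except ET.ParseError:
            pass                      # malformed files still count toward the timings
        t2 = time.perf_counter()
        read_s += t1 - t0
        parse_s += t2 - t1
        count += 1
    return {"files": count, "read_s": read_s, "parse_s": parse_s}
```

Note that the operating system's file cache makes a second run read
much faster than the first, so only a cold-cache run says much about
whether the job is I/O-bound.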

> Letting the packagers do the indexing also has the advantage of not
> using Python's XML parser (expat), which was unable to parse around
> 30 of the 0.16 images.

I suspect that SVGscan (see other thread) probably notices this.
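
For anyone who wants to see which files expat rejects and why, a
minimal sketch using Python's standard xml.parsers.expat module
follows.  It is not a description of what SVGscan does; the function
name and the shape of the result are assumptions for illustration.

```python
import xml.parsers.expat
from pathlib import Path

def find_unparseable(root_dir):
    """Return (path, line, message) for each SVG under root_dir that expat rejects."""
    failures = []
    for path in sorted(Path(root_dir).rglob("*.svg")):
        # An expat parser object can only be used for one document.
        parser = xml.parsers.expat.ParserCreate()
        try:
            parser.Parse(path.read_bytes(), True)  # True marks the final chunk
        except xml.parsers.expat.ExpatError as err:
            message = xml.parsers.expat.ErrorString(err.code)
            failures.append((path, err.lineno, message))
    return failures
```

Running it over a local copy of the collection and printing each
tuple gives the file, the line where parsing stopped, and expat's own
error text.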

Open Clip Art Library:  Drawing Together
