[Clipart] REPOST: openclipart packages and the idx file

Nathan Eady eady at galion.lib.oh.us
Wed Apr 13 14:38:53 PDT 2005

Mike Traum wrote:

 > I haven't heard anything back about this yet.
 > I think it would make a lot of sense to have a well-structured index
 > file in your packages so that other applications can be more easily
 > written to take advantage of your efforts and improve the user
 > experience when dealing with openclipart.
 >> With regards to the keywords.idx file in the package, can you start
 >> making that an xml file? It's a very Perl-centric file, which
 >> makes it difficult for others (like myself) to write tools that use
 >> your packages.

Okay, I'll dewarnock it.

The existing keywords.idx file was _mainly_ intended to be used by the 
keyword search tool (though of course it may have other uses).  It 
exists largely because it was very easy to create.  The code that writes 
it is about two lines long.  (That is just writing the index file; the 
data are collected as part of a larger metadata processing operation 
called authority control.)  It is also very practical for the keyword 
search tool, because, again, the code that reads it is one line long.
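
As a rough analogy only (the actual Perl code and the on-disk format of keywords.idx are not shown in this message), the sketch below uses Python's pickle module to show why a language-native dump is so cheap to write and read, and also why it is awkward for tools written in other languages. The mapping contents and the file name are made up for illustration.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical keyword -> image paths mapping; the real index contents
# are an assumption, not taken from the openclipart packages.
index = {
    "animal": ["animals/cat.svg", "animals/dog.svg"],
    "tree": ["plants/oak.svg"],
}

path = Path(tempfile.mkdtemp()) / "keywords.pickle"

# Writing the index: two lines.
with open(path, "wb") as fh:
    pickle.dump(index, fh)

# Reading the index: one line. Only Python can conveniently do this,
# which is the same portability problem raised in the quoted request.
loaded = pickle.loads(path.read_bytes())
```

The convenience comes from letting the language serialize its own data structure, and that is precisely what ties the file to one language.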

I can see the value of having an XML index, but should it index just the 
keywords, or also authors, titles, and other metadata?  Also, it should 
probably be named with an .xml suffix, something like index.xml or 
similar.  Anyone is welcome to write a tool that creates such an index, 
and presumably if the tool existed we would roll it into the release 
procedure so that the release packages would include the index.  If such 
a tool were written in Perl, it could just read the existing index in 
one line and then walk the data structure; if it were written in another 
language, it would have to duplicate some of what the authority control 
script does, in walking the actual collection and reading the actual 
metadata from the images -- which means also duplicating SVG::Metadata. 
In other words, it would be significantly easier to write it in Perl.
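
To make the idea of an XML index concrete, here is a minimal sketch of what such a tool might emit, assuming the keyword data are already available in memory as a mapping from keyword to file paths. The element names (index, keyword, file), the function name, and the sample data are all hypothetical; no schema has been agreed on, and a real index might also carry authors and titles.

```python
import xml.etree.ElementTree as ET


def build_index_xml(keyword_to_files):
    """Serialize a keyword -> list-of-paths mapping as an XML string.

    The <index>/<keyword>/<file> layout is an assumption made for this
    sketch, not an agreed openclipart format.
    """
    root = ET.Element("index")
    for keyword in sorted(keyword_to_files):
        kw_elem = ET.SubElement(root, "keyword", name=keyword)
        for file_path in sorted(keyword_to_files[keyword]):
            ET.SubElement(kw_elem, "file").text = file_path
    return ET.tostring(root, encoding="unicode")


# Made-up sample data standing in for the real keyword index.
sample = {
    "tree": ["plants/oak.svg"],
    "animal": ["animals/dog.svg", "animals/cat.svg"],
}
xml_text = build_index_xml(sample)
```

Sorting the keywords and paths keeps the output stable between releases, so the index diffs cleanly from one package to the next.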

I might get to it myself, eventually (in which case I would probably 
just roll the functionality into the existing authority control script), 
but at this time, I have several more urgent things on my personal TODO 
list for this project.  Primarily, we really need to iron out those 
issues with the upload script and image validation, sooner rather than 
later.
