[Clipart] Comma-separated keywords in metadata

Bryce Harrington bryce at bryceharrington.com
Tue Jun 29 23:26:39 PDT 2004


On Wed, 30 Jun 2004, Jonadab the Unsightly One wrote:
> > Fortunately we had several artists who were very good about
> > submitting their work as a tarball with a README or other file with
> > their license, title, and author info, so it was not hard to add the
> > metadata.  We still have a number of images without metadata, but
> > unfortunately we also have no way of determining the author info.
> > See the PASSFAIL report in the tarball for details.
> 
> I see.  Hopefully having the upload script will help with this for
> future SVG submissions, but should we have a separate upload script
> (or generalize the existing one) for tarballs (and PNGs and other
> formats?), that puts the metadata in $samefilename.RDF or something?

Yes, that sounds like a very good approach.  A nice option: if the user
wishes to include the metadata in the tarball, they could give it a
specific, agreed-upon name, and the upload script could check for that
file and, if it is there, pull it out and validate it as normal.
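
As a rough sketch of that check (Python here purely for illustration;
the file name "metadata.rdf" and the function name are made up, since we
have not agreed on either, and the "validation" is only an XML
well-formedness test standing in for whatever the upload script really
does):

```python
import io
import tarfile
import xml.etree.ElementTree as ET

# Hypothetical agreed-upon file name; nothing has been decided yet.
METADATA_NAME = "metadata.rdf"


def pull_metadata(tarball_bytes, metadata_name=METADATA_NAME):
    """Return the parsed metadata root element, or None if the tarball
    does not contain a file with the agreed-upon name."""
    with tarfile.open(fileobj=io.BytesIO(tarball_bytes), mode="r:*") as tar:
        for member in tar.getmembers():
            # Compare base names so the file may sit inside a top-level dir.
            if member.isfile() and member.name.split("/")[-1] == metadata_name:
                data = tar.extractfile(member).read()
                # Stand-in validation: raises ET.ParseError if the XML is
                # malformed. Real validation would go here instead.
                return ET.fromstring(data)
    return None
```

A tarball without the file simply yields None, so the script could fall
back to asking the uploader for the metadata by hand.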

> We've been meaning to store some stuff in a database anyway.  Didn't
> someone say there's access to MySQL?  One advantage of the db is it
> would allow us to store things that are not supported directly by
> SVG::Metadata.  Another is that we could store potentially-sensitive
> info that shouldn't go into the .SVG itself.  Another is that we could
> store metadata for non-XML submissions, such as tarballs.

Yes, I did mention that we can get access to MySQL.  I've been thinking
about this since then, though.  There is definitely an advantage to
using files as the primary data storage mechanism, and I'd like to do
that as the base.  This way, I can rsync off all the files from our
server and process them locally using basic file-oriented tools.  

For instance, if we need to delete, rename, update, etc. files, we can
do so at the filesystem level, without needing to worry about
referential integrity in a database or whatnot.

However, for online, interactive features such as searches or
associating files with user account records, pulling that data into a
database can be beneficial.  But I would treat the database as a
secondary data mechanism.  Perhaps create a script that scans the files
in a directory and loads their metadata into SQL, so that the db could
be easily regenerated from any arbitrary collection of files.  Then if
we wished to add to it, we would ensure that the file metadata and the
database are updated in sync.
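
To make the idea concrete, here is a minimal sketch of such a scan
script.  It is illustrative only: it uses Python's built-in sqlite3 as a
stand-in for MySQL, the table and column names are invented, and it only
picks up the first Dublin Core title it finds in each SVG rather than
the full metadata.

```python
import sqlite3
import xml.etree.ElementTree as ET
from pathlib import Path

# Dublin Core title element in Clark notation.
DC_TITLE = "{http://purl.org/dc/elements/1.1/}title"


def rebuild_index(root_dir, db_path=":memory:"):
    """Rebuild the (hypothetical) clipart table from the files on disk.

    The files stay the primary source; the table is dropped and
    regenerated on every run.
    """
    conn = sqlite3.connect(db_path)
    conn.execute("DROP TABLE IF EXISTS clipart")
    conn.execute("CREATE TABLE clipart (path TEXT PRIMARY KEY, title TEXT)")
    for svg in sorted(Path(root_dir).rglob("*.svg")):
        try:
            tree = ET.parse(svg)
        except ET.ParseError:
            continue  # skip files that are not well-formed XML
        node = next(tree.iter(DC_TITLE), None)
        title = node.text if node is not None else None
        rel_path = svg.relative_to(root_dir).as_posix()
        conn.execute("INSERT INTO clipart VALUES (?, ?)", (rel_path, title))
    conn.commit()
    return conn
```

Because the table is rebuilt from scratch, running this against a fresh
rsync of the server gives the same database as the one online, which is
the property I care about.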

I should probably add at this point that I've been down the DMS path
before - docsys.sf.net is by me.  I found that it proved somewhat more
difficult for the sysadmins and users to administer when the database
was the primary source of info, and if I did that system over again, I
would make it more file-oriented, as described above.  One of the main
reasons is that this would potentially enable use of WebDAV as an
alternative file management / file viewing mechanism.  I feel that the
ability for users to access the document repository using their normal
file manager tools (File Explorer, Nautilus, a mounted file system,
etc.) would be significantly more usable than a web interface.
Likewise, a number of tools, like Adobe's and Microsoft's products,
support WebDAV, so I suspect it would "play well" with how artists would
like to work on their files.  Anyway, that's all kind of blue-sky, but
it's my thinking on why the database should be kept a secondary rather
than primary data source...  Hope that makes sense.

Bryce