Fwd: Re: [Clipart] metadata: aren't the keywords actually categories (and can keywords be added)?

Jonadab the Unsightly One jonadab at bright.net
Sun Apr 17 09:59:50 PDT 2005

Mike Traum <mtraum at yahoo.com> writes:

> Andrew,
> I understand the issues you raise with the big xml file proposal.
> But, I don't think symbolic links will work, 

We have been avoiding symbolic links, because they are not very

> and I do think that hard links is a very messy solution to the
> problem. It doesn't scale well - what happens if openclipart has
> 50,000 images? So many duplications in the package will just end up
> in bloat.

I guess you weren't here yet when we discussed quality feedback
mechanisms and subset distributions.  Long term, we want to have a web
interface to receive quality feedback from the community at large, so
that smaller collections can be produced containing the "best" (i.e.,
highest-rated) n% of each category.  That however requires the DMS to
be in place, among other things, and is realistically probably at
least a year down the road at this point.
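
To make the "best n% of each category" idea concrete, here is a rough
Python sketch.  Nothing like this exists in the project yet: the
function name, the (name, category, rating) tuples, and the
round-up-and-keep-at-least-one rule are all assumptions made purely
for illustration.

```python
import math
from collections import defaultdict


def best_of_each_category(images, percent):
    """Return {category: [names]} holding the top `percent`% of each category.

    `images` is an iterable of (name, category, rating) tuples; this
    layout is hypothetical and not taken from any existing OCAL tool.
    """
    by_category = defaultdict(list)
    for name, category, rating in images:
        by_category[category].append((rating, name))

    subset = {}
    for category, rated in by_category.items():
        # Highest rating first; ties are broken by name for stable output.
        rated.sort(key=lambda pair: (-pair[0], pair[1]))
        # Round up, and always keep at least one image per category.
        keep = max(1, math.ceil(len(rated) * percent / 100))
        subset[category] = [name for _, name in rated[:keep]]
    return subset
```

Rounding up means a small category never vanishes from a subset
distribution entirely, which seemed like the friendlier default.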

> How about a flat file structure with no path whatsoever? I think
> this would make the most sense.

That was used very early on, but it scales even worse.  With some
types of filesystems, performance goes straight to the toilet if a
single directory contains more than a few thousand files, and if
directories with thousands of files are bad for computers, they're much
worse for humans -- nobody wants to look through fifteen thousand
images in a directory every time they want to find one.  From that
standpoint, the categories are a huge usability enhancement.

> Regarding the application I'm proposing, sure, I'll be able to
> support pretty much anything. But, this all seems to be up in the
> air right now, and I'd like to see some data definitions of proposed
> xml files and a roadmap on the package structure before I start a
> project based on all of that.

Here's a thought:  What if the big XML file that you want, that would
only be useful with the specialized Java-based tool, was something
that was distributed together *with* the specialized Java-based tool,
as a part of *that* project, rather than being the primary OCAL
distribution form?  That is, whoever creates the Java-based
organizational thingy could also create the spec for the XML file that
it uses, and code that generates it, either from one of our packages,
or possibly by checking out from the DMS once that is in place.
Actually, the latter seems like the better approach in the long run.
Anyway, then you distribute the clipart organizer and the XML file
that it uses together, as a package.  Would that work?
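
As a sketch of what "code that generates it from one of our packages"
could look like, here is a minimal Python example.  It assumes the
package is a directory tree whose subdirectories are the categories;
the element and attribute names ("clipart", "image", "file",
"category") are invented for illustration, since no spec for the XML
file exists yet.

```python
import os
import xml.etree.ElementTree as ET


def build_index(package_root):
    """Walk an unpacked package and return an XML element listing its images.

    The <clipart>/<image> vocabulary is hypothetical, not an agreed format.
    """
    root = ET.Element("clipart")
    for dirpath, dirnames, filenames in os.walk(package_root):
        dirnames.sort()  # visit subdirectories in a predictable order
        # The directory path relative to the package root is the category.
        category = os.path.relpath(dirpath, package_root).replace(os.sep, "/")
        for filename in sorted(filenames):
            if not filename.lower().endswith(".svg"):
                continue  # skip READMEs and other non-image files
            image = ET.SubElement(root, "image")
            image.set("file", filename)
            image.set("category", "" if category == "." else category)
    return root


def write_index(package_root, output_path):
    """Write the index for `package_root` to `output_path` as UTF-8 XML."""
    tree = ET.ElementTree(build_index(package_root))
    tree.write(output_path, encoding="utf-8", xml_declaration=True)
```

The same build_index() shape should carry over to a DMS checkout later;
only the part that enumerates the files would need to change.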

split//,"ten.thgirb\@badanoj$/ --";$\=$ ;-> ();print$/
