[Clipart] metadata: aren't the keywords actually categories (and can keywords be added)?

Mike Traum mtraum at yahoo.com
Fri Apr 15 14:36:45 PDT 2005

Actually, MS surprised me. Their http://office.microsoft.com/clipart
site generates .mpf files, which are xml with the clip files base64
encoded in them, including all of the meta data as well. For once,
they're doing something right. I would consider their site as a model
for what you would want to achieve - they have a shopping cart-like
functionality, the ability to choose thumbnail size, etc. Of course,
all of their clips come with an unfriendly EULA attached.
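
To make the "xml with base64-encoded clips plus metadata" idea concrete,
here is a minimal sketch in Python. The element and attribute names
(package, clip, name, keyword, data) are made up for illustration and are
NOT the real .mpf schema; only the general technique (parse the xml, read
the metadata, base64-decode the payload) is the point.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical layout, NOT the real .mpf schema: one <clip> element per
# file, with keyword metadata and the file contents base64-encoded.
SAMPLE = """<package>
  <clip name="star.svg">
    <keyword>star</keyword>
    <keyword>shapes</keyword>
    <data>{data}</data>
  </clip>
</package>""".format(data=base64.b64encode(b"<svg/>").decode("ascii"))


def extract_clips(xml_text):
    """Return a list of (name, keywords, raw_bytes), one per <clip>."""
    clips = []
    root = ET.fromstring(xml_text)
    for clip in root.iter("clip"):
        name = clip.get("name")
        keywords = [k.text for k in clip.findall("keyword")]
        # The embedded payload is plain base64 text inside the element.
        raw = base64.b64decode(clip.findtext("data"))
        clips.append((name, keywords, raw))
    return clips


clips = extract_clips(SAMPLE)
```

A real importer would write each decoded payload out to disk or straight
into the organizer's own store, keeping the keywords alongside it.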

In case you're interested in .mpf, I've written a couple of tools:
1. mpftools ( http://mpftools.sourceforge.net/ )
perl script to extract the clip files
2. oogalleryimport ( http://oogalleryimport.sourceforge.net/ )
java app to import an mpf into OpenOffice's Gallery

Anyway, when you describe the localized packages for end users, I
assume you mean something similar to what is being done now, with
clips extracted into a directory structure? If so, this is my concern
about having an image with multiple categories (which many will
surely have). Do you have multiple copies of the file in the tarball?
Seems redundant. You could use symbolic links, but then you'd have to
distribute separate packages for different OSes (besides the fact
that I doubt MS shortcuts play nice with other tools, for example
image organizing applications such as ThumbsPlus). This is why I like
the idea of one big xml, assuming there was a clip organizer
application to go along with it.
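
As a rough sketch of why a single metadata file sidesteps the duplication
problem: each clip is stored once and simply lists its categories, and a
client builds the category view itself. The file names and category names
below are hypothetical sample data, not anything from the actual
collection.

```python
from collections import defaultdict

# Hypothetical metadata: every clip appears once, however many
# categories it belongs to.
CLIPS = {
    "cat.svg": {"categories": ["animals", "pets"]},
    "dog.svg": {"categories": ["animals", "pets"]},
    "flag.svg": {"categories": ["flags"]},
}


def build_category_index(clips):
    """Map each category name to the sorted list of clips tagged with it."""
    index = defaultdict(list)
    for filename, meta in clips.items():
        for category in meta["categories"]:
            index[category].append(filename)
    return {category: sorted(names) for category, names in index.items()}


index = build_category_index(CLIPS)
```

Looking up index["pets"] gives both cat.svg and dog.svg without either
file being copied or symlinked into a second directory.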

The whole reason I've been lurking here is because I'm interested in
writing an OS-independent (java-based) GPL'd clip organizer similar to
what MS has with their Office suite. Then, you can import these
packages (as well as Microsoft's, and whoever else is distributing
clip packages w/ metadata) and still have them searchable on the
client side. You'd no longer need to distribute thumbnails, as the
clip organizer would be able to do this as well. And, you'd be able
to see the properties of the file (copyright, etc) from a decent
interface. This effort shouldn't really be that hard, but I want to
make sure that openclipart is moving towards a package structure that
would be amenable to such an effort. Otherwise, I'd only be allowing
import of MS's files, which many users can't legally use (under their


--- Nathan Eady <eady at galion.lib.oh.us> wrote:

> Mike Traum wrote:
> > I see - this all makes sense now. When you do implement the
> > hierarchy-en.xml type system, I encourage you to distribute that
> > with the packages. That way, a client application wanting to
> > import your packages (a clip organizer, for example) could digest
> > it.
> Our intention would be for end users to get a localized package.
> We would also make the raw collection and the hierarchies and tools
> available, for the benefit of vendors and redistributors, but we
> would expect end users to prefer a localized package.
> > This brings up the question about how your packages are
> > structured, though. Right now, you're structuring the packages
> > with files in their category.
> Yes, right now, because we do not have that stuff in place yet,
> the package we are distributing is also standing in for an
> English-localized package, in addition to standing in for the
> repository system we also don't have in place yet.  Long term,
> those things will not be the same.
>  > What happens if a file is in multiple categories?
> Then it's in multiple categories.
>  > I think this is the reason Microsoft chose to distribute their
>  > clipart in one big xml file.
> You could have fooled me.  The last time I saw a Microsoft clipart
> distribution, it was a bunch of TIFFs in directories with no
> metadata at all.  (Maybe I'm just behind the times; I haven't seen
> the latest versions of every Microsoft offering.)
>  > Maybe openclipart should be doing the same once a clip art
>  > organizer application is written?
> Non sequitur.  I envision four ways we will want to make
> the clip art available:
> * Periodic releases of the raw unlocalized collection, as a big
>    fat zipfile/tarball, plus the localization hierarchies,
>    localization and filtering tools, and other stuff.  This is
>    the form we would suggest to redistributors and vendors.
>    Think of this as like the "source code" release that
>    application developers put out, only for clip art.
> * Localized packages for end users, again probably as zipfiles
>    or tarballs, but prefiltered/localized/whatever according
>    to the requirements of a specific locale.  Think of this as
>    like the precompiled binaries that application developers
>    distribute, only for clip art.
> * Through direct access to the document management system.
>    Think of this as like the way some applications can be
>    checked out from anonymous CVS.  Sort of.
> * Through the online browse (and possibly search) interface.
> _______________________________________________
> clipart mailing list
> clipart at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/clipart
