[Clipart] How can I "Join the Open Clip Art Library Release Team"?

Jonadab the Unsightly One jonadab at bright.net
Tue Apr 5 05:05:09 PDT 2005


"Volker Berlin" <volker.berlin at goebel-clan.de> writes:

> I want to help the "Open Clipart Library" but I find the FAQ entry "How
> can I help?" not very helpful.

At this time, to help in any capacity other than submitting artwork,
it is really necessary to be on this mailing list.  Long term we want
to make it easier, but we haven't got there yet.  The good news is,
it's not a very high-traffic mailing list, at least not yet, so being
subscribed is not much of a burden :-)

> I want to help you with adding keywords and translating (German). But I
> have not found a login or other options to do it.
>
> How can I add or modify a keyword?
> How can I help with translation?

We don't have a website-based system for those things in place yet.
We've discussed some of the things we want to implement in that
regard, but it remains to be done.

However, meanwhile, there are certainly things you can do.  For
instance, we have a basic idea how we want to handle i18n and l10n,
but what we are doing ad interim is a bit of a hack and is rather
anglo-centric, language-wise.  You may be able to help with that.

Ultimately, the way we want to handle localization of the collection
is with XML-based hierarchy description files that tell, for a given
language, what categories and subcategories there are, how they're
arranged, and what they contain.  For English, it might go something
like the (oversimplified) example below.  For German, the hierarchy
might be different, and of course the name attributes for each
category would be different, because you'd want German-language names
for the categories.  Also, we probably want to include a way to
specify keywords to _exclude_, and from what we've discussed earlier,
it seems German localization should exclude images with the "nazi"
keyword, for legal reasons.  (Currently, there is only one such image,
but using the keyword metadata will scale better than handling
exclusion on a per-image basis.)
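
To make the exclusion idea concrete, here is a rough Python sketch.  It
assumes a hypothetical <exclude> element directly under <hierarchy>; we
have not settled on any syntax for this, and the German category names
and the image file names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical format: the <exclude> element is an assumption, not a
# settled part of the hierarchy files.
HIERARCHY_DE = """
<hierarchy language="de" unsorted="Verschiedenes">
  <exclude>nazi</exclude>
  <category name="Flaggen">
    <keyword>flag</keyword>
  </category>
</hierarchy>
"""

def excluded_keywords(hierarchy):
    """Collect the keywords listed in top-level <exclude> elements."""
    return {el.text.strip() for el in hierarchy.findall("exclude")}

def filter_images(images, hierarchy):
    """Drop every image that carries at least one excluded keyword."""
    banned = excluded_keywords(hierarchy)
    return {name: kws for name, kws in images.items()
            if not banned & set(kws)}

# Made-up image names and keyword lists, for illustration only.
images = {
    "flag_a.svg": ["flag", "germany"],
    "flag_b.svg": ["flag", "nazi", "history"],
}

hierarchy = ET.fromstring(HIERARCHY_DE.strip())
kept = filter_images(images, hierarchy)
print(sorted(kept))
```

Because the filter looks only at keyword metadata, excluding further
images later means tagging them, not editing a per-image list.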

We really should start working on this.  What we have right _now_ is a
messy hack named (for historical reasons) convert-release-to-browse.pl
that only handles English and doesn't do any exclusion (although there
is a separate tool for filtering).  What we want to have is a
hierarchy-en.xml and hierarchy-de.xml (and whatever other languages
can be added later), and a single script (probably clipart-localise)
that can be fed one of those and produce a localized version of the
collection.  The script I can write (although if someone else wants to
do it, that's fine too), and the English hierarchy we can adapt from
the collection as it stands now and the convert-release-to-browse
script, but in order for any of that to be of any real use, we need
another language hierarchy to test it with, and German is as good as
any.

Here is a simplified example of how one should look:

<hierarchy language="en" unsorted="Miscellaneous">
  <category name="Transportation">
    <keyword>transportation</keyword>
    <keyword>transport</keyword>
    <category name="Vehicles">
      <keyword>vehicle</keyword>
      <keyword>car</keyword>
      <keyword>truck</keyword>
    </category>
  </category>
  <category name="Signs and Symbols">
    <keyword>sign</keyword>
    <keyword>symbol</keyword>
    <category name="Flags">
      <keyword>flag</keyword>
    </category>
  </category>
</hierarchy>

Anything that doesn't match one of the _other_ categories will go into
the Miscellaneous folder as specified by the unsorted attribute of the
hierarchy element (unless, of course, it is excluded).
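
As a rough illustration of how a script might use such a file, here is
a minimal Python sketch.  It is not the planned clipart-localise script;
the function names are made up, and it assumes that an image goes into
every category whose own keywords it matches (matching a subcategory
does not also place it in the parent), which is something we have not
actually decided.

```python
import xml.etree.ElementTree as ET

HIERARCHY_EN = """
<hierarchy language="en" unsorted="Miscellaneous">
  <category name="Transportation">
    <keyword>transportation</keyword>
    <keyword>transport</keyword>
    <category name="Vehicles">
      <keyword>vehicle</keyword>
      <keyword>car</keyword>
      <keyword>truck</keyword>
    </category>
  </category>
  <category name="Signs and Symbols">
    <keyword>sign</keyword>
    <keyword>symbol</keyword>
    <category name="Flags">
      <keyword>flag</keyword>
    </category>
  </category>
</hierarchy>
"""

def category_paths(node, prefix=()):
    """Yield (path, keywords) for every category below node, depth first."""
    for cat in node.findall("category"):
        path = prefix + (cat.get("name"),)
        keywords = {kw.text.strip() for kw in cat.findall("keyword")}
        yield path, keywords
        yield from category_paths(cat, path)

def place_image(image_keywords, hierarchy):
    """Return the folder paths for an image, or the unsorted folder."""
    wanted = set(image_keywords)
    matches = [path for path, kws in category_paths(hierarchy)
               if kws & wanted]
    if not matches:
        # Nothing matched: fall back to the unsorted attribute.
        matches = [(hierarchy.get("unsorted"),)]
    return matches

hierarchy = ET.fromstring(HIERARCHY_EN.strip())
print(place_image(["car", "red"], hierarchy))
print(place_image(["teapot"], hierarchy))
```

The first call places the image under Transportation/Vehicles; the
second matches nothing and so lands in Miscellaneous.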

The German hierarchy should work from the same keywords, since we want
all the localized versions to be based on the same stock collection,
which will make collection maintenance much easier.  Although, maybe
there should be a way to specify that the keywords, in the process of
localization, should be changed, e.g., 
<keyword localize="newspelling">englishspelling</keyword>
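
A sketch of how the localize attribute might be consumed, in Python.
The attribute is only a proposal at this point, and the German names
used here are placeholders I made up, not a real translation of the
collection.

```python
import xml.etree.ElementTree as ET

# Placeholder German hierarchy; names and translations are illustrative.
HIERARCHY_DE = """
<hierarchy language="de" unsorted="Verschiedenes">
  <category name="Verkehr">
    <keyword localize="verkehr">transportation</keyword>
    <keyword>transport</keyword>
    <category name="Fahrzeuge">
      <keyword localize="fahrzeug">vehicle</keyword>
      <keyword localize="auto">car</keyword>
    </category>
  </category>
</hierarchy>
"""

def keyword_translations(hierarchy):
    """Map each stock keyword to its localized spelling.

    A keyword without a localize attribute maps to itself.
    """
    table = {}
    for kw in hierarchy.iter("keyword"):
        stock = kw.text.strip()
        table[stock] = kw.get("localize", stock)
    return table

def localize_keywords(image_keywords, translations):
    """Rewrite an image's keywords, leaving unknown ones unchanged."""
    return [translations.get(k, k) for k in image_keywords]

hierarchy = ET.fromstring(HIERARCHY_DE.strip())
translations = keyword_translations(hierarchy)
print(localize_keywords(["car", "red"], translations))
```

Matching would still be done on the stock (English) keywords; the
rewrite would only affect the metadata written into the localized
output.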

Also, we must try to avoid having keywords with more than one meaning,
when a word in English is used multiple ways, because that would make
localization hard.  If we do run into that, we'll have to split it
into different keywords, and the English localization process can
remerge them if desired.  (Corollary:  we will not ultimately be able
to use the English-language release tarball as the basis for the next
release.  We can worry about that when it becomes a problem, though,
and after we have the localization tools basically working.)
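
The remerging step for English could be as simple as a lookup table.
A sketch in Python, where the split keywords ("bat-animal" and
"bat-sports") are invented examples and not keywords that exist in the
collection:

```python
# Hypothetical split keywords in the stock collection, mapped to the
# single keyword the English localization would show.
ENGLISH_MERGES = {
    "bat-animal": "bat",
    "bat-sports": "bat",
}

def merge_keywords(stock_keywords, merges):
    """Replace split keywords with their merged form.

    Order is preserved and duplicates created by merging are dropped.
    """
    merged = []
    for kw in stock_keywords:
        target = merges.get(kw, kw)
        if target not in merged:
            merged.append(target)
    return merged

print(merge_keywords(["bat-animal", "cave", "bat-sports"], ENGLISH_MERGES))
```

The merge loses information, which is why the merged English output
could not serve as the stock collection for the next release.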

You can get a pretty good idea what keywords the images have now from
looking at the convert-release-to-browse script, which is in the tools
package available on the downloads page.  (I think the version there
is, at this time, identical to the version in CVS.)

-- 
$;=sub{$/};@;=map{my($a,$b)=($_,$;);$;=sub{$a.$b->()}}
split//,"ten.thgirb\@badanoj$/ --";$\=$ ;-> ();print$/



