[Clipart] organize repository and access rights

Jonadab the Unsightly One jonadab at bright.net
Sat Jul 17 07:40:29 PDT 2004


Jon Phillips <jon at rejon.org> writes:

> Yes, the babelfish solution is temporary till translation technology
> improves. However, if someone wants to maintain a translated version
> of the site, then let's consider options.
>
> I'm more concerned about internationalization of metadata.
> Currently all metadata is in English, but the problem with
> internationalized metadata is the possibility of ultra-massive
> files. At the same time, how can we not bias against English
> language, or is this even a concern?

I don't think most of the metadata need to be translated.  Certainly,
you're not going to translate the author's name, are you?  With UTF-8,
we shouldn't even need to transliterate it.  There's no need to
translate the license info either; it's mostly URIs anyway, and in any
case it should be the *same* for every image.

The keyword metadata *on the images* don't need to be translated
either, because the keywords are just magic tokens.  The *hierarchy*,
the human-readable tree of categories, will have to be translated, but
that doesn't go on the images; we should only need to keep one copy of
each translation of it...

<heirarchy language="en"
     xmlns="http://openclipart.org/some/magic/path/heirarchy-0.1.dtd">
  <miscellany name="Miscellaneous" icon="misc.png" />
  <category name="Food and Drink" keyword="food" icon="yummies.png">
     <category name="Beverages" keyword="beverage" icon="drink.png" />
     <category name="Fruits and Vegetables" icon="freshies.png">
        <category name="Fruit" keyword="fruit" icon="fruitbasket.png" />
        <category name="Berries" keyword="berry" icon="berries.png" />
        <category name="Vegetables" keyword="vegitable" icon="rabbitfood.png" />
     </category>
     <category name="Prepared Foods" keyword="fooddish" icon="delifood.png">
        <category name="Sandwiches" keyword="sandwich" icon="reuben.png"/>
        <category name="Main Dishes" keyword="maindish" icon="casserole02.png" />
        <category name="Appetizers" keyword="apetizer" icon="horsdoeuvres.png" />
     </category>
  </category>
  <category name="Buildings" keyword="building" icon="bldg3.png">
     <category name="Homes" keyword="house" icon="house01.png" />
  </category>
  <!-- and so on and so forth -->
</heirarchy>

Any image that doesn't have the keyword for any of the categories
would go into the miscellany pseudocategory.
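
As a rough sketch of that rule (Python is just my pick here, not
anything the project has settled on), the following parses a trimmed
and flattened copy of the example above and sorts images into
categories by keyword.  The image file names and the keyword_names and
categorize helpers are made up for illustration, and the namespace URI
is the same placeholder path used in the example.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the example above; it is a placeholder path.
NS = "http://openclipart.org/some/magic/path/heirarchy-0.1.dtd"

# Trimmed copy of the example, only for demonstration.
SAMPLE = """\
<heirarchy language="en"
     xmlns="http://openclipart.org/some/magic/path/heirarchy-0.1.dtd">
  <miscellany name="Miscellaneous" icon="misc.png" />
  <category name="Food and Drink" keyword="food" icon="yummies.png">
     <category name="Beverages" keyword="beverage" icon="drink.png" />
     <category name="Fruit" keyword="fruit" icon="fruitbasket.png" />
  </category>
  <category name="Buildings" keyword="building" icon="bldg3.png">
     <category name="Homes" keyword="house" icon="house01.png" />
  </category>
</heirarchy>
"""


def keyword_names(root):
    """Map each category keyword to its human-readable name."""
    return {
        cat.get("keyword"): cat.get("name")
        for cat in root.iter("{%s}category" % NS)
        if cat.get("keyword")
    }


def categorize(images, root):
    """Group image names by category name.

    An image whose keywords match no category goes into the
    miscellany pseudocategory.
    """
    names = keyword_names(root)
    misc = root.find("{%s}miscellany" % NS).get("name")
    result = {}
    for image, keywords in images.items():
        matched = [names[k] for k in keywords if k in names]
        for name in matched or [misc]:
            result.setdefault(name, []).append(image)
    return result


# Hypothetical images and the keyword tokens stored on them.
images = {
    "cola.svg": ["beverage", "food"],
    "cottage.svg": ["house"],
    "yield-sign.svg": ["roadsign"],
}

root = ET.fromstring(SAMPLE)
by_category = categorize(images, root)
for name, files in sorted(by_category.items()):
    print(name, files)
```

Note that the images themselves carry only the tokens; the
human-readable names come entirely from the one shared file.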

For l10n, you change out all the name attributes, the miscellany
element, the language attribute on the <heirarchy> element, and maybe
rearrange what's a subcategory of what for cultural reasons.  If
desired you can also change the icon representing the category -- for
example, for English the fruits category might have an icon of a
basket with apples and oranges and bananas in it, but for some
languages you might rather have an icon showing mangoes and breadfruit
and pomelos, or whatever.
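
One thing a translator would probably want is a quick check that a
localized file still covers every keyword.  Here is a minimal sketch
in Python, assuming the XML layout from the example above; the Spanish
file, its icon names, and the keywords and untranslated helpers are
all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the example above; it is a placeholder path.
NS = "http://openclipart.org/some/magic/path/heirarchy-0.1.dtd"

ENGLISH = """\
<heirarchy language="en"
     xmlns="http://openclipart.org/some/magic/path/heirarchy-0.1.dtd">
  <miscellany name="Miscellaneous" icon="misc.png" />
  <category name="Food and Drink" keyword="food" icon="yummies.png">
     <category name="Beverages" keyword="beverage" icon="drink.png" />
     <category name="Fruit" keyword="fruit" icon="fruitbasket.png" />
  </category>
  <category name="Buildings" keyword="building" icon="bldg3.png">
     <category name="Homes" keyword="house" icon="house01.png" />
  </category>
</heirarchy>
"""

# Hypothetical translation: names and one icon change, keywords do not.
# The "house" category has been left out to show the check firing.
SPANISH = """\
<heirarchy language="es"
     xmlns="http://openclipart.org/some/magic/path/heirarchy-0.1.dtd">
  <miscellany name="Varios" icon="misc.png" />
  <category name="Comida y bebida" keyword="food" icon="yummies.png">
     <category name="Bebidas" keyword="beverage" icon="drink.png" />
     <category name="Frutas" keyword="fruit" icon="mangoes.png" />
  </category>
  <category name="Edificios" keyword="building" icon="bldg3.png" />
</heirarchy>
"""


def keywords(xml_text):
    """Return the set of keyword tokens used in one hierarchy file."""
    root = ET.fromstring(xml_text)
    return {
        cat.get("keyword")
        for cat in root.iter("{%s}category" % NS)
        if cat.get("keyword")
    }


def untranslated(base_xml, localized_xml):
    """Keywords present in the base file but absent from the localized one."""
    return keywords(base_xml) - keywords(localized_xml)


missing = untranslated(ENGLISH, SPANISH)
language = ET.fromstring(SPANISH).get("language")
print(language, sorted(missing))
```

Images tagged only with a missing keyword would fall through to the
miscellany pseudocategory in that language, so an empty result is what
you'd want before publishing a translation.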

Since there seem to be a lot of English-speaking people on the list,
we'll probably do the English hierarchy first, but once we have the
keywords in place, the hierarchy is not really hard.  It wouldn't even
be strictly necessary for the translator to know all the English words
he's translating, since he can look at the images and figure out what
to name the category.

The only image metadata I can think of that someone might want to
translate are the title and the subject, but for most purposes even
that should be unnecessary, since most users will probably be just
browsing through the categories looking at thumbnails.

> For example, in the yield sign example if there were 10 language
> translations of the original English metadata, wouldn't that
> increase the file size abnormally?

Yes, but that's not necessary.  Most of the metadata are
organizational tools to help us manage the collection; the end user
doesn't need them.  (Also, the yield sign might be a bad example; I
don't know whether that same sign is used outside of the
English-speaking world.)

-- 
$;=sub{$/};@;=map{my($a,$b)=($_,$;);$;=sub{$a.$b->()}}
split//,"ten.thgirb\@badanoj$/ --";$\=$ ;-> ();print$/
