[Clipart] Translating image metadata (was: Re: PR for 0.16)

Jonadab the Unsightly One jonadab at bright.net
Wed Aug 3 08:49:58 PDT 2005


Bryce Harrington <bryce at bryceharrington.com> writes:

> How do we track the user's language preference?  Is there a cookie
> or an http variable that indicates the preferred language?

I'm speaking here to the long term...

For logged-in users, we could store their language preference (and
other preferences) in the database.

For not-logged-in users, I propose the following:
 1.  If the User-Agent string is provided and includes localization
     information, use that.
 2.  Otherwise, guess English.
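To make that order of preference concrete, here is a rough sketch in
Python.  Everything named in it is an assumption for illustration
only: the SUPPORTED set, the pick_language function, and the regular
expression, which just looks for a standalone locale token such as
"fr-FR" inside the parenthesised part of the User-Agent string.  It is
a heuristic, not a description of any existing code.

```python
import re

# Hypothetical set of languages the site has been translated into.
SUPPORTED = {"en", "fr", "de", "es"}

# Heuristic: a two-letter language code, optionally followed by a region
# ("fr", "fr-FR", "de_DE"), standing alone between ';' / '(' and ';' / ')'.
_LOCALE = re.compile(r"[;(]\s*([a-z]{2})(?:[-_][A-Za-z]{2})?\s*(?=[;)])")


def pick_language(stored_pref=None, user_agent=None, default="en"):
    """Return a language code: stored preference, then User-Agent, then default."""
    # Logged-in users: trust the preference stored in the database.
    if stored_pref in SUPPORTED:
        return stored_pref
    # Not-logged-in users: look for localization info in the User-Agent.
    if user_agent:
        for match in _LOCALE.finditer(user_agent):
            if match.group(1) in SUPPORTED:
                return match.group(1)
    # Otherwise, guess English.
    return default


ua = "Mozilla/5.0 (X11; U; Linux i686; fr-FR; rv:1.7.10) Gecko/20050716 Firefox/1.0.6"
print(pick_language(user_agent=ua))
```

Browsers also send an Accept-Language request header, which could be
consulted at the same spot as the User-Agent check.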

The next question that comes up then is what to do if the language
that the user prefers is not available for a given page.  That *will*
happen...

> After finding an svg to download on the web, should that svg also have
> its metadata translated into that language?  Or would it be better to
> keep it only in english?  What are the pros/cons?

It's not possible to translate all metadata into all languages.  We do
not, for instance, translate French author names into English, and
there's no reasonable way to translate most English author names into
Chinese.  

It would be possible to translate some of the metadata, most notably
the titles, but it goes against the grain for me; we frequently do not
translate the names of famous pieces of artwork from their respective
original languages.  I think the metadata in the files themselves
should be kept intact.  This also eliminates the need to store
information about what language the metadata were in originally,
among other things.

However, if the DMS can store translated titles and things, then we
could produce indexes based on the translated data, allow users to
search by the translated data, and so on, which seems useful.

The remaining question is whether to regenerate the filenames for each
language, based on the translated metadata.  Our l10n plans already
involve regenerating the directory structure for each language we
support, so regenerating the filenames as well would not be completely
out of line with that...  Then again, I'm not sure I want to deal with
the character set issues that could create.  One supposes only the
English release would be usable on platforms with no support for
Unicode filenames; I *think* those are all legacy platforms at this
point, though.
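
As a sketch of what regenerating filenames might look like, and of
where the character set trouble comes in, something along these lines
could work.  The function name, the allow_unicode switch, and the
"untitled" fallback are all made up for illustration; nothing like
this exists in our tools as far as I know.

```python
import re
import unicodedata


def filename_from_title(title, allow_unicode=True, ext=".svg"):
    """Build a filename from a (possibly translated) title."""
    if allow_unicode:
        # Keep non-ASCII letters, but normalise to one canonical form.
        text = unicodedata.normalize("NFKC", title)
    else:
        # Strip accents and drop whatever still is not ASCII.
        text = unicodedata.normalize("NFKD", title)
        text = text.encode("ascii", "ignore").decode("ascii")
    # \w is Unicode-aware for str patterns, so accented and CJK
    # characters survive this step when allow_unicode is true.
    text = re.sub(r"[^\w\s-]", "", text).strip().lower()
    text = re.sub(r"[\s_-]+", "_", text).strip("_")
    # A title with no ASCII-representable characters ends up empty.
    return (text or "untitled") + ext


print(filename_from_title("Pomme rouge"))
print(filename_from_title("Caf\u00e9 au lait", allow_unicode=False))
```

Note that with allow_unicode turned off, a Chinese title loses every
character and falls back to the placeholder name, which is exactly the
sort of problem that makes me hesitate.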

> For editing the translations of metadata, what would be a good way
> to handle that?  I could imagine creating a database of all strings,
> and allow users to submit translated versions of each string via the
> web; is that going to be easy enough for users?  Are there other
> ways this could be handled?

I am thinking that you want to limit the number of strings to be
translated for each image.  Perhaps just the title and description.  I
don't see a lot of point in translating author names; the license is
the same for all the images; the keywords exist mostly to inform
categorization.  The hierarchies of categories also have to be created
for each supported language, but that is a much smaller and easier
task than translating the metadata for all the images.

Of course, if a given string is not translated into a given target
language for a given image, it could default to the original,
untranslated string.
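
A minimal sketch of that fallback, assuming translations are stored
per image as a mapping from language code to translated fields.  The
ImageRecord class and its field names are hypothetical and are not
meant to reflect how the DMS actually stores anything.

```python
from dataclasses import dataclass, field


@dataclass
class ImageRecord:
    # Original, untranslated metadata as found in the file itself.
    title: str
    description: str
    # Hypothetical layout: {language code: {field name: translated text}}
    translations: dict = field(default_factory=dict)

    def localized(self, field_name, lang):
        """Return the translated string, or the original if none exists."""
        translated = self.translations.get(lang, {}).get(field_name)
        return translated if translated else getattr(self, field_name)


rec = ImageRecord(
    title="Red Apple",
    description="A shiny red apple.",
    translations={"fr": {"title": "Pomme rouge"}},
)
print(rec.localized("title", "fr"))        # translated title
print(rec.localized("description", "fr"))  # falls back to the original
```

Only the title and description are modelled here, in keeping with the
idea of limiting the number of strings to translate per image.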

-- 
Open Clip Art Library:  Drawing Together
http://www.openclipart.org/



