[xliff-tools] XLIFF representation
Asgeir Frimannsson
asgeirf at redhat.com
Tue Apr 26 15:35:46 PDT 2005
Hi Yves,
Yves Savourel wrote:
> Hi,
>
> I was looking at the XLIFF PO Guide Draft 2
> (http://xliff-tools.freedesktop.org/wiki/Projects_2fXliffPoGuideDraft2)
> which is, I think, the latest draft I can access, and I had a question:
>
> I noticed that <trans-unit> elements have an id but no resname. It seems
> that it would be reasonable for a software file format to have a unique
> ID, and 'msgid' seems to be capable of doing this. I realize that msgid
> is really used for the source text, and that makes it, in practice,
> not really usable for resname. Many localization tools rely on ID to
> do things like leveraging, updates, or alignment. It would be nice to
> have a solution for resname. (One cannot use id as it's just a
> sequential number).
>
> I guess my question goes a little further and touches on the usage of
> msgid itself. Wouldn't it be more efficient from a localization viewpoint
> to recommend using unique IDs there instead of the source string? That
> would also follow the concept of treating the source language as "just
> another language".
>
The problem with radically changing Gettext (or rather how you use
gettext) is that we would be changing the way (tens of) thousands of
developers work. Developers want to add localisation support with minimal
effort, hence all they really need to do at present is change strings
from "hello world" to _("hello world"). This approach is favourable because:
1) It's easier to read through code as you have the original string
messages and not some more or less cryptic string ID.
2) No external resource files are needed to run the application in its
original language (sadly, by GNU standards, American English; it should
have been Norwegian)
3) No tool-support is needed to manage string table ids.
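To illustrate points 2 and 3, here is a minimal sketch using Python's standard gettext module (chosen only for convenience; the same idea applies to the C API). The domain name "demo_app" and the directory "nonexistent_locale_dir" are hypothetical. With no catalogue installed, the wrapped string simply comes back unchanged:

```python
import gettext

# Hypothetical domain and locale directory; no .mo catalogue is expected
# to exist there, so fallback=True yields a NullTranslations object.
translation = gettext.translation(
    "demo_app", localedir="nonexistent_locale_dir", fallback=True
)
_ = translation.gettext

# The source string doubles as the lookup key and as the default text.
greeting = _("hello world")
print(greeting)
```
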
The main disadvantages are:
1) No way of having the same message with different contexts within the
same gettext domain (not without using 'hacks' anyway)
2) As you say, no way of really uniquely identifying a translation unit
(especially hard when fixing spelling mistakes etc. in the original
string, as you need fuzzy matching to identify the old string in the
string table)
3) Developers are locked in to using American English (or at least a
Germanic language - as Gettext natively only supports Germanic plural
forms).
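As a rough illustration of the fuzzy matching mentioned in point 2, the sketch below uses Python's standard difflib module to find the old entry for a corrected string. The function name find_previous_msgid and the sample strings are made up for this example, and the 0.6 cutoff is just difflib's default, not a recommendation from any Gettext tool:

```python
import difflib

def find_previous_msgid(new_msgid, old_msgids, cutoff=0.6):
    """Return the old msgid most similar to new_msgid, or None.

    Hypothetical helper: real tools use their own fuzzy-matching rules.
    """
    matches = difflib.get_close_matches(new_msgid, old_msgids, n=1, cutoff=cutoff)
    return matches[0] if matches else None

# Made-up string table containing a misspelt entry.
old_table = ["Helo world", "Save file"]

# After the developer fixes the typo, the msgid no longer matches exactly,
# so the old entry has to be recovered by similarity.
previous = find_previous_msgid("Hello world", old_table)
print(previous)
```
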
What could be done is to use a hash of the original string as the resname
attribute in XLIFF, thereby uniquely identifying the string within the
file (as gettext can't have two identical strings within the same
domain).
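A minimal sketch of that idea in Python follows. The choice of SHA-1, the helper names resname_for and trans_unit, and the bare <trans-unit> layout are all assumptions made for illustration; nothing here is taken from the PO guide draft:

```python
import hashlib
from xml.sax.saxutils import escape, quoteattr

def resname_for(msgid: str) -> str:
    """Derive a stable identifier from the source string (assumed: SHA-1 hex)."""
    return hashlib.sha1(msgid.encode("utf-8")).hexdigest()

def trans_unit(unit_id: int, msgid: str) -> str:
    """Build a bare-bones <trans-unit> with a hashed resname (illustrative only)."""
    return (
        f'<trans-unit id="{unit_id}" resname={quoteattr(resname_for(msgid))}>'
        f"<source>{escape(msgid)}</source>"
        "</trans-unit>"
    )

print(trans_unit(1, "hello world"))
```

Note that such a hash changes whenever the source string changes, so it identifies a string within one version of the file but does not by itself survive a spelling fix.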
As Rodolfo mentioned, we're not aiming at changing Gettext, or the way
developers use gettext. But what's really interesting here is that when
we eventually start using XLIFF instead of PO, we will have eliminated the
dependency on Gettext in the development/localisation process. Hence, we
can then start customizing the way gettext works, or even use other
toolkits like ICU, without breaking anything in the localisation process
(keeping translators happy).
> Just a thought.
>
Really appreciate your input Yves!
cheers,
asgeir