Project addition request - xliff-tools

Asgeir Frimannsson asgeirf at
Sun Nov 21 13:31:32 EET 2004

<rahulsundaram at> wrote:
> > Asgeir is our new honours research student who is
> > going to implement a set
> > of XLIFF (eXtensible Localization File Interchange
> > Format) tools that will
> > enable translators and developers in the desktop
> > environment to work with
> > or use this format.
> is there a comparison with .po format available?. I
> would specifically be interested in any advantages

Here are the main advantages:

[Container format]
XLIFF is a container format for localizable data from any file format
that supports the standard. This means that tool developers can
concentrate on developing tools for the XLIFF standard, and not for
each and every file format that needs localization. For open source,
this means that we can implement filters for e.g. PO, DocBook and
.desktop files without losing any of the metadata needed for the
localization process (as happens when converting everything to PO).
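
To make the container idea concrete, here is a rough Python sketch of what a PO filter might emit. It is only an illustration, not code from the project: the function name po_entries_to_xliff, the (msgid, msgstr, comment) tuple layout and the datatype value "po" are assumptions, and the output has not been validated against the XLIFF 1.1 schema.

```python
import xml.etree.ElementTree as ET

# Namespace of XLIFF 1.1 documents.
XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.1"


def po_entries_to_xliff(entries, original, source_lang="en"):
    """Wrap (msgid, msgstr, comment) tuples in a minimal XLIFF document.

    Hypothetical helper: a real filter would also carry over references,
    flags and header metadata from the PO file.
    """
    ET.register_namespace("", XLIFF_NS)
    root = ET.Element(f"{{{XLIFF_NS}}}xliff", {"version": "1.1"})
    file_el = ET.SubElement(root, f"{{{XLIFF_NS}}}file", {
        "original": original,
        "source-language": source_lang,
        "datatype": "po",  # assumed datatype label for PO content
    })
    body = ET.SubElement(file_el, f"{{{XLIFF_NS}}}body")
    for index, (msgid, msgstr, comment) in enumerate(entries, start=1):
        unit = ET.SubElement(body, f"{{{XLIFF_NS}}}trans-unit", {"id": str(index)})
        ET.SubElement(unit, f"{{{XLIFF_NS}}}source").text = msgid
        if msgstr:  # untranslated entries get no <target>
            ET.SubElement(unit, f"{{{XLIFF_NS}}}target").text = msgstr
        if comment:  # translator comments map naturally onto <note>
            ET.SubElement(unit, f"{{{XLIFF_NS}}}note").text = comment
    return ET.tostring(root, encoding="unicode")
```

A filter for DocBook or .desktop files would produce the same kind of document, so the translation tool only ever needs to understand trans-unit elements.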

XLIFF allows for various types of metadata to be stored in documents.
While PO files allow only a limited amount of metadata (e.g. general
comments and references for each individual segment, various
header-specific metadata), XLIFF has an extensive range of possible
elements to specify for each translation segment as well as for the
XLIFF file as a whole. E.g. by using maxwidth and minwidth, size
constraints can be specified for strings as well as other measurable
elements such as images. Each segment and phase can have a translator
associated with it, thus providing direct support for workflows and
integrated version management. If PO files are used, this information
is only available in CVS logs.
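
As an illustration of how a tool could use the size constraints, the sketch below checks targets against maxwidth and minwidth. It only handles units whose size-unit attribute is "char", since other units such as pixels need rendering information. The function name width_violations is made up for this example.

```python
import xml.etree.ElementTree as ET

NS = "urn:oasis:names:tc:xliff:document:1.1"


def width_violations(xliff_text):
    """Return the ids of trans-units whose target breaks a character limit."""
    root = ET.fromstring(xliff_text)
    bad = []
    for unit in root.iter(f"{{{NS}}}trans-unit"):
        if unit.get("size-unit") != "char":
            continue  # only character counts can be checked from text alone
        target = unit.find(f"{{{NS}}}target")
        if target is None:
            continue  # nothing translated yet, nothing to measure
        # itertext() also picks up text nested inside inline elements
        length = len("".join(target.itertext()))
        low = int(unit.get("minwidth", "0"))
        high = unit.get("maxwidth")
        if length < low or (high is not None and length > int(high)):
            bad.append(unit.get("id"))
    return bad
```

A translation editor could run such a check on every keystroke and flag the segment, something that is not possible with the information stored in a PO file.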

[Abstraction of inline codes]
PO does not have support for abstracting inline codes such as HTML
tags in messages. This can be done in XLIFF, which is useful e.g. when
translating DocBook files. In practice, this means that the
localization tool ensures that the tags are included in the
translation (tags are displayed in a user-friendly way, e.g. as
images). XLIFF can also specify whether a tag may occur multiple
times, whether it may be removed, etc.
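
A small sketch of the kind of check a localization tool could perform on abstracted inline codes: it compares the inline elements (for instance g and x) found in a source segment with those in the target. The helper names are hypothetical, and namespaces are left out of the sample segments to keep the example short.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def inline_codes(segment_xml):
    """Count (element name, id) pairs of the inline elements in a segment."""
    segment = ET.fromstring(segment_xml)
    return Counter(
        (el.tag, el.get("id"))
        for el in segment.iter()
        if el is not segment  # skip the <source>/<target> wrapper itself
    )


def codes_preserved(source_xml, target_xml):
    """True if the target carries exactly the same inline codes as the source."""
    return inline_codes(source_xml) == inline_codes(target_xml)


source = '<source>Press <g id="1">Save</g> to continue<x id="2"/></source>'
target = '<target><g id="1">Vista</g><x id="2"/></target>'
print(codes_preserved(source, target))
```

This strict comparison treats every code as mandatory; a real tool would relax it for codes that the document marks as repeatable or removable.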

[Vendor and tool neutral]
XLIFF files can be translated in any tool that supports the XLIFF
standard. Support in commercial localization tools is improving, and
the format is backed by major vendors such as Novell, Sun, IBM and
Microsoft.

[Decoupling of localization technologies]
Open source localization is currently based on the Gettext utilities,
and most localization tools in this domain are focused solely around
the PO file format. By building open source localization tools that
support XLIFF, software projects using technologies other than Gettext
for localization can be included in open source localization processes
without this affecting the tools or ways the translators work.

The PO format doesn't allow for workflows beyond marking entries as
fuzzy for later revision. XLIFF specifies elements and attributes for
workflow information, and localization can pass through multiple,
defined phases (e.g. pre-translation, rough translation, review), in
which changes are documented in the XLIFF sources. This can be
utilized in e.g. release planning and quality assurance processes in
open source projects. Workflows can also integrate translation memory
as explained below.
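
To show how workflow information could feed release planning, here is a minimal sketch that tallies trans-units by the state attribute of their target. The labels "untranslated" and "unspecified" are my own bookkeeping names for this example, not XLIFF values, and the function name state_summary is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

NS = "urn:oasis:names:tc:xliff:document:1.1"


def state_summary(xliff_text):
    """Count trans-units per target state."""
    root = ET.fromstring(xliff_text)
    counts = Counter()
    for unit in root.iter(f"{{{NS}}}trans-unit"):
        target = unit.find(f"{{{NS}}}target")
        if target is None:
            counts["untranslated"] += 1  # local label, not an XLIFF state
        else:
            # "unspecified" is also a local label for a missing attribute
            counts[target.get("state", "unspecified")] += 1
    return dict(counts)
```

A release manager could, for example, refuse to ship a language until every unit has reached a reviewed or final state.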

[Translation Memory Improvements]
XLIFF (with TMX) allows for the possibility of shared translation
memories in open source projects. At present this is hard to
accomplish because of the limited information stored in the PO format.
With XLIFF, multiple TM matches can be stored in the document,
eliminating the need for client side TM technologies. Sharing of TM is
important to open source projects since contributors are spread
geographically. Upon 'checking out' a file for translation, TM
suggestions can be automatically inserted. When a translator has
completed his/her work and returns the file to the repository, a TMX
document containing the approved translations can be automatically
processed and new pairs imported into the TM.
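
The 'check out' step could look roughly like the sketch below, which stores TM suggestions in the document as alt-trans elements. The translation memory is faked with a plain dict mapping source text to a list of translations, only exact matches are handled, and the function name add_tm_suggestions is an assumption for this example.

```python
import xml.etree.ElementTree as ET

NS = "urn:oasis:names:tc:xliff:document:1.1"


def add_tm_suggestions(xliff_text, memory):
    """Attach exact TM matches as <alt-trans> to units without a <target>.

    `memory` stands in for a real shared TM: {source text: [translations]}.
    """
    ET.register_namespace("", NS)
    root = ET.fromstring(xliff_text)
    # list() so that adding children does not disturb the iteration
    for unit in list(root.iter(f"{{{NS}}}trans-unit")):
        if unit.find(f"{{{NS}}}target") is not None:
            continue  # already translated, no suggestion needed
        source_text = "".join(unit.find(f"{{{NS}}}source").itertext())
        for suggestion in memory.get(source_text, []):
            alt = ET.SubElement(
                unit, f"{{{NS}}}alt-trans", {"match-quality": "100%"}
            )
            ET.SubElement(alt, f"{{{NS}}}target").text = suggestion
    return ET.tostring(root, encoding="unicode")
```

Because the suggestions travel inside the file, the translator's editor only has to display them and does not need its own TM engine.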

[XML based]
One advantage of XML over other file formats is the range of open and
free tools and technologies available to process this format. Many
facilities exist to define parsing and transformation tools for
specific formats. These can be used in the open source environment in
a wide range of contexts; for example presenting summaries and data
for the translation status pages of the various projects. By using
simple XSLT and other XML transformation languages, intuitive
summaries and reports can readily be generated from TMX and XLIFF
files. These technologies can also be used by the translation tool to
present user friendly reports at various stages of the localization
process; e.g. printable HTML documentation can be straightforwardly
rendered from XLIFF sources for comparison and proofreading.
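
As a rough idea of the proofreading output, the sketch below renders an XLIFF file as a side-by-side HTML table. Note that it uses Python's ElementTree rather than XSLT, since the Python standard library has no XSLT processor; an XSLT stylesheet could produce the same table. The function name proofreading_table is made up for this example.

```python
import xml.etree.ElementTree as ET
from html import escape

NS = "urn:oasis:names:tc:xliff:document:1.1"


def proofreading_table(xliff_text):
    """Render source/target pairs as a simple HTML table for proofreading."""
    root = ET.fromstring(xliff_text)
    rows = []
    for unit in root.iter(f"{{{NS}}}trans-unit"):
        source = unit.find(f"{{{NS}}}source")
        target = unit.find(f"{{{NS}}}target")
        src = "".join(source.itertext())
        # an untranslated unit gets an empty target cell
        tgt = "".join(target.itertext()) if target is not None else ""
        rows.append(
            f"<tr><td>{escape(unit.get('id', ''))}</td>"
            f"<td>{escape(src)}</td><td>{escape(tgt)}</td></tr>"
        )
    header = "<tr><th>id</th><th>source</th><th>target</th></tr>"
    return "<table>\n" + header + "\n" + "\n".join(rows) + "\n</table>"
```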

[Localization of non-textual elements]
XLIFF is not limited to localization of textual content, but can also
handle binary data. This opens up a lot of possibilities for future
enhancements of the localization process. Gnome and KDE User Interface
dialogs are currently stored in XML formats and can be encapsulated in
XLIFF documents and localized using visual XLIFF tools, similar to the
processing of Windows Resource files in tools such as Alchemy
Catalyst. (This, however, requires architectural changes to the way
localization is handled at runtime. Currently Gettext handles all
localization of strings, but localizing other elements requires some
redesign of how localization is handled by the system and is out of
scope for this project).

Currently PO is used for translating DocBook content, and this is
where the main advantages of XLIFF come in. By creating filters for
file types such as DocBook, translators can use the same tools as for
translating software messages, without losing metadata as happens
when converting DocBook to PO.

Some references:

[1] XLIFF whitepaper

[2] IBM Developerworks articles on XLIFF

