[xliff-tools] XLIFF usage

Josep Condal pcondal at apsic.com
Tue Apr 19 05:08:00 PDT 2005


Hi Steve,

I think your comments are right in many respects. In any case, I do
agree that a format is not a silver bullet by itself, but a given
implementation could well be. While I do agree that if we continually
super-size XLIFF there is a risk that it becomes an ever-growing moving
target that is very difficult to program for safely, it is also true
that at least there is some common DNA in the format that can be
leveraged in any shop, in-house or commercial.

Once/if the implementations behind XLIFF are world-class and mature,
user adoption of those implementations should do the rest. We've seen
XLIFF in localization projects with only one client so far, who had a
nearly automated system handled with their own tool.  The advantage
for us as a service provider was that in nearly no time we were able
to provide additional terminology reference in the translator's work
environment, so the translator's learning curve on the client's
terminology and preferences was much shorter than with the client tool
alone, which was used for the actual editing. So reverse-engineering the
format was a piece of cake compared to a less open format.

I think the vision is that greater stability in the format can raise
the bar for the features of the implementations to a new level.

To some extent, an API is a superset of a file format, as in one form or
another it needs a permanent (non-volatile) data representation.  A
brilliant API on top of that permanent format could be an example of an
implementation that becomes a silver bullet, and if it is publicly and
freely available, the API could effectively replace the format itself
for most UI-oriented implementations, why not? Object-oriented C++
classes replaced the old C structs in many implementations, and the
model you propose is somewhat similar to that. But still, you need to be
able to dump and read your data structures, so why not do it in a
human-readable format such as XLIFF?
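To make the idea concrete, here is a minimal sketch (illustrative only,
not any existing tool's code; the element names follow XLIFF 1.1, and
everything else - function names, the sample strings - is an assumption
of mine) of dumping an in-memory string table to XLIFF and reading it
back with Python's standard library:

```python
# Minimal sketch: persist an API's in-memory data in a human-readable
# XLIFF document instead of an opaque binary dump.
import xml.etree.ElementTree as ET

def dump_xliff(units, original="strings.pot", src="en", tgt="ca"):
    """Serialize {id: (source, target)} to an XLIFF 1.1 document string."""
    xliff = ET.Element("xliff", version="1.1")
    file_el = ET.SubElement(xliff, "file", original=original,
                            datatype="plaintext",
                            attrib={"source-language": src,
                                    "target-language": tgt})
    body = ET.SubElement(file_el, "body")
    for unit_id, (source, target) in units.items():
        tu = ET.SubElement(body, "trans-unit", id=unit_id)
        ET.SubElement(tu, "source").text = source
        ET.SubElement(tu, "target").text = target
    return ET.tostring(xliff, encoding="unicode")

def load_xliff(text):
    """Read the same structure back: {id: (source, target)}."""
    root = ET.fromstring(text)
    return {tu.get("id"): (tu.findtext("source"), tu.findtext("target"))
            for tu in root.iter("trans-unit")}

units = {"greeting": ("Hello", "Hola"), "farewell": ("Goodbye", "Adéu")}
assert load_xliff(dump_xliff(units)) == units  # lossless round trip
```

The round trip is lossless, and the intermediate file can be opened in
any text editor, which is exactly the property a binary struct dump
lacks.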

Regards,

Josep.


-----Original Message-----
From: Stephen Holmes [mailto:stephen at onedotone.com] Sent: Tuesday,
April 12, 2005 16:13
To: Josep Condal
CC: XLIFF-Tools
Subject: RE: [xliff-tools] XLIFF usage

Thanks Josep,

<< Apologies for not in-lining my response, the mail is getting quite
large!>> 

So would it be fair of me to distill this down to say that the thrust of
the XLIFF deployment in the context of this Freedesktop.org effort is to
push it as a native resource file format to replace/consolidate 1..n
existing formats?  In this specific domain, to migrate from PO to XLIFF?

Is this sensible?  

What about XAML, XUL, Glade and others that have been specifically
crafted for this purpose?  Microsoft, for example, have gone to great
lengths to enhance XML markup for application-specific purposes in their
Longhorn technologies (Avalon etc.).  This will impact Linux through the
Mono technology.  I know from my own experience in the localisation
business that even something like a "standard" Windows RC specification
has a large number of interpretations - this is why we moved to binary
localisation: RC content is largely unmanageable, with custom resource
formats and so on (binary simplifies but doesn't remove all of the
complexities).

If you think about it, from a tools perspective we already have stable
and proven parsers for most of the common formats in use today - these
formats are largely stable.  New formats are being introduced all of the
time, but couldn't one simply parse them into a common framework (based
on a standard API) rather than complicate the process with an XLIFF
interim?  I say this because I simply can't see XLIFF adopting the
characteristics of an Impress slide, a Macromedia SWF, or a streaming
audio clip - let alone media-rich application markup languages.

In my opinion, native XLIFF is way too generalised and too heavy for
native application resourcing purposes.  At my company, we were
compelled (architecturally) to move to a variant that could be used as
a native resource markup (something I'd describe as XliffLite), but even
this requires a transformer from Xliff to XliffLite and back within the
localisation cycle (not to mention the tailoring of the OASIS DTD - very
bad behaviour, I know!).

The problem, as I see it, is less about the file format and more about
the API that binds a given specification (getNextTransUnit etc.).  I do
of course see the advantages of the "One format to rule them all", but
even within GNU/Linux we have .desktop, .po, .glade, .jpg, .svg, .png,
.resx (through Mono) - all formats designed to serve a specific
purpose (sometimes legal, sometimes technological).
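To illustrate what I mean by an API binding rather than a format, here
is a hypothetical sketch (the names, including the iterator standing in
for getNextTransUnit, are invented for illustration, and the PO reader
is deliberately a toy):

```python
# Hypothetical sketch: one common parser contract that tools program
# against, with format-specific parsers (PO, RC, ...) plugged in behind.
from abc import ABC, abstractmethod
from typing import Iterator, NamedTuple

class TransUnit(NamedTuple):
    unit_id: str
    source: str
    target: str = ""

class ResourceParser(ABC):
    """The common contract: a getNextTransUnit() analogue as an iterator."""
    @abstractmethod
    def trans_units(self) -> Iterator[TransUnit]: ...

class PoParser(ResourceParser):
    """Toy PO reader: plain msgid/msgstr pairs only, for illustration."""
    def __init__(self, text: str):
        self.text = text

    def trans_units(self) -> Iterator[TransUnit]:
        msgid = None
        for i, line in enumerate(self.text.splitlines()):
            if line.startswith('msgid "'):
                msgid = line[7:-1]              # text between the quotes
            elif line.startswith('msgstr "') and msgid is not None:
                yield TransUnit(unit_id=str(i), source=msgid,
                                target=line[8:-1])
                msgid = None

po_text = 'msgid "Hello"\nmsgstr "Hola"\n'
units = list(PoParser(po_text).trans_units())
assert [(u.source, u.target) for u in units] == [("Hello", "Hola")]
```

A tool written against ResourceParser never sees the home format;
supporting a new format means adding one parser class, not touching the
tool itself.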

I fear that by treating XLIFF as a silver bullet that consolidates
resource formats, we will ultimately commit it to an untimely
demise.

I would still love to hear about practical applications and case studies
involving XLIFF. I'm extremely keen to understand from a deployment
perspective how the transformation to XLIFF impacted on quality, speed,
flexibility, cost and dependency issues before and after the effort. 

Oh dear, I do sound terribly cynical don't I? It's not really
intentional!


On Tue, 2005-04-12 at 15:14 +0200, Josep Condal wrote:
> Hi Steve,
> 
> > Steve is fine (only my mum calls me Stephen!)
> 
> Sorry about my strange message, due to - hopefully temporary - mental 
> turbulences. For some reason I had arrived to the conclusion that 
> Steve and Stephen were two completely different names and all of a 
> sudden I had a strong urge to apologize. :)
> 
> >>>> START OF QUOTE
> So here's my dilemma.  We currently have tools that extract 
> translatable content from whichever file formats we're interested in.
> So, I take a PO for example, it gets parsed (LEX/YACC) and then 
> translated within a tools environment.  The complexities of the 
> meta-data schema are hidden (as it should be) and the parser also 
> comes with a generator to create the translated output format - an 
> end-to-end process.  The database format might be proprietary or 
> represented in XLIFF, but the interface to this data would be open.
> 
> If I've understood your comments correctly, then in the "new world 
> order", I'd do this....
> 
> [ Develop ]-->Src PO-->XLIFF-->[ Translation ]--> Tgt PO-->[ Test ]
> 
> But in the current model it would be...
> 
>    [ Develop ]-->Src PO-->[ Translation ]-->Tgt PO-->[ Test ]
> 
> So now, I have to create a PO parser AND a forward and reverse 
> XLIFF-to-PO transformer.
> 
> Seems like I have to do more work.  I do however see the benefit in 
> having, say, Alchemy Catalyst dump its TTK format to XLIFF so I 
> could use the same translated content with gTranslator/KBabel or 
> whatever other GUI tool made use of XLIFF repository information.
> 
> >>>> END OF QUOTE
> 
> The idea would be that XLIFF is the preferred representation of any 
> given home format, and the closer it sits to the real source (for 
> example, if gettext generated XLIFF instead of PO and back), the 
> more sense it makes (as your diagram above reflects).
> 
> >>>> START OF QUOTE
> 
> Understood, absolutely, but isn't the value in XLIFF really to do 
> with helping me move my translated assets (with their meta-data) 
> between systems rather than create yet another file format?  The 
> benefit for me is a loss-less transfer when tooling technologies 
> become available that afford me a higher % and quality of leverage.
> 
> >>>> END OF QUOTE
> 
> XLIFF is by definition an exchange format, but in practical 
> scenarios I personally see most of the opportunity in making the 
> subject format more stable as a starting point for the development 
> of tools, and therefore being able to adjust the chosen tool to the 
> characteristics of the project.
> 
> In a scenario where the roundtrip filters are essentially available 
> and the best ones prevail (with the help of representation guides as 
> official guidance), tool developers can focus on the features their 
> software offers the user rather than on solving, yet again, the 
> problem of interpreting the new format in town.
> 
> I see less opportunity, but still an option at Alchemy's choice, in 
> the Alchemy Catalyst scenario that you mention, because the TTK 
> format is a little bit further away from the home format.
> 
> In other words, an .RC file (home format) is waiting to be 
> translated, so XLIFF can help build a process behind that, while a 
> TTK file is probably going to be translated with Catalyst, so XLIFF 
> might not be a practical option there. Let's say it depends largely 
> on Alchemy's willingness to do so, even if you decide to 
> reverse-engineer the TTK file and build a filter for it.
> 
> Also, XLIFF allows for exchange. While I see a relatively high risk 
> of trouble in exchanging without extensive interoperability tests, 
> what is clear is that many quality-oriented features (LINT-like 
> checks, but linguistic, for example) can be built into a more stable 
> format such as XLIFF.  This way, sometimes you buy, sometimes you 
> make.
> 
> >>>>
> My view would be that a good tools environment abstracts this 
> complexity anyway.  I certainly wouldn't send raw XLIFF out because 
> I'd then have to add LINT checks to the inbound materials.
> >>>>
> 
> Yes, what I actually meant is that making the format more stable 
> allows interesting tools to be developed on it more easily than if 
> the project involves complex home formats. Some of the tools on the 
> market are very good at what they do and may fulfill all your needs 
> (or apparent needs as perceived by you). For example, I hadn't 
> needed the "concordance" function for a few years (before 1998, more 
> or less), because I had always used CAT tools without it and life 
> continued without trouble; but when I discovered it, I saw that the 
> value it brought was enormous, and I cannot imagine a single 
> translation process without it playing an important role. That's why 
> we developed ApSIC Xbench: to be able to bring concordance to the 
> system level rather than the application level.
> 
> Regards,
> 
> Josep.

