[xliff-tools] Another question on PO and XLIFF

Asgeir Frimannsson asgeirf at redhat.com
Wed May 4 16:09:16 PDT 2005


Hi Karl,

(Welcome to the discussion, by the way :) )

On Thu, 5 May 2005 00:04, Karl Eichwalder wrote:
> Asgeir Frimannsson <asgeirf at redhat.com> writes:
> > /* xgettext:no-wrap */
> > printf( _("\
> > this is a very long long line with more than 80 characters\n\
> > this is another line\n" ) );
> >
> > becomes
> > msgid ""
> > "this is a very long long line with more than 80 characters\n"
> > "this is another line\n"
> >
> > instead of
> > msgid ""
> > "this is a very long long line with more "
> > "than 80 characters\n"
> > "this is another line\n"
>
> I always thought, no-wrap in .c source files signals line-breaks which
> are to be kept; thus such a no-wrap is mostly meant as a translator
> comment (of course, xgettext must respect it as well).  Both your .po
> file fragments are equivalent.
>
> The confusion arises because the command line tools offer a switch with
> the same name (--no-wrap) and this switch works as you explained it.

If you do some testing with /* xgettext: no-wrap */, you'll see that both 
variants always produce the same msgid; only the output layout differs. And 
adding the --no-wrap command line flag when using e.g. xgettext is the same 
as putting the no-wrap comment on all translation units: it only affects the 
layout of the PO file, not the actual content.
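
A quick way to convince yourself is to decode both layouts and compare them. 
The following is only a rough Python sketch, not gettext code: decode_msgid is 
a made-up helper that handles nothing but plain double-quoted lines and the 
\n, \" and \\ escapes.

```python
import re

# The two layouts from the example above, as they appear in the PO file.
NO_WRAP = r'''msgid ""
"this is a very long long line with more than 80 characters\n"
"this is another line\n"
'''

WRAPPED = r'''msgid ""
"this is a very long long line with more "
"than 80 characters\n"
"this is another line\n"
'''

def decode_msgid(po_fragment):
    """Concatenate the quoted strings of one msgid and decode simple escapes."""
    # Each "..." on a line is one string; adjacent strings are simply joined.
    parts = re.findall(r'"((?:[^"\\]|\\.)*)"', po_fragment)
    raw = "".join(parts)
    escapes = {"n": "\n", '"': '"', "\\": "\\"}
    # Unknown escapes are left untouched (hypothetical simplification).
    return re.sub(r"\\(.)", lambda m: escapes.get(m.group(1), m.group(0)), raw)

print(decode_msgid(NO_WRAP) == decode_msgid(WRAPPED))  # prints True
```

Both fragments decode to the same two-line string, so a translator (or an 
XLIFF filter) sees identical content either way.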

You say no-wrap "signals line-breaks which are to be kept". Does this mean 
it would do anything with the "\n" characters, or do you mean that this is 
two lines?:
msgid "this is line 1"
"this is line2"

And how would a translator use no-wrap?

> That's how things look to me and I hope, it is clear what I want to say
> ;)

I'm happy :)

> > From the manual (describing no-wrap): "Do not break long message lines.
> > Message lines whose width exceeds the output page width will not be split
> > into several lines."
>
> That's the --no-wrap switch--the other no-wrap usage does not seem to
> be documented at all; but it is used here (gettext source code):
>
> #: src/hostname.c:182 src/msgattrib.c:311 src/msgcat.c:263 src/msgcmp.c:140
> #: src/msgcomm.c:260 src/msgconv.c:217 src/msgen.c:203 src/msgexec.c:177
> #: src/msgfilter.c:270 src/msgfmt.c:361 src/msggrep.c:314 src/msginit.c:266
> #: src/msgmerge.c:297 src/msgunfmt.c:246 src/msguniq.c:239 src/urlget.c:134
> #: src/xgettext.c:498
> #, c-format, no-wrap
> msgid ""
> "Copyright (C) %s Free Software Foundation, Inc.\n"
> "This is free software; see the source for copying conditions.  There is NO\n"
> "warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.\n"

... and this flag is simply added by putting the comment before the gettext 
call in the source code:

/* xgettext: no-wrap */
 printf (_("Copyright (C) %s Free Software Foundation, Inc.\n\
This is free software; see the source for copying conditions.  There is NO\n\
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.\n\
"),

to prevent a msgid like:

msgid ""
"Copyright (C) %s Free Software Foundation, Inc.\n"
"This is free software; see the source for copying conditions."
" There is NO\n"
"warranty; not even for MERCHANTABILITY or FITNESS FOR "
"A PARTICULAR PURPOSE.\n"

i.e. it makes it easier for translators to see the intent of the developer in 
a no-wrapped translation unit: a line that does not end in "\n" is never broken 
into multiple lines (short source strings without "\n" would, however, still 
be concatenated onto one line).
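
To make the layout rule concrete, here is a small Python sketch of the two 
behaviours. It is an assumption-laden illustration, not gettext's actual 
algorithm: layout_msgid, escape and content are made-up names, the width of 
60 is arbitrary, and the output always starts with msgid "".

```python
import re

TEXT = ("Copyright (C) %s Free Software Foundation, Inc.\n"
        "This is free software; see the source for copying conditions.  There is NO\n"
        "warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.\n")

def escape(text):
    """Spell backslash, double quote and newline the way PO strings do."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

def layout_msgid(text, no_wrap=False, width=60):
    """Lay out text as PO msgid lines (simplified sketch)."""
    lines = ['msgid ""']
    # One segment per source line; each segment keeps its trailing newline.
    for segment in re.findall(r"[^\n]*\n|[^\n]+", text):
        escaped = escape(segment)
        if no_wrap or len(escaped) + 2 <= width:
            lines.append('"' + escaped + '"')
            continue
        # Wrapping: cut at spaces only and keep every space, so joining
        # the pieces gives back exactly the same string.
        chunk = ""
        for token in re.findall(r"[^ ]+ *| +", escaped):
            if chunk and len(chunk) + len(token) + 2 > width:
                lines.append('"' + chunk + '"')
                chunk = token
            else:
                chunk += token
        lines.append('"' + chunk + '"')
    return lines

def content(lines):
    """Concatenate the quoted parts of a layout, dropping the quotes."""
    return "".join(line[1:-1] for line in lines[1:])

kept = layout_msgid(TEXT, no_wrap=True)   # one PO line per "\n"
wrapped = layout_msgid(TEXT)              # extra breaks at spaces
print("\n".join(kept))
print("\n".join(wrapped))
```

The two layouts differ in the number of lines, but content(kept) and 
content(wrapped) are the same string.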

> I just wanted to point out that besides c-format another flag exists
> that should be honored somehow.  Especially, if you convert .xml to .po,
> you should add the "no-wrap" flag if appropriate.  Of course, it is not
> that important if you convert .xml straight to the .xliff format.

Yeah, that is an issue with replacing '\' + 'n' with newline in XLIFF, but 
only a cosmetic one: it affects the layout of the PO file, not the intended 
translation unit. But you have a good point when dealing with XML files. 
Current xml/sgml-po filters do not honour the PO format, in that they treat:

msgid "<line1>text</line1>"
" <line2>text</line2>"

as meaning "<line1>text</line1>\n <line2>text</line2>" when they should be 
writing

msgid "<line1>text</line1>\n"
" <line2>text</line2>"

to produce that result. This (intended) 'design flaw' makes it much more 
convenient to translate XML files in tools such as KBabel, and that's 
perfectly fine.
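
The difference between the two readings fits in a few lines of Python. This 
is just a sketch: po_join and filter_join are made-up names, and the lists 
hold the already-decoded strings of one msgid.

```python
def po_join(parts):
    """PO semantics: adjacent strings are concatenated, nothing is added."""
    return "".join(parts)

def filter_join(parts):
    """Reading used by the xml/sgml-po filters: a line break in the PO
    file is taken to mean a newline in the content."""
    return "\n".join(parts)

# What the filters write today, and what PO would need to say the same thing.
loose = ["<line1>text</line1>", " <line2>text</line2>"]
strict = ["<line1>text</line1>\n", " <line2>text</line2>"]

print(repr(po_join(loose)))
print(repr(filter_join(loose)))
print(filter_join(loose) == po_join(strict))  # prints True
```

Read strictly as PO, the first form has no newline at all; only the filter's 
own reading puts one there.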

How can we honour no-wrap in XLIFF? First of all, even the most advanced PO 
tools don't honour it (e.g. KBabel word-wraps the text instead of providing a 
scrollbar [or a visual word-wrap marker] to see the whole line), so should 
we? Say we should: doesn't xml:space='preserve' provide a way for XLIFF 
editors to indicate to users that lines are not word-wrapped? The editor 
can happily word-wrap source/target (or collapse extra whitespace) when 
xml:space is not 'preserve' (good for XML-based formats), but honour 
whitespace otherwise.

In addition, to honour the original intent of 'no-wrap', the filter can 
provide the same flag when back-converting, using the same rules as gettext.
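
As a rough sketch of that round trip (one possible reading, not an agreed 
mapping): the Python below marks a trans-unit with xml:space="preserve" when 
the PO entry carries no-wrap, and restores the flag on the way back. The 
function names are made up, and other flags such as c-format are ignored 
here.

```python
import xml.etree.ElementTree as ET

# ElementTree spells xml:space with the reserved XML namespace URI.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"

def po_to_trans_unit(unit_id, msgid, flags):
    """Build a trans-unit; mark it whitespace-preserving if no-wrap is set."""
    unit = ET.Element("trans-unit", {"id": unit_id})
    if "no-wrap" in flags:
        unit.set(XML_SPACE, "preserve")
    ET.SubElement(unit, "source").text = msgid
    return unit

def trans_unit_to_flags(unit):
    """Back-conversion: recover the no-wrap flag from xml:space (assumption)."""
    return ["no-wrap"] if unit.get(XML_SPACE) == "preserve" else []

unit = po_to_trans_unit("1", "line one\nline two\n", ["c-format", "no-wrap"])
plain = po_to_trans_unit("2", "some text", [])

print(ET.tostring(unit, encoding="unicode"))
print(trans_unit_to_flags(unit), trans_unit_to_flags(plain))
```

An editor could then use the attribute to decide whether to word-wrap the 
source and target for display.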

cheers,
asgeir

