[xliff-tools] Another question on PO and XLIFF

Asgeir Frimannsson asgeirf at redhat.com
Mon May 2 18:13:35 PDT 2005


Hi Yves,

I agree with you that nr 3 is the most logical, and that is what I'm 
currently using in our filters.

So the following example:

> msgid ""
> "Line 1\n line 2\n"
> "Line 3.\n"

(which would be identical to:
msgid "Line 1\n line 2\nLine 3.\n"
)

becomes:

<source xml:space='preserve'>Line 1
 line 2
Line 3.
</source>
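
As a rough sketch of that extraction step (Python, with a hypothetical helper
name; this is not the actual filter code, and it only handles the most common
C escapes), the quoted fragments are simply concatenated and then unescaped:

```python
import re

# Hypothetical helper for illustration only, not taken from any real filter.
# Maps the character after a backslash to the character it stands for.
_ESCAPES = {"n": "\n", "t": "\t", "r": "\r", '"': '"', "\\": "\\"}

def po_msgid_to_text(po_lines):
    """Concatenate the quoted fragments of a PO msgid and unescape them."""
    fragments = []
    for line in po_lines:
        line = line.strip()
        if line.startswith("msgid"):
            line = line[len("msgid"):].strip()
        # Each fragment is a double-quoted, C-style string literal.
        if len(line) >= 2 and line[0] == '"' and line[-1] == '"':
            fragments.append(line[1:-1])
    raw = "".join(fragments)
    # Resolve backslash escapes; unknown escapes keep the escaped character.
    return re.sub(r"\\(.)", lambda m: _ESCAPES.get(m.group(1), m.group(1)), raw)

multi = ['msgid ""', '"Line 1\\n line 2\\n"', '"Line 3.\\n"']
single = ['msgid "Line 1\\n line 2\\nLine 3.\\n"']

# Both PO layouts yield the same text, which is what goes into <source>.
assert po_msgid_to_text(multi) == po_msgid_to_text(single)
assert po_msgid_to_text(multi) == "Line 1\n line 2\nLine 3.\n"
```

The point is only that the line layout in the PO file never reaches the XLIFF
content; the embedded '\n' escapes do, as real line breaks.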

How it is represented in the PO file is just cosmetics, much like how we
indent XML for readability. Gettext does some sort of 
'normalization' when extracting entries from source code, as in the following 
example:
 
gettext("This is line 1\n\
This is line 2\nThis is line 3\n");

This would in PO become:

msgid ""
"This is line 1\n"
"This is line 2\n"
"This is line 3\n"
msgstr ""

and not
msgid "This is line 1\nThis is line 2\nThis is line 3\n"
msgstr ""

(although they have the same internal representation)
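
The going-back direction (merge) can be sketched the same way. This is only an
approximation of the layout gettext produces, with a hypothetical function
name: it breaks after every embedded newline, and ignores that the real tools
also wrap long lines at a page width:

```python
import re

def text_to_po_msgid(text):
    """Write text as a PO msgid, one quoted fragment per embedded newline."""
    def escape(s):
        # Escape the backslash first so later escapes are not doubled.
        return (s.replace("\\", "\\\\").replace('"', '\\"')
                 .replace("\t", "\\t").replace("\n", "\\n"))
    # Split after each "\n", keeping the newline with the fragment it ends.
    parts = re.findall(r"[^\n]*\n|[^\n]+", text)
    if len(parts) <= 1:
        return 'msgid "%s"' % escape(text)
    return "\n".join(['msgid ""'] + ['"%s"' % escape(p) for p in parts])

print(text_to_po_msgid("This is line 1\nThis is line 2\nThis is line 3\n"))
# msgid ""
# "This is line 1\n"
# "This is line 2\n"
# "This is line 3\n"
```

So a filter does not need to remember how the original PO entry was laid out;
the layout can be regenerated from the text alone.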

It would be interesting to see how this is solved in the Java .properties 
guide, as it is basically the same problem:
string_1=This is line 1\nThis is line 2\n This is line 3\n

On using Windows newlines in this approach:

First of all, Gettext throws warnings ("xgettext: internationalized messages 
should not contain the `\r' escape sequence") when extracting messages
with Windows escape characters from source code, and the gettext tools do 
not handle '\r' or '\r\n' characters in the same way as '\n'. 

But if you really want to use Windows newline characters, I guess this 
information could easily be recorded at extraction and read back at merge? 
Something like:
<source xml:space='preserve' myns:lineending='win'>Line 1
 line 2
Line 3.
</source>

(here assuming that '\r' and '\n' are not mixed within the same TU)
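
A minimal sketch of how that could work, again in Python with hypothetical
function names. Only the 'win' value comes from the example above; 'unix' and
'mac' are values I made up for the other two conventions, and the same
assumption applies (no mixed line endings within one TU):

```python
# Assumed values for the hypothetical myns:lineending attribute.
ENDINGS = {"unix": "\n", "win": "\r\n", "mac": "\r"}

def detect_and_normalize(text):
    """Extraction: return (text with bare '\\n' breaks, line-ending name)."""
    if "\r\n" in text:
        return text.replace("\r\n", "\n"), "win"
    if "\r" in text:
        return text.replace("\r", "\n"), "mac"
    return text, "unix"

def restore_line_endings(text, lineending="unix"):
    """Merge: put the recorded line-ending convention back."""
    return text.replace("\n", ENDINGS[lineending])

original = "Line 1\r\n line 2\r\nLine 3.\r\n"
text, lineending = detect_and_normalize(original)
assert (text, lineending) == ("Line 1\n line 2\nLine 3.\n", "win")
assert restore_line_endings(text, lineending) == original
```

The XLIFF content then always uses plain newlines, and the attribute is the
only thing the merge step needs in order to round-trip the file.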

Going back to the three approaches you mentioned: I don't see a problem with 
using the 1st approach either:
<source xml:space='preserve'>
Line 1\n line 2\n
Line 3.\n</source>

But I don't see any benefit in using this approach, other than for the 
Windows/Unix line-ending issue. In fact, I think there is less garbage in the 
TM if we remove the \n characters. For XLIFF editors, if a translator wants to 
see the newline characters, he/she can always turn on the 'view formatting' 
option, which visually displays newline characters, word-wrapped lines, etc.

Well, I didn't bring much new information into the discussion. For the guide, 
I agree we should add a section on this, but should we recommend only one 
approach? (I'm in favour of recommending nr 3, but I'm open to 
discussions :) )

cheers,
asgeir

On Mon, 2 May 2005 22:54, Yves Savourel wrote:
> Hi,
>
> I have a new question on XLIFF representation of PO.
> The guide
> (http://xliff-tools.freedesktop.org/wiki/Projects_2fXliffPoGuideDraft2)
> does not seem to say anything about multi-line entries.
>
> msgid ""
> "Line 1\n line 2\n"
> "Line 3.\n"
>
> How this should be represented?
>
> <trans-unit xml:space="preserve">
>
> Line 1\n line 2\n
> Line 3.\n</trans-unit>
>
> Or
>
> <trans-unit xml:space="preserve">
> Line 1\n
>  line 2\n
> Line 3.\n
> </trans-unit>
>
> Or
>
> <trans-unit xml:space="preserve">
> Line 1
>  line 2
> Line 3.
> </trans-unit>
>
> ?
>
> The third seems more logical to me, but it could cause issues too, for
> example if the line-breaks are not \n but \r or \r\n (if the PO file is
> used for a non-Unix application) how would we know which type of line-break
> notation to put back when merging.
>
> Anyhow, a section on this topic would be good to have in the Guide.
>
> Cheers,
> -yves

