[xliff-tools] Another question on PO and XLIFF
Asgeir Frimannsson
asgeirf at redhat.com
Mon May 2 18:13:35 PDT 2005
Hi Yves,
I agree with you that nr 3 is the most logical, and that is what I'm
currently using in our filters.
So the following example:
> msgid ""
> "Line 1\n line 2\n"
> "Line 3.\n"
(which would be identical to:
msgid "Line 1\n line 2\nLine 3.\n"
)
becomes:
<source xml:space='preserve'>Line 1
 line 2
Line 3.
</source>
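To illustrate the mapping, here is a minimal Python sketch. It is not the code
from our filters; decode_po_string and to_xliff_source are hypothetical names,
and only a handful of common escapes are handled. It joins the quoted
fragments of a msgid, resolves the escapes, and writes the result into a
<source> element, so the wrapped and the single-line form give the same output.

```python
import re
from xml.sax.saxutils import escape

# Assumption: only these common C-style escapes are handled in this sketch.
_ESCAPES = {"n": "\n", "t": "\t", "r": "\r", '"': '"', "\\": "\\"}


def decode_po_string(lines):
    """Join the quoted fragments of a PO msgid and resolve escape sequences."""
    parts = []
    for line in lines:
        line = line.strip()
        if line.startswith("msgid"):
            line = line[len("msgid"):].strip()
        # Keep only the text between the surrounding double quotes.
        if len(line) >= 2 and line[0] == '"' and line[-1] == '"':
            parts.append(line[1:-1])
    raw = "".join(parts)
    # Unknown escapes are left untouched.
    return re.sub(r"\\(.)", lambda m: _ESCAPES.get(m.group(1), m.group(0)), raw)


def to_xliff_source(text):
    """Wrap decoded text in a <source> element, escaping XML special characters."""
    return "<source xml:space='preserve'>" + escape(text) + "</source>"


wrapped = ['msgid ""', r'"Line 1\n line 2\n"', r'"Line 3.\n"']
single = [r'msgid "Line 1\n line 2\nLine 3.\n"']

# Both layouts decode to the same string, so they yield the same <source>.
print(to_xliff_source(decode_po_string(wrapped)))
print(decode_po_string(wrapped) == decode_po_string(single))
```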
How it is represented in the PO file is just cosmetics, much like how we
indent XML for readability. Gettext does some sort of 'normalization' when
extracting entries from source code, as in the following example:
gettext("This is line 1\n\
This is line 2\nThis is line 3\n");
This would in PO become:
msgid ""
"This is line 1\n"
"This is line 2\n"
"This is line 3\n"
msgstr ""
and not
msgid "This is line 1\nThis is line 2\nThis is line 3\n"
msgstr ""
(although they have the same internal representation)
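The wrapping can be approximated with a short Python sketch. This is only my
guess at the visible behaviour, not gettext's actual algorithm (the real
tools also wrap at a page width, which is ignored here), and format_msgid and
escape_po are hypothetical names. The string is split after each '\n' and each
piece goes on its own quoted line.

```python
import re


def escape_po(text):
    """Escape backslashes, quotes, newlines and tabs for a PO string literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\t", "\\t"))


def format_msgid(text):
    """Write a msgid, breaking after every newline when there is more than one piece."""
    # Each segment is a run of text ending in "\n", or a trailing run without one.
    segments = re.findall(r"[^\n]*\n|[^\n]+", text)
    if len(segments) <= 1:
        return 'msgid "' + escape_po(text) + '"'
    lines = ['msgid ""']
    lines += ['"' + escape_po(segment) + '"' for segment in segments]
    return "\n".join(lines)


print(format_msgid("This is line 1\nThis is line 2\nThis is line 3\n"))
```

Since only the layout changes, decoding either form gives back the same string.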
It would be interesting to see how this is solved in the Java .properties
guide, as it is basically the same problem:
string_1=This is line 1\nThis is line 2\n This is line 3\n
On using Windows newlines in this approach:
First of all, Gettext throws warnings ("xgettext: internationalized messages
should not contain the `\r' escape sequence") when extracting messages
with Windows escape characters from source code, and the gettext tools do
not handle '\r' or '\r\n' characters in the same way as '\n'.
But if you really want to use Windows newline characters, I guess this
information could easily be stored during extraction and used again when
merging? Something like:
<source xml:space='preserve' myns:lineending='win'>Line 1
 line 2
Line 3.
</source>
(here assuming that '\r' and '\n' are not mixed within the same TU)
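A rough Python sketch of that round trip, under the same assumption that a TU
uses a single line-ending style. The function names and the 'mac' and 'unix'
values are made up for illustration; only 'win' appears in the example above.

```python
# Assumption: one line-ending style per translation unit.
_ENDINGS = {"win": "\r\n", "mac": "\r", "unix": "\n"}


def extract(text):
    """Return (text with LF-only line breaks, name of the original style)."""
    if "\r\n" in text:
        style = "win"
    elif "\r" in text:
        style = "mac"
    else:
        style = "unix"
    # Replace CRLF first so that its CR is not converted a second time.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return normalized, style


def merge(normalized, style):
    """Restore the recorded line-ending style when writing back to the PO file."""
    return normalized.replace("\n", _ENDINGS[style])


original = "Line 1\r\n line 2\r\nLine 3.\r\n"
normalized, style = extract(original)
print(style)
print(merge(normalized, style) == original)
```

The style value would be what goes into the extra attribute at extraction
time, and merge would read it back.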
Going back to the three approaches you mentioned: I don't see a problem with
using the 1st approach either:
<source xml:space='preserve'>
Line 1\n line 2\n
Line 3.\n</source>
But I don't see any benefits of using this approach, other than the
windows/unix line-ending issue. In fact, I think it's less garbage in the TM
if we remove the \n characters. For XLIFF editors, if a translator wants to
see the newline characters, he/she can always turn on the 'view formatting'
option, visually displaying newline characters, word-wrapped lines etc.
Well, I didn't bring much new information into the discussion. For the guide,
I agree we should add a section on this - but should we recommend only one
approach? (I'm in favour of recommending nr 3, but I'm open to
discussions :) )
cheers,
asgeir
On Mon, 2 May 2005 22:54, Yves Savourel wrote:
> Hi,
>
> I have a new question on XLIFF representation of PO.
> The guide
> (http://xliff-tools.freedesktop.org/wiki/Projects_2fXliffPoGuideDraft2)
> does not seem to say anything about multi-line entries.
>
> msgid ""
> "Line 1\n line 2\n"
> "Line 3.\n"
>
> How should this be represented?
>
> <trans-unit xml:space="preserve">
>
> Line 1\n line 2\n
> Line 3.\n</trans-unit>
>
> Or
>
> <trans-unit xml:space="preserve">
> Line 1\n
> line 2\n
> Line 3.\n
> </trans-unit>
>
> Or
>
> <trans-unit xml:space="preserve">
> Line 1
> line 2
> Line 3.
> </trans-unit>
>
> ?
>
> The third seems more logical to me, but it could cause issues too, for
> example if the line-breaks are not \n but \r or \r\n (if the PO file is
> used for a non-Unix application) how would we know which type of line-break
> notation to put back when merging?
>
> Anyhow, a section on this topic would be good to have in the Guide.
>
> Cheers,
> -yves