<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 TRANSITIONAL//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=UTF-8">
<META NAME="GENERATOR" CONTENT="GtkHTML/3.3.2">
</HEAD>
<BODY>
On Tue, 2005-05-03 at 13:36 +1000, Asgeir Frimannsson wrote:<BR>
<BR>
Hi,<BR>
<BR>
<BLOCKQUOTE TYPE=CITE>
<PRE>
<FONT COLOR="#000000">> The "\n" is part of the text. It is a sequence of two characters: '\'</FONT>
<FONT COLOR="#000000">> and 'n'. It is not only an instruction for the program that will</FONT>
<FONT COLOR="#000000">> display the text on screen. The translator should be able to see these</FONT>
<FONT COLOR="#000000">> characters and move them wherever they fit.</FONT>
<FONT COLOR="#000000">"\n" is a sequence of two characters, yes I agree so far. But it is still only </FONT>
<FONT COLOR="#000000">a representation of an escape-sequence. And this is also how they are </FONT>
<FONT COLOR="#000000">represented internally in gettext. In addition, Gettext ignores totally how </FONT>
<FONT COLOR="#000000">the PO file is formatted (if it's on multiple lines, or a single line). Let's </FONT>
<FONT COLOR="#000000">do a simple test:</FONT>
</PRE>
</BLOCKQUOTE>
<PRE>
</PRE>
I see that you base everything on the Gettext API. Isn't it dangerous to assume that all PO files originated in C programs?<BR>
<BR>
Some PO files are generated from XML documents. If the reverse conversion inserts the sequence "\n" wherever it finds a line feed, the result will be a mess.<BR>
<BR>
<BLOCKQUOTE TYPE=CITE>
<PRE>
<FONT COLOR="#000000">Representing this in XLIFF by replacing THE TWO CHARACTERS '\' and 'n' with a </FONT>
<FONT COLOR="#000000">real newline character on conversion, and similarly replacing the real </FONT>
<FONT COLOR="#000000">newline character with "\n" on back-conversion would be a just as valid </FONT>
<FONT COLOR="#000000">approach.</FONT>
</PRE>
</BLOCKQUOTE>
<PRE>
</PRE>
What about PO files that originated from PHP? Is it still correct to replace a real newline character with "\n" on back-conversion? And what about Python? XML? Any other format?<BR>
<BR>
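A minimal Python sketch of this concern, under loud assumptions: the helper names po_to_xliff_text and xliff_to_po_text are hypothetical, and the entry text is treated as a plain string, ignoring PO quoting rules. It only illustrates that once both kinds of input are turned into real newlines, the back-conversion cannot tell which ones were escape sequences.<BR>
<PRE>
```python
def po_to_xliff_text(po_text):
    """Forward conversion: turn the two characters backslash + n into a real newline."""
    return po_text.replace("\\n", "\n")


def xliff_to_po_text(xliff_text):
    """Naive back-conversion: turn every real newline into backslash + n."""
    return xliff_text.replace("\n", "\\n")


# Hypothetical entry from a C program: the author typed backslash + n.
from_c = "hello \\n world"
# Hypothetical entry from an XML document: a real line feed in the text.
from_xml = "hello \n world"

# Both entries look identical once they are in XLIFF form.
same_in_xliff = po_to_xliff_text(from_c) == po_to_xliff_text(from_xml)

# The XML-originated text comes back with an escape it never contained.
restored_xml = xliff_to_po_text(po_to_xliff_text(from_xml))
```
</PRE>
Here same_in_xliff is true, and restored_xml equals from_c rather than from_xml, so the origin of the newline is lost in the round trip.<BR>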
<BR>
<BLOCKQUOTE TYPE=CITE>
<PRE>
<FONT COLOR="#000000">In fact, if I were to use your approach here, I would have to manually </FONT>
<FONT COLOR="#000000">replace all real newline characters with "\\n" before converting to XLIFF, as </FONT>
<FONT COLOR="#000000">the gettext API handles "\n" as real newline characters internally (and yes, </FONT>
<FONT COLOR="#000000">I'm using the gettext api for parsing/reading/writing PO files in my </FONT>
<FONT COLOR="#000000">filters).</FONT>
</PRE>
</BLOCKQUOTE>
<BR>
You don't have to convert real newlines to "\\n". Simply write a newline character in the &lt;source&gt; or &lt;target&gt; element.<BR>
<BR>
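As a rough illustration of writing the newline directly, here is a Python sketch using the standard xml.etree.ElementTree module. The helper make_source is hypothetical, and the sketch assumes the filter already holds the decoded string (with a real newline) as described in the quoted text.<BR>
<PRE>
```python
import xml.etree.ElementTree as ET


def make_source(text):
    """Build a source element whose text holds real newline characters."""
    source = ET.Element("source")
    source.text = text
    return source


# Assumed input: the decoded string for msgid "hello \n world",
# where the escape has already become a real newline (U+000A).
decoded = "hello \n world"

# Serialize the element; the newline is written as-is in the element text.
serialized = ET.tostring(make_source(decoded), encoding="unicode")

# Parse it back to confirm the newline survives the trip through XML.
round_tripped = ET.fromstring(serialized).text
```
</PRE>
No "\\n" replacement is involved: the serialized element contains the line break itself, and parsing it returns the same string.<BR>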
<BLOCKQUOTE TYPE=CITE>
<PRE>
<FONT COLOR="#000000">I don't want the XLIFF editor to display a '\n', I just want it to add a </FONT>
<FONT COLOR="#000000">newline character where there is a newline in the source, so:</FONT>
<FONT COLOR="#000000">msgid "hello \n world"</FONT>
<FONT COLOR="#000000">becomes </FONT>
<FONT COLOR="#000000">&lt;source xml:space='preserve'&gt;hello</FONT>
<FONT COLOR="#000000">world&lt;/source&gt;</FONT>
<FONT COLOR="#000000">and would display in an editor:</FONT>
<FONT COLOR="#000000">hello</FONT>
<FONT COLOR="#000000">world</FONT>
</PRE>
</BLOCKQUOTE>
<PRE>
</PRE>
As the attribute xml:space is set to "preserve", XLIFF editors display the text as you sketched above.<BR>
<BR>
BTW, it is better to set the xml:space attribute in the &lt;trans-unit&gt; element and let the scope rules cover the &lt;target&gt; and &lt;alt-trans&gt; children.<BR>
<BR>
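A small Python sketch of that layout, again with xml.etree.ElementTree. The helpers build_unit and effective_space are hypothetical, the sample texts are made up, and falling back to "default" when no ancestor sets xml:space is an assumption of this sketch.<BR>
<PRE>
```python
import xml.etree.ElementTree as ET

# ElementTree spells xml:space with the reserved XML namespace URI.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def build_unit(unit_id, source_text, target_text):
    """Create a trans-unit that sets xml:space once, on the parent element."""
    unit = ET.Element("trans-unit", {"id": unit_id, XML_SPACE: "preserve"})
    ET.SubElement(unit, "source").text = source_text
    ET.SubElement(unit, "target").text = target_text
    return unit


def effective_space(root, element):
    """Return the xml:space value in scope for element (assumed fallback: "default")."""
    # ElementTree has no parent pointers, so build a child-to-parent map first.
    parents = {child: parent for parent in root.iter() for child in parent}
    node = element
    while node is not None:
        value = node.get(XML_SPACE)
        if value is not None:
            return value
        node = parents.get(node)
    return "default"


# Made-up sample texts, each containing a real newline.
unit = build_unit("1", "hello\nworld", "hola\nmundo")
target_space = effective_space(unit, unit.find("target"))
markup = ET.tostring(unit, encoding="unicode")
```
</PRE>
The target element carries no attribute of its own, yet effective_space reports "preserve" for it because the value is inherited from the enclosing trans-unit.<BR>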
<BLOCKQUOTE TYPE=CITE>
<PRE>
<FONT COLOR="#000000">...maybe with a nifty nice &lt;enter&gt; arrow after 'hello' if 'view formatting' is </FONT>
<FONT COLOR="#000000">turned on.</FONT>
</PRE>
</BLOCKQUOTE>
<BR>
Ahh, that's decoration. <BR>
<BR>
Rodolfo<BR>
<TABLE CELLSPACING="0" CELLPADDING="0" WIDTH="100%">
<TR>
<TD>
-- <BR>
Rodolfo M. Raya &lt;<A HREF="mailto:rodolfo@heartsome.net">rodolfo@heartsome.net</A>&gt;<BR>
Heartsome Holdings Pte Ltd
</TD>
</TR>
</TABLE>
</BODY>
</HTML>