[Libreoffice] [PATCH] speed up localized builds by introducing po2lo

Miklos Vajna vmiklos at frugalware.org
Fri Sep 9 17:36:17 PDT 2011


On Fri, Sep 09, 2011 at 10:40:54AM +0200, Andras Timar <timar74 at gmail.com> wrote:
> There are 2 minor issues.
> 1. When the English string contains \n and translation does not
> contain \n, the script outputs the English string instead of the
> translation. See for example this line:
> basctl	source\basicide\basidesh.src	0	string	RID_STR_SOURCETOBIG				15504	sd	The
> source text is too large and can be neither compiled nor
> saved.\nDelete some of the comments or transfer some methods into
> another module.
> 
> 2. Your script produces sdf lines even when the corresponding po file
> does not exist at all. For example many languages do not have help
> translations at all, but your script copies English help lines to
> localized sdf files in this case.

I'm attaching newer versions of both patches:

- the translations one is fixed wrt python (see
  82f6c0502e51afbc25e5bf0fcee7914a1a5b3f28, the patch had the same
  problem)
- the core one should be fixed wrt the above two issues

  [ The newline issue was a bug in the po parser's handling of multiline
  msgid/msgstr entries; the second problem is fixed simply by checking
  whether the relevant po file exists. ]
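
To illustrate the multiline case: a po entry may start with an empty
msgstr "" line and carry the actual text on the quoted continuation lines
that follow, so the parser has to join those pieces. The snippet below is
only a minimal sketch of that idea, not the code from the attached patch;
the function name join_msgstr and the sample entry are made up for
illustration.

```python
def join_msgstr(lines):
    """Return the msgstr text of one po entry, joining continuation lines.

    Hypothetical helper for illustration; po escapes such as \\n are left
    untouched, only the surrounding quotes are removed.
    """
    parts = []
    in_msgstr = False
    for line in lines:
        line = line.strip()
        if line.startswith('msgstr "'):
            in_msgstr = True
            # drop the leading 'msgstr "' and the closing quote
            parts.append(line[len('msgstr "'):-1])
        elif in_msgstr and line.startswith('"'):
            # quoted continuation line belonging to the msgstr
            parts.append(line[1:-1])
        elif in_msgstr:
            # anything else ends the entry
            break
    return "".join(parts)


# Made-up sample entry; the text is not taken from a real po file.
entry = [
    '#: basidesh.src#RID_STR_SOURCETOBIG.string.text',
    'msgid ""',
    '"The source text is too large.\\n"',
    '"Delete some of the comments."',
    'msgstr ""',
    '"first line\\n"',
    '"second line"',
]
print(join_msgstr(entry))
```

The msgid continuation lines are skipped because they come before the
msgstr line; only the lines after it are joined.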

This time I tested it not only with "hu", but with the "af" and "sd"
locales as well.

Miklos
-------------- next part --------------
From 60173c8ed5180067773b7df503300f16752235c2 Mon Sep 17 00:00:00 2001
From: Miklos Vajna <vmiklos at frugalware.org>
Date: Wed, 7 Sep 2011 23:39:15 +0200
Subject: [PATCH] Add po2lo tool

---
 solenv/bin/po2lo |  205 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 205 insertions(+), 0 deletions(-)
 create mode 100755 solenv/bin/po2lo

diff --git a/solenv/bin/po2lo b/solenv/bin/po2lo
new file mode 100755
index 0000000..0f81ebc
--- /dev/null
+++ b/solenv/bin/po2lo
@@ -0,0 +1,205 @@
+#!/usr/bin/env python
+# Version: MPL 1.1 / GPLv3+ / LGPLv3+
+#
+# The contents of this file are subject to the Mozilla Public License Version
+# 1.1 (the "License"); you may not use this file except in compliance with
+# the License or as specified alternatively below. You may obtain a copy of
+# the License at http://www.mozilla.org/MPL/
+#
+# Software distributed under the License is distributed on an "AS IS" basis,
+# WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
+# for the specific language governing rights and limitations under the
+# License.
+#
+# The Initial Developer of the Original Code is
+#       Miklos Vajna <vmiklos at frugalware.org>
+# Portions created by the Initial Developer are Copyright (C) 2011 the
+# Initial Developer. All Rights Reserved.
+#
+# Major Contributor(s):
+#
+# For minor contributions see the git repository.
+#
+# Alternatively, the contents of this file may be used under the terms of
+# either the GNU General Public License Version 3 or later (the "GPLv3+"), or
+# the GNU Lesser General Public License Version 3 or later (the "LGPLv3+"),
+# in which case the provisions of the GPLv3+ or the LGPLv3+ are applicable
+# instead of those above.
+
+import getopt, sys, os, re
+
+class Options:
+    """Options of this script."""
+
+    def __init__(self):
+        self.input = None
+        self.output = None
+        self.language = None
+        self.template = None
+
+class Entry:
+    """Represents a single line in an SDF file."""
+
+    def __init__(self, items):
+        self.items = items # list of 15 fields
+        path = self.items[1].split('\\')
+        self.po = "%s/%s/%s.po" % (options.input.replace('\\', '/'), self.items[0], "/".join(path[:-1]))
+        prefix = ""
+        if len(self.items[5]):
+            prefix += "%s." % self.items[5]
+        if len(self.items[3]):
+            prefix += "%s." % self.items[3]
+        self.keys = []
+        # fields 10, 12 and 13 are translatable; 11 has no entry in the mapping below
+        for idx in (10, 12, 13):
+            if len(self.items[idx]):
+                t = {10:'text', 12:'quickhelptext', 13:'title'}[idx]
+                self.keys.append((idx, self.sdf2po("%s#%s.%s%s" % (path[-1], self.items[4], prefix, t))))
+
+    def translate(self, translations):
+        """Translates text in the entry based on translations."""
+
+        self.items[9] = options.language
+        for idx, key in self.keys:
+            try:
+                self.items[idx] = translations.data[(self.po, key)]
+
+                self.items[14] = "2002-02-02 02:02:02"
+            except KeyError:
+                pass
+        self.items[14] = self.items[14].strip()
+
+    def sdf2po(self, s):
+        """Escapes special chars in po key names."""
+
+        return s.translate(normalizetable)
+
+class Template:
+    """Represents a reference template in SDF format."""
+
+    def __init__(self, path):
+        sock = open(path)
+        self.lines = []
+        for line in sock:
+            entry = Entry(line.split('\t'))
+            if os.path.exists(entry.po):
+                self.lines.append(entry)
+
+    def translate(self, translations):
+        """Translates entries in the template based on translations."""
+
+        sock = open(options.output, "w")
+        for line in self.lines:
+            line.translate(translations)
+            sock.write("\t".join(line.items)+"\r\n")
+        sock.close()
+
+class Translations:
+    """Represents a set of .po files, containing translations."""
+
+    def __init__(self):
+        key = None
+        self.data = {}
+        for root, dirs, files in os.walk(options.input):
+            for file in files:
+                path = "%s/%s" % (root, file)
+                sock = open(path)
+                buf = []
+                multiline = False
+                fuzzy = False
+                for line in sock:
+                    if line.startswith("#: "):
+                        key = line.strip()[3:]
+                    elif line.startswith("#, fuzzy"):
+                        fuzzy = True
+                    elif line.startswith("msgstr "):
+                        trans = line.strip()[8:-1]
+                        if len(trans):
+                            if fuzzy:
+                                fuzzy = False
+                            else:
+                                self.setdata(path, key, trans)
+                                multiline = False
+                        else:
+                            buf = []
+                            buf.append(trans)
+                            multiline = True
+                    elif multiline and line.startswith('"'):
+                        buf.append(line.strip()[1:-1])
+                    elif multiline and not len(line.strip()) and len("".join(buf)):
+                        if fuzzy:
+                            fuzzy = False
+                        else:
+                            self.setdata(path, key, "".join(buf))
+                        buf = []
+                        multiline = False
+                if multiline and len("".join(buf)) and not fuzzy:
+                    self.setdata(path, key, "".join(buf))
+
+    def setdata(self, path, key, s):
+        """Sets the translation for a given path and key, handling (un)escaping
+        as well."""
+        if key:
+            # unescape the po special chars
+            s = s.replace('\\"', '"')
+            if key.split('#')[0].endswith(".xhp"):
+                s = self.escape_help_text(s)
+            else:
+                s = s.replace('\\\\', '\\')
+            self.data[(path.replace('\\', '/'), key)] = s
+
+    def escape_help_text(self, text):
+        """Escapes the help text as it would be in an SDF file."""
+
+        for tag in helptagre.findall(text):
+            # <, >, " are only escaped in <[[:lower:]]> tags. Some HTML tags make it in in
+            # lowercase so those are dealt with. Some LibreOffice help tags are not
+            # escaped.
+            escapethistag = False
+            for escape_tag in ["ahelp", "link", "item", "emph", "defaultinline", "switchinline", "caseinline", "variable", "bookmark_value", "image", "embedvar", "alt"]:
+                if tag.startswith("<%s" % escape_tag) or tag == "</%s>" % escape_tag:
+                    escapethistag = True
+            if tag in ["<br/>", "<help-id-missing/>"]:
+                escapethistag = True
+            if escapethistag:
+                escaped_tag = ("\\<" + tag[1:-1] + "\\>").replace('"', '\\"')
+                text = text.replace(tag, escaped_tag)
+        return text
+
+def main():
+    """Main function of this script."""
+
+    opts, args = getopt.getopt(sys.argv[1:], "si:o:l:t:", ["skipsource", "input=", "output=", "language=", "template="])
+    for opt, arg in opts:
+        if opt in ("-s", "--skipsource"):
+            pass
+        elif opt in ("-i", "--input"):
+            options.input = arg.strip('/')
+        elif opt in ("-o", "--output"):
+            options.output = arg
+        elif opt in ("-l", "--language"):
+            options.language = arg
+        elif opt in ("-t", "--template"):
+            options.template = arg
+    template = Template(options.template)
+    translations = Translations()
+    template.translate(translations)
+
+# used by escape_help_text()
+helptagre = re.compile('''<[/]??[a-z_\-]+?(?:| +[a-z]+?=".*?") *[/]??>''')
+
+options = Options()
+
+# used by sdf2po()
+normalfilenamechars = "/#.0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
+normalizetable = ""
+for i in map(chr, range(256)):
+    if i in normalfilenamechars:
+        normalizetable += i
+    else:
+        normalizetable += "_"
+
+if __name__ == "__main__":
+    main()
+
+# vim:set filetype=python shiftwidth=4 softtabstop=4 expandtab:
-- 
1.7.6

-------------- next part --------------
From 862f7858c2c21d3eda474ee8083daba5ebb8b3e9 Mon Sep 17 00:00:00 2001
From: Miklos Vajna <vmiklos at frugalware.org>
Date: Wed, 7 Sep 2011 23:39:18 +0200
Subject: [PATCH] Add po2lo tool

---
 translations/makefile.mk |   10 +++++++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/translations/makefile.mk b/translations/makefile.mk
index 1a858c3..d991f49 100644
--- a/translations/makefile.mk
+++ b/translations/makefile.mk
@@ -48,15 +48,19 @@ TARGET=translations_merge
 
 .INCLUDE : target.mk
 
+.IF "$(OS_FOR_BUILD)"=="WNT" && "$(SYSTEM_PYTHON)"!="YES"
+PYTHONCMD=$(AUGMENT_LIBRARY_PATH) $(WRAPCMD) $(SOLARBINDIR)/python
+.ELSE
+PYTHONCMD=$(WRAPCMD) $(PYTHON)
+.ENDIF
+
 .IF "$(SYSTEM_TRANSLATE_TOOLKIT)" == "YES"
 
 OO2PO=oo2po
-PO2OO=po2oo
 
 .ELSE                   # "$(SYSTEM_TRANSLATE_TOOLKIT)" == "YES"
 
 OO2PO=$(AUGMENT_LIBRARY_PATH) $(WRAPCMD) $(SOLARBINDIR)/oo2po
-PO2OO=$(AUGMENT_LIBRARY_PATH) $(WRAPCMD) $(SOLARBINDIR)/po2oo
 
 TRANSLATE_TOOLKIT_PYTHONPATH=$(SOLARLIBDIR)$/translate_toolkit
 .IF "$(SYSTEM_PYTHON)" == "YES" || "$(OS)" == "MACOSX"
@@ -94,7 +98,7 @@ $(MISC)/sdf-l10n/%.sdf : $(MISC)/sdf-template/en-US.sdf
     sed -e "s/\ten-US\t/\tkid\t/" < $@.tmp > $@
     rm -f $@.tmp
 .ELSE
-    $(PO2OO) --skipsource -i $(PRJ)/source/$(@:b) -t $(MISC)/sdf-template/en-US.sdf -o $@ -l $(@:b)
+    $(PYTHONCMD) $(SOLARSRC)/solenv/bin/po2lo --skipsource -i $(PRJ)/source/$(@:b) -t $(MISC)/sdf-template/en-US.sdf -o $@ -l $(@:b)
 .ENDIF
 
 $(MISC)/merge.done : $(foreach,i,$(all_languages) $(MISC)/sdf-l10n/$i.sdf)
-- 
1.7.6
