[Libreoffice-commits] dev-tools.git: lionss/application lionss/config.py lionss/_lionss lionss/lionss.py lionss/openshift.htaccess lionss/README.md lionss/static lionss/tpl

Mathias Michel matm at gmx.fr
Fri Jul 25 02:32:37 PDT 2014


 lionss/README.md                   |   53 +++++++++++++++++
 lionss/_lionss/gitter.py           |   85 ++++++++++++++++++++++++++++
 lionss/application                 |   21 +++++++
 lionss/config.py                   |   20 ++++++
 lionss/lionss.py                   |  110 +++++++++++++++++++++++++++++++++++++
 lionss/openshift.htaccess          |    2 
 lionss/static/README               |   12 ++++
 lionss/static/header.png           |binary
 lionss/static/libreoffice-logo.png |binary
 lionss/static/lionss.css           |   42 ++++++++++++++
 lionss/tpl/error.html              |    9 +++
 lionss/tpl/footer.html             |    8 ++
 lionss/tpl/header.html             |   21 +++++++
 lionss/tpl/index.html              |   19 ++++++
 lionss/tpl/result.html             |   30 ++++++++++
 15 files changed, 432 insertions(+)

New commits:
commit f23794f494e7fd020436a7992f6260f18c97e2bb
Author: Mathias Michel <matm at gmx.fr>
Date:   Sun Jul 6 00:10:08 2014 +0200

    fdo#39439: Add lionss, the LibreOffice Normative String Searcher
    
    python website to find places where a string is defined and used, providing opengrok links.
    
    Change-Id: I0fb8ace8217d7f5c9b29f598fbff71f0a1e05cb2
    Reviewed-on: https://gerrit.libreoffice.org/10096
    Reviewed-by: Caolán McNamara <caolanm at redhat.com>
    Tested-by: Caolán McNamara <caolanm at redhat.com>

diff --git a/lionss/README.md b/lionss/README.md
new file mode 100644
index 0000000..94ea1b9
--- /dev/null
+++ b/lionss/README.md
@@ -0,0 +1,53 @@
+Lionss
+======
+
+Introduction
+--------------
+This Python webapp provides a web GUI to search for UI strings in the LibreOffice code base, show all occurrences, and let you choose which one to look up references for in OpenGrok.
+
+Limitations in OpenGrok forced us to write this app. Otherwise OpenGrok alone could handle these searches.
+
+
+Notes on implementation
+-------------------------------
+### Choices
+
+We used Python 2.7 with the `web.py` and `pylev` packages.
+
+We rely on a standard git repository. Because of how .ui files and their references are laid out, we cannot use a bare repo for now *(at least I don't know how. The code still handles it, but it is no longer supported)*.
+We rely on git being in the path.
+
+The strategy: we query for terms built only from the letters of the search string and of the same length (plus one for a possible hotkey sign). Then we refine with the Levenshtein algorithm. So wildcards are not allowed in the search field. Once we have found the referenced text in a .ui file, we search for it in the sources to provide all its uses, and link them to OpenGrok.
+
+### WebApp
+
+We kept the module layout although the app is very small, because it is also training for my Python skills.
+
+#### Config
+
+The configuration file holds:
+
+* the git repo path
+* the OpenGrok LO base url for queries
+* the analysis config: file extensions and patterns for deciphering. It is held in a dict because we may want more items later (as we had with [hs]rc + ui).
+
+### Script
+
+Not done, since the move to .ui invalidates the current work. I will wait for validation of the webapp before starting on the script.
+
+*Draft*: The Python script follows roughly the same workflow, but shows you file paths and lines so you can go through them in your shell.
+
+### Deployment
+
++ Bundled webserver of `web.py`: smooth
++ Managed to configure Apache + mod_wsgi: some tricks, but that's Apache
++ Tried Heroku, but it lacks a filesystem (was simple, though)
++ Tried OpenShift: it has a small-quota filesystem (1GB) on the free plan, but is a pain to configure
+  + A first level is almost useless, because wsgi expects either a ./wsgi.py or a /wsgi with some content.
+  + Static files are expected in a specific place, so if you want to keep the framework structure, you need a `.htaccess` to redirect that.
+  + It doesn't accept a module folder whose name is the same as the base script.
+  + To keep in the 1GB allowed:
+    + `git clone -n --single-branch git://gerrit.libreoffice.org/core lo_core` (~900MB out of 1GB)
+    + `git config core.sparsecheckout true`
+    + `echo *.ui > .git/info/sparse-checkout`
+    + `git grep -l "" HEAD -- *.ui  | awk -F:  '{print $2}' | xargs git checkout HEAD --`
diff --git a/lionss/_lionss/__init__.py b/lionss/_lionss/__init__.py
new file mode 100644
index 0000000..e69de29
diff --git a/lionss/_lionss/gitter.py b/lionss/_lionss/gitter.py
new file mode 100644
index 0000000..6902f23
--- /dev/null
+++ b/lionss/_lionss/gitter.py
@@ -0,0 +1,85 @@
+#!/usr/bin/env python
+#
+# This file is part of the LibreOffice project.
+#
+# This Source Code Form is subject to the terms of the Mozilla Public
+# License, v. 2.0. If a copy of the MPL was not distributed with this
+# file, You can obtain one at http://mozilla.org/MPL/2.0/.
+#
+
+import subprocess
+import os
+import pylev  # Levenshtein distance module
+
+class worker:
+    def __init__(self, needle, case, repo_path):
+        self.goal = needle
+        self.case = case
+        self.proposals = dict()
+        
+        if os.path.exists(os.path.join(repo_path, '.git')):
+            self.git_dir = '--git-dir=' + os.path.join(repo_path, '.git')
+            self.worktree = '--work-tree=' + repo_path
+        elif os.path.exists(os.path.join(repo_path, 'git')):
+            self.git_dir = '--git-dir=' + repo_path
+            self.worktree = ''
+        else:
+            raise Exception('git repo path not found. Repo must exist and be up-to-date !')
+
+
+    def start(self, gset):
+        self.ggsettings = gset
+        # rough pattern building: all chars of the word(s) + hotkey sign
+        items = len(self.goal)
+        goalpat = ''.join(set(self.goal + self.ggsettings['hotkey']))
+        # add +1 for potential hotkey sign
+        pattern_counter = '{' + str(items) + ',' + str(items + 1) + '}'
+        fullpat = self.ggsettings['pattern_prefix'] + '[' + goalpat + ']' + pattern_counter
+
+        try:
+            gg_opt = '-EnI'
+            if not self.case: gg_opt += 'i'
+            
+            gg_matches = subprocess.check_output(
+                        ["git", self.git_dir, self.worktree] + 
+                            ['grep', gg_opt, fullpat.encode('ascii'), '--'] + 
+                            self.ggsettings['file_selectors'] + ['HEAD'],
+                        stderr=subprocess.STDOUT)
+
+        except subprocess.CalledProcessError as e:
+            if e.returncode == 1:  # git grep found nothing
+                return
+            else:
+                raise(e)
+        except:
+            raise
+            
+        line_matches = gg_matches.splitlines()
+        dbg = ""
+        for match in line_matches:
+            [fname, line, text] = match.split(':', 2)
+            goalmatch_real = text.split(self.ggsettings['text_splitter'][0])[1]\
+                        .split(self.ggsettings['text_splitter'][1])[0] 
+            if self.case: goalmatch = goalmatch_real
+            else: goalmatch = goalmatch_real.lower()
+            skip = False
+            for word in self.goal.split(' '):
+                if not self.case: word = word.lower()
+                if not word in goalmatch: skip = True
+            if skip: continue    
+
+            if goalmatch_real not in self.proposals:
+                self.proposals[goalmatch_real] = [[fname, line]]
+            else:
+                self.proposals[goalmatch_real] += [[fname, line]]
+        #~ return str([dbg,gg_matches]+["git", self.git_dir, self.worktree] +
+                   #~ ['grep', gg_opt, fullpat.encode('ascii'), '--'] +
+                   #~ self.ggsettings['file_selectors']);
+    
+    def apply_lev(self, threshold):
+        if self.proposals:
+            for value in self.proposals.keys():
+                if pylev.levenshtein(value, self.goal) > threshold:
+                    del self.proposals[value]
+    
+# EOF
diff --git a/lionss/application b/lionss/application
new file mode 100644
index 0000000..da46194
--- /dev/null
+++ b/lionss/application
@@ -0,0 +1,21 @@
+#!/usr/bin/python
+import os
+import sys
+
+sys.path.insert(0, os.path.dirname(__file__) or '.')
+
+PY_DIR = os.path.join(os.environ['OPENSHIFT_HOMEDIR'], "python")
+
+virtenv = PY_DIR + '/virtenv/'
+
+PY_CACHE = os.path.join(virtenv, 'lib', os.environ['OPENSHIFT_PYTHON_VERSION'], 'site-packages')
+
+os.environ['PYTHON_EGG_CACHE'] = os.path.join(PY_CACHE)
+virtualenv = os.path.join(virtenv, 'bin/activate_this.py')
+
+try:
+    exec(open(virtualenv).read(), dict(__file__=virtualenv))
+except IOError:
+    pass
+
+from lionss import application
diff --git a/lionss/config.py b/lionss/config.py
new file mode 100644
index 0000000..2cab507
--- /dev/null
+++ b/lionss/config.py
@@ -0,0 +1,20 @@
+#
+# This file is part of the LibreOffice project.
+#
+# This Source Code Form is subject to the terms of the Mozilla Public
+# License, v. 2.0. If a copy of the MPL was not distributed with this
+# file, You can obtain one at http://mozilla.org/MPL/2.0/.
+#
+import os
+
+# Variables for PyLoNS
+repo_localpath = '/var/www/git/core'
+# Openshift repo_localpath = os.environ['OPENSHIFT_DATA_DIR']+'lo_core'
+og_root = 'http://opengrok.libreoffice.org/search?project=core&q='
+#~ pattern_prefix, file_selectors, file_splitter
+gg_settings = [dict() for x in range(1)]
+gg_settings[0] = dict( pattern_prefix = '<property name="label" translatable="yes">',
+    hotkey = '_',
+    file_selectors = ['*.ui'], 
+    text_splitter = '><',
+    text_picker = 'fname' )
diff --git a/lionss/lionss.py b/lionss/lionss.py
new file mode 100644
index 0000000..6133f5f
--- /dev/null
+++ b/lionss/lionss.py
@@ -0,0 +1,110 @@
+#!/usr/bin/env python
+# -*- Mode: makefile-gmake; tab-width: 4; indent-tabs-mode: t -*-
+#
+# This file is part of the LibreOffice project.
+#
+# This Source Code Form is subject to the terms of the Mozilla Public
+# License, v. 2.0. If a copy of the MPL was not distributed with this
+# file, You can obtain one at http://mozilla.org/MPL/2.0/.
+#
+
+# LibreOffice Normative-Strings Searcher
+import web
+from web import form
+import os
+import subprocess
+from subprocess import CalledProcessError
+import traceback
+import _lionss.gitter
+from config import *
+
+version = '0.7.1'
+urls = (
+    '/', 'index',
+    '/pick(.*)', 'pick'
+)
+
+render = web.template.render(os.path.join(os.path.dirname(__file__), 'tpl/'))
+
+searcher = form.Form(
+    form.Textbox('SString', form.notnull, description = 'Searched String'),
+    form.Textbox('lev', 
+                    form.regexp('\d+', 'Must be a figure between 1 (strict) and 100 (loose)'),
+                    form.Validator('Must be more than 0', lambda x:int(x)>0),
+                    form.Validator('Must be less than 101', lambda x:int(x)<=100),
+                    description = 'Strictness', size = "5", default = "0", value = "1" ),
+    form.Checkbox('case', description = 'Case-Sensitive', value='case', checked='true'),
+    form.Button('Try to find',type = "submit"),
+    ) 
+
+
+web.template.Template.globals['footerhtml'] = render.footer()
+
+class index:
+    def GET(self):
+        web.template.Template.globals['headerhtml'] = render.header(version, '')
+        ttf = searcher() # ttf = Try To Find :)
+        return render.index(ttf)
+
+    def POST(self):
+        web.template.Template.globals['url'] = web.ctx['realhome']
+        web.template.Template.globals['headerhtml'] = render.header(version, 'ERROR')
+        ttf = searcher()
+        if not ttf.validates():
+            return render.index(ttf)
+        dbgstr = ""
+
+        try:
+            finder = _lionss.gitter.worker(ttf.SString.value, 
+                                ttf.case.checked, repo_localpath)
+
+            # search for approximate values
+            dbg = finder.start( gg_settings[0])
+
+            # check for levenshtein test
+            finder.apply_lev(int(ttf.lev.value))
+            web.template.Template.globals['headerhtml'] = render.header(version, 'Search results')
+
+            # we will crash if there are empty proposals. Should only occur if 
+            # generic structure of file change (split of string inside grep result)
+            return render.result(finder.proposals, str(ttf.SString.value))
+            #~ return render.result(finder.proposals, str(dbg))
+
+        except CalledProcessError as e:
+            return render.error(str(e))
+        except Exception as e:
+            return render.error(traceback.format_exc()+"\n"+ dbgstr)
+
+
+class pick:
+    def GET(self, mangled):
+        ''' [http://127.0.0.1:8080/pick]/Smart%20Tag/sw/source/ui/smartmenu/stmenu.src/32 '''
+        ''' None needle filename line '''
+        
+        web.template.Template.globals['headerhtml'] = render.header(version, 'ERROR')
+        identity = mangled.split('/')
+        
+        if identity[0]:
+            return render.error('MALFORMED URL ::' + identity[0] + '::')
+            
+        filename = os.path.join(repo_localpath, '/'.join(identity[2:-1]))
+        line = int(identity[-1])
+        
+        resid = filename.split('uiconfig/')[1]
+        if isinstance(resid, (int, long)): # resid should be a string
+            return render.error('Stopped at '+str(resid))
+        if not resid:
+            return render.error('Resource ID not found for ' + identity[1])
+        grok_url = og_root + resid
+        raise web.seeother(grok_url)
+
+if __name__ == "__main__":
+    web.config.debug = True
+    app = web.application(urls, globals())
+    app.run()
+else:
+    web.config.debug = False
+    app = web.application(urls, globals(), autoreload=False)
+    application = app.wsgifunc()
+
+# vim: set noet sw=4 ts=4:
diff --git a/lionss/openshift.htaccess b/lionss/openshift.htaccess
new file mode 100644
index 0000000..98940e3
--- /dev/null
+++ b/lionss/openshift.htaccess
@@ -0,0 +1,2 @@
+RewriteEngine On
+RewriteRule ^application/static/(.+)$ /static/$1 [L]
diff --git a/lionss/static/README b/lionss/static/README
new file mode 100644
index 0000000..7ba0cef
--- /dev/null
+++ b/lionss/static/README
@@ -0,0 +1,12 @@
+Public, static content goes here.  Users can create rewrite rules to link to
+content in the static dir.  For example, django commonly uses /media/
+directories for static content.  For example in a .htaccess file in a
+wsgi/.htaccess location, developers could put:
+
+RewriteEngine On
+RewriteRule ^application/media/(.+)$ /static/media/$1 [L]
+
+Then copy the media/* content to yourapp/wsgi/static/media/ and it should
+just work.
+
+Note: The ^application/ part of the URI match is required.
diff --git a/lionss/static/header.png b/lionss/static/header.png
new file mode 100644
index 0000000..f60318d
Binary files /dev/null and b/lionss/static/header.png differ
diff --git a/lionss/static/libreoffice-logo.png b/lionss/static/libreoffice-logo.png
new file mode 100644
index 0000000..cb961b5
Binary files /dev/null and b/lionss/static/libreoffice-logo.png differ
diff --git a/lionss/static/lionss.css b/lionss/static/lionss.css
new file mode 100644
index 0000000..e17eb38
--- /dev/null
+++ b/lionss/static/lionss.css
@@ -0,0 +1,42 @@
+body {
+    font: normal normal 400 14px Sans-Serif;
+}
+
+#header { 
+    width : 935px; 
+    margin : auto;
+    border-bottom: 6px solid #18A303;
+    height: 88px;
+    text-align: center;
+}
+
+#header h1 { display : inline; float : left;}
+#tagline { 
+    vertical-align : middle; 
+    padding-top: 44px; /* to align on logo */
+    font-family : Serif; 
+    font-size: 150%
+}
+#tagline em { color: #18A303; font-style:normal }
+#corpus { width : 935px; margin: auto ; min-height: 400px;}
+#index th {  text-align: right ; padding-right: 8px}
+#footer { font-size : smaller ; text-align : center; color: #18A303;}
+
+#results tr { 
+border-bottom: 1px solid #FFFFFF;
+border-top: 1px solid #FFFFFF;
+}
+
+#results tr td { padding : 3px}
+#results th { background-color: #43C330; color: #FFFFFF; }
+#results tr:nth-child(2n) { 
+background-color: #CCF4C6;
+}
+
+#results tr:nth-child(2n+3) { 
+background: #92E285;
+}
+
+.wrong { color: red;}
+
+#corpus a { color: #18A303; }
diff --git a/lionss/tpl/error.html b/lionss/tpl/error.html
new file mode 100644
index 0000000..e894ce4
--- /dev/null
+++ b/lionss/tpl/error.html
@@ -0,0 +1,9 @@
+$def with (res)
+
+$:headerhtml
+<h2>Error</h2>
+<pre>
+$res
+</pre>
+<p>Press back and try again. If reproducible, contact us below.</p>
+$:footerhtml
diff --git a/lionss/tpl/footer.html b/lionss/tpl/footer.html
new file mode 100644
index 0000000..134ff27
--- /dev/null
+++ b/lionss/tpl/footer.html
@@ -0,0 +1,8 @@
+<div id="footer">
+	If you need help or want to discuss, do not hesitate to join 
+	<a href="irc://chat.freenode.net/libreoffice-dev">#libreoffice-dev on freenode IRC</a>
+</div>
+
+</body>
+
+</html>
diff --git a/lionss/tpl/header.html b/lionss/tpl/header.html
new file mode 100644
index 0000000..9691f86
--- /dev/null
+++ b/lionss/tpl/header.html
@@ -0,0 +1,21 @@
+$def with (ver, subtitle)
+
+$code:
+    if subtitle:
+       subtitle = ' // ' + subtitle 
+
+<html>
+<head>
+	<title>LiONSS $ver $subtitle</title>
+	<link rel="stylesheet" type="text/css" href="/static/lionss.css" />
+</head>
+		
+<body>
+<div id="header">
+	<h1>
+		<a href="http://www.libreoffice.org" title="LibreOffice website">
+			<img src="/static/libreoffice-logo.png" alt="LibreOffice" />
+		</a>
+	</h1>
+	<p id="tagline">LiONSS: <em>Li</em>bre<em>O</em>ffice <em>N</em>ormative-<em>S</em>trings <em>S</em>earcher</p>
+</div>
diff --git a/lionss/tpl/index.html b/lionss/tpl/index.html
new file mode 100644
index 0000000..43704bf
--- /dev/null
+++ b/lionss/tpl/index.html
@@ -0,0 +1,19 @@
+$def with (more)
+
+$:headerhtml
+<div id="corpus">
+	<div>
+		<p>Welcome!</p>
+		
+		<p>So you want to search some translatable strings? Let's go:</p>
+	</div>
+
+	<form method="post" >
+		$:more.render()
+	</form>
+	<p>
+	<small>Note on strictness: 1 = strict, 100 = very loose. If you search 
+	for substrings («Update links» instead of «Update links 
+	when opening»), make it very loose.</small></p>
+</div>
+$:footerhtml
diff --git a/lionss/tpl/result.html b/lionss/tpl/result.html
new file mode 100644
index 0000000..8926308
--- /dev/null
+++ b/lionss/tpl/result.html
@@ -0,0 +1,30 @@
+$def with (res_dict, hint)
+
+$:headerhtml
+<div id="corpus">
+<a href="/">Restart</a>
+<p>
+This is what I searched for: <b>$hint</b> <br />
+But I need to tell you
+$if not res_dict:
+    <h2>No result found for $hint. Please click Restart and search again.</h2>
+$else:
+    I found this:
+    <table id="results">
+        <tr><th>Text</th><th>Locations</th></tr>
+    $for item in res_dict:
+        <tr><td>$item </td>
+        <td>
+        <ul>
+            $for loc in res_dict[item]:
+            <li>$loc[0] at line 
+            <a href="$url/pick/$hint/$loc[0]/$loc[1]" target="_blank">
+            $loc[1]</a>
+            </li>
+        </ul>
+        </td>
+        </tr>
+    </table>
+</p>
+</div>
+$:footerhtml
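
The search strategy described in the README of this commit (a coarse character-class query refined by Levenshtein distance) can be sketched roughly as below. This is only an illustration, not the committed code: it uses Python's `re` module instead of `git grep`, and a small pure-Python `levenshtein` as a stand-in for `pylev.levenshtein`. The function names `build_pattern` and `refine` are hypothetical.

```python
import re


def build_pattern(needle, prefix, hotkey="_"):
    """Coarse regex: the prefix, then n to n+1 characters drawn from the needle.

    Mirrors the idea in gitter.py: every character of the needle plus the
    hotkey sign goes into one character class, repeated len(needle) times,
    with one extra repetition allowed for a possible hotkey sign.
    """
    chars = "".join(sorted(set(needle + hotkey)))
    n = len(needle)
    return re.escape(prefix) + "[" + re.escape(chars) + "]{%d,%d}" % (n, n + 1)


def levenshtein(a, b):
    """Edit distance between a and b (stand-in for pylev.levenshtein)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def refine(candidates, needle, threshold):
    """Keep candidates whose distance to the needle is at most threshold."""
    return [c for c in candidates if levenshtein(c, needle) <= threshold]
```

For example, with the needle `Update links` and a threshold of 1, the candidate `_Update links` survives `refine`, because only the hotkey underscore differs, while `Update links when opening` is dropped. This is why the index page advises a very loose strictness when searching for substrings.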

