[Grammar checker] Undocumented change in the API for LO 4
Olivier R.
olivier.noreply at gmail.com
Sat Feb 23 13:19:12 PST 2013
Hello Caolan,
Caolán McNamara wrote
> No idea, IMO it would be worth bibisecting this to find where it changed
> to re-examine if it was intentional or not. Would then likely be a good
> candidate for a unit test to lock in whichever is the right behaviour.
If someone did this intentionally he should say it. ;)
As a Windows user I can’t bibisect, but I investigated. (I’ll try to
bibisect when I have the time.)
doProofreading is called only in the file
core/linguistic/source/gciterator.cxx
It’s used to call the grammar checker for a given language.
Few changes last months:
http://cgit.freedesktop.org/libreoffice/core/log/linguistic/source/gciterator.cxx
For testing purpose, I change the file Lightproof.py in LibreOffice
4.0\share\extensions\dict-en
I added the line:
sys.stdout = open("D:\_lightproof_stdout.txt", "w")
before the class Lightproof.
And I added the lines:
print("PARAGRAPH: ", rText)
print(nStartOfSentencePos, nSuggestedSentenceEndPos,
rText[nStartOfSentencePos:nSuggestedSentenceEndPos])
sys.stdout.flush()
at the beginning of the method doProofreading.
(See the code at the end of this mail.)
Then I opened LibreOffice and I copy/paste a paragraph in document whose
language is set to English (USA).
The paragraph is:
Called up for active service in 1939, Jackson served with No. 23 Squadron in
Australia before he was posted to the Middle East in November 1940. A
charity hockey game held to benefit former player Bill Heindl, Jr. in 1980
was the only occasion that hockey legends Bobby Orr and Wayne. The total
floor area of St. Michael's Cathedral is 2,740 square metres.
The result:
PARAGRAPH:
0 0
PARAGRAPH: Called up for active service in 1939, Jackson served with No. 23
Squadron in Australia before he was posted to the Middle East in November
1940. A charity hockey game held to benefit former player Bill Heindl, Jr.
in 1980 was the only occasion that hockey legends Bobby Orr and Wayne. The
total floor area of St. Michael's Cathedral is 2,740 square metres.
0 61 Called up for active service in 1939, Jackson served with No.
PARAGRAPH: Called up for active service in 1939, Jackson served with No. 23
Squadron in Australia before he was posted to the Middle East in November
1940. A charity hockey game held to benefit former player Bill Heindl, Jr.
in 1980 was the only occasion that hockey legends Bobby Orr and Wayne. The
total floor area of St. Michael's Cathedral is 2,740 square metres.
62 144 23 Squadron in Australia before he was posted to the Middle East in
November 1940.
PARAGRAPH: Called up for active service in 1939, Jackson served with No. 23
Squadron in Australia before he was posted to the Middle East in November
1940. A charity hockey game held to benefit former player Bill Heindl, Jr.
in 1980 was the only occasion that hockey legends Bobby Orr and Wayne. The
total floor area of St. Michael's Cathedral is 2,740 square metres.
145 284 A charity hockey game held to benefit former player Bill Heindl, Jr.
in 1980 was the only occasion that hockey legends Bobby Orr and Wayne.
PARAGRAPH: Called up for active service in 1939, Jackson served with No. 23
Squadron in Australia before he was posted to the Middle East in November
1940. A charity hockey game held to benefit former player Bill Heindl, Jr.
in 1980 was the only occasion that hockey legends Bobby Orr and Wayne. The
total floor area of St. Michael's Cathedral is 2,740 square metres.
285 312 The total floor area of St.
PARAGRAPH: Called up for active service in 1939, Jackson served with No. 23
Squadron in Australia before he was posted to the Middle East in November
1940. A charity hockey game held to benefit former player Bill Heindl, Jr.
in 1980 was the only occasion that hockey legends Bobby Orr and Wayne. The
total floor area of St. Michael's Cathedral is 2,740 square metres.
313 356 Michael's Cathedral is 2,740 square metres.
So the small paragraph is split this way:
Called up for active service in 1939, Jackson served with No.
23 Squadron in Australia before he was posted to the Middle East in
November 1940.
A charity hockey game held to benefit former player Bill Heindl, Jr. in
1980 was the only occasion that hockey legends Bobby Orr and Wayne.
The total floor area of St.
Michael's Cathedral is 2,740 square metres.
1 paragraph of 3 sentences, split in 5 parts.
It splits after No., after St., but strangely not after Jr.
A paragraph is passed to the grammar checker as many times as the number of
sentences found. In this case, 5 times.
It’s not a proper behavior for the English language, and it provokes false
alarms in the French grammar checker. I didn’t test with other languages,
but I assume it’s probably not a good thing for them either, as we can’t be
sure a dot is the end of a sentence.
In LO 3.6, each paragraph was passed only once to the GC, and it was up to
the GC to split the paragraph or not.
Regards,
Olivier
** the modified Lightproof file **
# -*- encoding: UTF-8 -*-
# Lightproof grammar checker for LibreOffice and OpenOffice.org
# 2009-2012 (c) László Németh (nemeth at numbertext org), license: MPL 1.1 /
GPLv3+ / LGPLv3+
import uno, unohelper, os, sys, traceback
from lightproof_impl_en import locales
from lightproof_impl_en import pkg
import lightproof_impl_en
import lightproof_handler_en
from com.sun.star.linguistic2 import XProofreader, XSupportedLocales
from com.sun.star.linguistic2 import ProofreadingResult,
SingleProofreadingError
from com.sun.star.lang import XServiceInfo, XServiceName,
XServiceDisplayName
from com.sun.star.lang import Locale
# reload in obj.reload in Python 3
try:
from obj import reload
except:
pass
try:
sys.stdout = open("D:\_lightproof_stdout.txt", "w")
except:
pass
class Lightproof( unohelper.Base, XProofreader, XServiceInfo, XServiceName,
XServiceDisplayName, XSupportedLocales):
def __init__( self, ctx, *args ):
self.ctx = ctx
self.ServiceName = "com.sun.star.linguistic2.Proofreader"
self.ImplementationName = "org.libreoffice.comp.pyuno.Lightproof." +
pkg
self.SupportedServiceNames = (self.ServiceName, )
self.locales = []
for i in locales:
l = locales[i]
self.locales += [Locale(l[0], l[1], l[2])]
self.locales = tuple(self.locales)
currentContext = uno.getComponentContext()
lightproof_impl_en.SMGR = currentContext.ServiceManager
lightproof_impl_en.spellchecker = \
lightproof_impl_en.SMGR.createInstanceWithContext("com.sun.star.linguistic2.SpellChecker",
currentContext)
lightproof_handler_en.load(currentContext)
# XServiceName method implementations
def getServiceName(self):
return self.ImplementationName
# XServiceInfo method implementations
def getImplementationName (self):
return self.ImplementationName
def supportsService(self, ServiceName):
return (ServiceName in self.SupportedServiceNames)
def getSupportedServiceNames (self):
return self.SupportedServiceNames
# XSupportedLocales
def hasLocale(self, aLocale):
if aLocale in self.locales:
return True
for i in self.locales:
if (i.Country == aLocale.Country or i.Country == "") and
aLocale.Language == i.Language:
return True
return False
def getLocales(self):
return self.locales
# XProofreader
def isSpellChecker(self):
return False
def doProofreading(self, nDocId, rText, rLocale, nStartOfSentencePos,
nSuggestedSentenceEndPos, rProperties):
print("PARAGRAPH: ", rText)
print(nStartOfSentencePos, nSuggestedSentenceEndPos,
rText[nStartOfSentencePos:nSuggestedSentenceEndPos])
sys.stdout.flush()
aRes = uno.createUnoStruct(
"com.sun.star.linguistic2.ProofreadingResult" )
aRes.aDocumentIdentifier = nDocId
aRes.aText = rText
aRes.aLocale = rLocale
aRes.nStartOfSentencePosition = nStartOfSentencePos
aRes.nStartOfNextSentencePosition = nSuggestedSentenceEndPos
aRes.aProperties = ()
aRes.xProofreader = self
aRes.aErrors = ()
if len(rProperties) > 0 and rProperties[0].Name == "Update":
try:
import lightproof_compile_en
try:
code = lightproof_compile_en.c(rProperties[0].Value,
rLocale.Language, True)
except Exception as e:
aRes.aText, aRes.nStartOfSentencePosition = e
return aRes
path = lightproof_impl_en.get_path()
f = open(path.replace("_impl", ""), "w")
f.write("dic = %s" % code["rules"])
f.close()
if pkg in lightproof_impl_en.langrule:
mo = lightproof_impl_en.langrule[pkg]
reload(mo)
lightproof_impl_en.compile_rules(mo.dic)
lightproof_impl_en.langrule[pkg] = mo
if "code" in code:
f = open(path, "r")
ft = f.read()
f.close()
f = open(path, "w")
f.write(ft[:ft.find("# [code]") + 8] + "\n" +
code["code"])
f.close()
try:
reload(lightproof_impl_en)
except Exception as e:
aRes.aText = e.args[0]
if e.args[1][3] == "": # "expected an indented
block" (end of file)
aRes.nStartOfSentencePosition =
len(rText.split("\n"))
else:
aRes.nStartOfSentencePosition =
rText.split("\n").index(e.args[1][3][:-1]) + 1
return aRes
aRes.aText = ""
return aRes
except:
if 'PYUNO_LOGLEVEL' in os.environ:
print(traceback.format_exc())
l = rText[nSuggestedSentenceEndPos:nSuggestedSentenceEndPos+1]
while l == " ":
aRes.nStartOfNextSentencePosition =
aRes.nStartOfNextSentencePosition + 1
l =
rText[aRes.nStartOfNextSentencePosition:aRes.nStartOfNextSentencePosition+1]
if aRes.nStartOfNextSentencePosition == nSuggestedSentenceEndPos and
l!="":
aRes.nStartOfNextSentencePosition = nSuggestedSentenceEndPos + 1
aRes.nBehindEndOfSentencePosition =
aRes.nStartOfNextSentencePosition
try:
aRes.aErrors = lightproof_impl_en.proofread( nDocId, rText,
rLocale, \
nStartOfSentencePos, aRes.nBehindEndOfSentencePosition,
rProperties)
except Exception as e:
if len(rProperties) > 0 and rProperties[0].Name == "Debug" and
len(e.args) == 2:
aRes.aText, aRes.nStartOfSentencePosition = e
else:
if 'PYUNO_LOGLEVEL' in os.environ:
print(traceback.format_exc())
return aRes
def ignoreRule(self, rid, aLocale):
lightproof_impl_en.ignore[rid] = 1
def resetIgnoreRules(self):
lightproof_impl_en.ignore = {}
# XServiceDisplayName
def getServiceDisplayName(self, aLocale):
return lightproof_impl_en.name
g_ImplementationHelper = unohelper.ImplementationHelper()
g_ImplementationHelper.addImplementation( Lightproof, \
"org.libreoffice.comp.pyuno.Lightproof." + pkg,
("com.sun.star.linguistic2.Proofreader",),)
g_ImplementationHelper.addImplementation(
lightproof_handler_en.LightproofOptionsEventHandler, \
"org.libreoffice.comp.pyuno.LightproofOptionsEventHandler." + pkg,
("com.sun.star.awt.XContainerWindowEventHandler",),)
--
View this message in context: http://nabble.documentfoundation.org/Grammar-checker-Undocumented-change-in-the-API-for-LO-4-tp4030639p4039657.html
Sent from the Dev mailing list archive at Nabble.com.
More information about the LibreOffice
mailing list