[PATCH-3-5] fdo#35270 - kill first-use grammar checker freeze ...

Olivier R. olivier.noreply at gmail.com
Mon May 21 04:34:06 PDT 2012


Hello Michael,


Michael Meeks-2 wrote
> 
> 	Beyond that - I've no idea :-) presumably there are philosophical
> differences between lightproof and languagetool that I'm not clued-up
> on. It'd be interesting to hear what Laszlo & you think about it, and
> other people's views on the list ...
> 

IMHO the differences are more technical than philosophical, even though
each tool has its own specific features. Both parse text using a set of
written rules that describe possible mistakes. I have already converted LT
rules for LP, and LP rules can also be converted for LT, but you can’t always
write exactly the same rules. LanguageTool checks every word in a sentence,
whereas Lightproof checks only what you ask it to check, based on regex
triggers.
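
To make the contrast concrete, here is a rough Python sketch of the two
styles. It is not code from Lightproof or LanguageTool: the doubled-word
rule and both function names are made up for illustration only.

```python
import re

# Hypothetical rule: flag a doubled word ("is is").
# Not taken from Lightproof or LanguageTool.
DOUBLED = re.compile(r"\b(\w+) \1\b", re.IGNORECASE)

def check_by_trigger(sentence):
    """Regex-trigger style: only text matching the trigger is examined."""
    return [(m.start(), m.group(0)) for m in DOUBLED.finditer(sentence)]

def check_by_tokens(sentence):
    """Token-walking style: every word is visited and compared with the next."""
    tokens = [(m.start(), m.group(0)) for m in re.finditer(r"\w+", sentence)]
    hits = []
    for (pos, word), (_, nxt) in zip(tokens, tokens[1:]):
        if word.lower() == nxt.lower():
            hits.append((pos, word + " " + nxt))
    return hits
```

Both functions find the same mistake here; the difference is that the
first one does no work at all on text that never matches a trigger.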

LT rules are written in a very formal way, in XML files.
LP rules are written with a lot of regex mixed with Python code.


One of the main differences is how the grammar checkers retrieve
information about words.
Lightproof calls the spell checker Hunspell to find out what a word might be
(part of speech, etc.), whereas LanguageTool bundles its own lexicon for
every language.

AFAIK, at the moment, only the Hungarian Hunspell dictionary and the French
one are grammatically tagged.
Although you can do grammar checking without grammatically tagged
dictionaries, rules describing mistakes will be limited.
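
As a small illustration of why tags matter, the sketch below checks
determiner/noun number agreement. The mini-lexicon, the tag names and the
function names are all invented for this example; in practice the tags
would come from Hunspell (for Lightproof) or from an LT lexicon.

```python
import re

# Hypothetical mini-lexicon with made-up tags.
LEXICON = {
    "this": {"det", "sg"},
    "these": {"det", "pl"},
    "book": {"noun", "sg"},
    "books": {"noun", "pl"},
}

def tags(word):
    """Return the set of tags for a word, or an empty set if unknown."""
    return LEXICON.get(word.lower(), set())

def agreement_errors(sentence):
    """Flag determiner + noun pairs whose number tags disagree."""
    words = re.findall(r"\w+", sentence)
    errors = []
    for det, noun in zip(words, words[1:]):
        det_tags, noun_tags = tags(det), tags(noun)
        if "det" in det_tags and "noun" in noun_tags:
            if ("sg" in det_tags) != ("sg" in noun_tags):
                errors.append(f"{det} {noun}")
    return errors
```

Without the tags, such a rule would have to list every determiner and
every noun form explicitly, which is why untagged dictionaries limit what
rules can describe.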

The lack of grammatically tagged Hunspell dictionaries is one of the main
obstacles to improving grammar checking for other languages with Lightproof.



What hope is there for sharing work, rules, etc. ?
Writing rules for both grammar checkers is doable by a human, though doing
the conversion automatically might be difficult.

Working on dictionaries or lexicons is time-consuming.
At the moment, the French Hunspell dictionary is converted regularly into a
lexicon for LT. That’s not really hard, as the Hunspell dictionary was
conceived for grammar checking. I don’t know if converting LT lexicons into
Hunspell dictionaries would be easily feasible. It probably depends on the
complexity of the language. I don’t even know if LT lexicons would be good
replacements for spell checking. Another solution might be to tag Hunspell
dictionaries with LT tags, but probably not all Hunspell dictionaries were
conceived with grammar checking in mind, so that’s probably more a dream
than a doable solution.
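
To give an idea of the Hunspell-to-lexicon direction mentioned above, here
is a minimal sketch, not the actual conversion script. It assumes the
entries are already expanded to inflected forms and carry Hunspell
morphological fields (st: for the stem, po: for the part of speech, is: for
inflection). The tab-separated "form, lemma, tag" output and the way the
tag is joined are assumptions made for the example.

```python
def hunspell_entry_to_lexicon_line(entry):
    # Converts e.g. "chats st:chat po:nom is:pl" into a tab-separated
    # line with three columns: form, lemma, tag.
    word_part, *fields = entry.split()
    form = word_part.split("/", 1)[0]  # drop affix flags, if any
    lemma = form                       # fall back to the form itself
    tag_parts = []
    for field in fields:
        key, _, value = field.partition(":")
        if key == "st":
            lemma = value
        elif key in ("po", "is"):
            tag_parts.append(value)
    return "\t".join([form, lemma, ":".join(tag_parts)])
```

The reverse direction would be harder, since it means working out affix
rules and flags from a flat list of forms.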

If Lightproof could use LT lexicons, that could be a temporary solution, but
possibly a memory-hungry one, as you would have a dictionary for spell
checking and a lexicon for grammar checking loaded at the same time.



> Would that offend the minimal modus operandi of lightproof ? could that
> light mode be made an option ?

Lightproof can work as hard as LT. It just depends on what you ask of it. :)
The Lightproof French grammar checker already does approximately as much
as LT, and will do much more in a few months, I hope. I am working on that.

HTH.

Regards,
Olivier

--
View this message in context: http://nabble.documentfoundation.org/PATCH-3-5-fdo-35270-kill-first-use-grammar-checker-freeze-tp3983822p3984998.html
Sent from the Dev mailing list archive at Nabble.com.