Copyright infringement and future of Hunspell

Németh László nemeth at numbertext.org
Wed Nov 15 17:19:45 UTC 2017


Hi,

A week ago you modified Hunspell’s license in the official Hunspell
repository without the permission of the author (me) and of the main
contributor and maintainer, Caolán McNamara.

==================================
commit d49170ce949dbe0d2e6ad74b6b876e5580704a5e
Author: Dimitrij Mijoski <dmjpp at hotmail.com>
Date:   Wed Nov 8 18:30:29 2017 +0100

    License everything under LGPLv3+. No more three licenses mumbo jumbo.

commit 6ff9a6fb5a63ee63294131eba7ce4e67624dffa5
Author: PanderMusubi <pander at users.sourceforge.net>
Date:   Wed Nov 8 16:45:35 2017 +0100

    improved copyright and authors
==================================

Free licenses and rich functionality have both helped Hunspell to
spread better multilingual spell checking among desktop and web
applications, so I don’t plan to replace the current MPL/LGPL/GPL
tri-license with LGPL 3.

Moreover, it’s misleading to refer to yourselves as the authors of Hunspell
(see your change in Hunspell’s AUTHORS file), when you are contributors
to the project.

If I understand correctly, these modifications are related to your
Mozilla-funded Hunspell development, in which, unfortunately, I wasn’t
able to take part, and I didn’t follow your Mozilla application last year.
I read about its success(?) and your plan to create a spell checker from
scratch only a few weeks ago. (As far as I know, you informed Caolán and
me only about the first steps of the application.)

Judging from its name and place in the Hunspell repository, “Hunspell 2”
is a future replacement or successor of the Hunspell library and
command-line executable, but it seems to be more like a fork of the
Hunspell development efforts. According to your plan: “That aim for
Hunspell 2.0 is to recreate the most common functionality in Hunspell 1,
and that is detection and correction of spelling errors.”

Reimplementing a subset of the features and dropping dictionary formats
can result in worse spell checking and dictionary incompatibilities
between applications (as I see in the case of the Hungarian dictionary in
your project).

“Hunspell 2” won’t contain functions used by LibreOffice, the main
target of Hunspell development. For example, every thesaurus uses Hunspell
for stemming, and some of them also use it for morphological generation.

You promise the same spelling as in Hunspell, but you’ve already moved
all unit tests of the Hunspell library away to the “v1cmdline” directory.

Spell checking of LaTeX, HTML/XML and OpenDocument files will also be
“dropped” in your development, but this is a basic function for the
targeted academic publishing and automated command-line document editing.

As the author of half of Hunspell’s code base (the other half is
the work of Kevin Hendricks, author of MySpell), I don’t believe your
incomplete rewrite from scratch is a viable option with your limited
resources and experience (one C++ developer, insufficient knowledge
of the aims, usage and implementation of Hunspell features and
dictionaries).

[For example, you wrote the following about the LANG option of the
Hunspell affix file in your analysis: “In the source code is no
implementation existing. Deprecate this option?”, while this option is
actually used in several places in the language-specific parts of
Hunspell. I have just added support for the special casing of the Crimean
Tatar language (extending the Turkish and Azeri support, which was
mentioned in the Hunspell(5) manual page), and also adapted the
orthography changes in the special LANG_hu part of the general
compounding functions.]

See why trying to rewrite from scratch is a huge risk:
https://www.joelonsoftware.com/2000/04/06/things-you-should-never-do-part-i

Please consider Caolán’s more than 700 Hunspell commits: excellent
and unique code cleaning based on Red Hat, LibreOffice and Coverity bug
reports and, partly covering your aims, massive C++11 porting in the
Hunspell library and command-line tool.

I think the most important thing is to open Hunspell to more languages,
supporting research results from academia (see
https://pdfs.semanticscholar.org/dad3/5c719bb8bf5dffa8c757166fd1086be4d6c6.pdf
and http://voikko.puimula.org/architecture.html), improving existing
dictionaries and creating competitive linguistic features, especially for
LibreOffice.

I’m glad that I can work on the Hungarian Hunspell dictionary these
months, supported by the FSF.hu Foundation, Hungary, fixing some minor
problems in Hunspell and LibreOffice, too. Moreover, last week I
adapted an interesting Hunspell feature to LibreOffice. I think this
“Grammar By” improvement of the user dictionaries will be quite useful
for professional Writer users in several languages:
https://wiki.documentfoundation.org/ReleaseNotes/6.0#.E2.80.9CGrammar_By.E2.80.9D_spell_checking

I would be glad to fix the recent regression of the English thesauri
(morphological descriptions were removed by an English dictionary update)
in LibreOffice, refine the parts of Hunspell related to this and to the
“Grammar By” feature, give frequency- and pronunciation-based
suggestions, avoid overgeneration in compounding, support agglutinative
and other complex languages better, document the needs of the languages
currently supported by LibreOffice and the adequacy of the related
Hunspell features, etc.

I am still uncertain what the priorities of large-scale Hunspell
development are, and what is possible to develop, but I’m quite sure
there is a better way to develop Hunspell than relicensing and
rewriting it from scratch.

I would be glad if we could talk about it on the libreoffice-dev list and,
later, also on libreoffice-l10n.

Best regards,
Laszlo