system-wide location for dictionary files and dictionary file name format

Caolán McNamara caolanm at redhat.com
Wed Feb 15 12:42:22 UTC 2017


On Tue, 2017-02-14 at 22:28 +0100, Erik Quaeghebeur wrote:
> Caolán McNamara 2017-2-13 10:31:
> > 
> > For dictionaries (and hyphenation patterns and thesaurus things)
> > under
> > linux we also check for system installed ones. This is
> > DICT_SYSTEM_DIR
> > in configure.ac and lingucomponent/source/lingutil/lingutil.cxx.
> 
> Thanks for your elaborate response. I guess that Red Hat doesn't use
> the dicollecte French dictionaries, if the oxt is not installed, but
> just some dictionaries get dumped into DICT_SYSTEM_DIR.

For Fedora we use
http://www.dicollecte.org/download/fr/hunspell-french-dictionaries-v6.0.2.zip
and just unzip it and rename toutesvariantes.* to fr_FR.* so that it
serves as the default French spelling dictionary.
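
As a rough illustration of that rename step (not the actual Fedora
packaging script), here is a minimal Python sketch. It assumes the
archive has already been unzipped into a directory and that the files
are named toutesvariantes.aff and toutesvariantes.dic as described
above; the function name rename_dictionary is made up for this example.

```python
from pathlib import Path


def rename_dictionary(directory, old_stem="toutesvariantes", new_stem="fr_FR"):
    """Rename old_stem.* to new_stem.* inside directory (hypothetical helper).

    Returns the new file names in sorted order. Files with other stems
    are left alone.
    """
    renamed = []
    for src in sorted(Path(directory).glob(old_stem + ".*")):
        # Keep everything after the stem, e.g. ".aff" or ".dic".
        extension = src.name[len(old_stem):]
        dst = src.with_name(new_stem + extension)
        src.rename(dst)
        renamed.append(dst.name)
    return renamed
```

Calling rename_dictionary on the unzipped directory would leave
fr_FR.aff and fr_FR.dic next to the untouched variant files.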

> Hmm, it seems language tags such as ‘nl’ or ‘fr’ without a region
> component, and which are valid according to bcp47 are not recognized
> by LO. Is this a bug I should report?

Only if you intend to work on solving it, in my opinion. It would be
nice to support a bare "language" dictionary and have it used for all
variants unless a more specific variant is available, but I don't
intend to do it myself, nor to think my way through hacking Firefox,
Enchant and the rest of the things that parse filenames in
/usr/share/myspell or /usr/share/hunspell so that they support it too.
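
To make the wished-for fallback concrete, here is a small Python sketch.
It is purely hypothetical: it is not how LibreOffice, Firefox or Enchant
look up dictionaries today, and the function name pick_dictionary and
the underscore-separated naming are assumptions for the example.

```python
def pick_dictionary(locale, available):
    """Return the most specific dictionary name available for locale.

    Hypothetical lookup: try the full locale first (e.g. "fr_BE"), then
    drop trailing components until a match is found (e.g. "fr").
    Returns None when nothing matches.
    """
    parts = locale.replace("-", "_").split("_")
    while parts:
        candidate = "_".join(parts)
        if candidate in available:
            return candidate
        parts.pop()  # fall back to a less specific name
    return None
```

With only "fr" and "fr_FR" installed, a request for "fr_BE" would get
the bare "fr" dictionary, while "fr_FR" would still get its own.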

> > So for the original question, the answer for installing system wide
> > dictionaries at a distro level is probably to put the .dic and .aff
> > into /usr/share/hunspell and it'll "just work". Special variants
> > need to be named in a bcp47 format to have a chance of getting
> > picked up right, but that's a lesser used codepath so mileage may
> > vary.
> 
> OK, but after reading through the bcp47 RFC, I have the impression
> that only private-use tags for the earlier French example could work:
> fr-x-classique, fr-x-moderne, fr-x-toutesvariantes, and fr-x-
> reforme1990 then, with the possibility of registering fr-1990, it
> seems. I've tried it, and they're not seen by LO.

Yeah, "your mileage may vary" kicks in there, I guess. The main
real-world use case for BCP47 is to distinguish Cyrillic from Latin
Serbian. A couple of things. Firstly, you only get 8 characters for
each subtag after x-, so fr-x-moderne is valid BCP47 while
fr-x-classique is not (but fr-x-classiqu would be). If I add
fr-x-classiqu.aff and fr-x-classiqu.dic, then I see that they are
successfully added to the list of dictionaries available to
LibreOffice.
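
The length limit can be checked mechanically. The Python sketch below
only looks at the private-use part of a tag (everything after x-) and
applies the RFC 5646 rule that each such subtag is 1 to 8 ASCII letters
or digits. It is not a full BCP47 validator and says nothing about
whether LibreOffice will accept the tag; the function name
private_use_part_ok is made up for this example.

```python
def private_use_part_ok(tag):
    """Check only the private-use subtags (after "x-") of a language tag.

    Returns True when there is no private-use part, or when every subtag
    after "x" is 1 to 8 ASCII alphanumeric characters.
    """
    subtags = tag.lower().split("-")
    if "x" not in subtags:
        return True  # nothing private-use to check
    private = subtags[subtags.index("x") + 1:]
    return bool(private) and all(
        1 <= len(s) <= 8 and s.isascii() and s.isalnum() for s in private
    )
```

Of the names from the earlier French example, fr-x-moderne and
fr-x-classiqu pass this check, while fr-x-classique,
fr-x-toutesvariantes and fr-x-reforme1990 are all too long.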

Secondly, what the rest of LibreOffice then does with this is probably
still a little unclear in parts. In our format character dialog, where
we can enter a BCP47 tag directly, I can successfully enter a tag like
de-1996 but not fr-x-whatever, so it appears that private-use tags are
not allowed there for some additional reason I don't know (@erack?).

Anyhow, in Fedora, wrt French spelling, we just took the recommended
dictionary and set it as the system-wide default, fr_FR.

C.

