[Libreoffice-bugs] [Bug 116666] New: Fix Hungarian sorting
bugzilla-daemon at bugs.documentfoundation.org
Tue Mar 27 19:19:06 UTC 2018
https://bugs.documentfoundation.org/show_bug.cgi?id=116666
Bug ID: 116666
Summary: Fix Hungarian sorting
Product: LibreOffice
Version: Inherited From OOo
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: medium
Component: Localization
Assignee: libreoffice-bugs at lists.freedesktop.org
Reporter: nemeth at numbertext.org
Hungarian orthography rules contain the following extra requirements for
sorting words and sentences:
– expand simplified double consonants;
– ignore spaces and hyphens;
– prefer lower case homonyms.
(Source: http://helyesírás.mta.hu/helyesiras/default/akh12#F2_4)
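
As a rough illustration of the last two requirements, here is a minimal Python sketch of a sort key that ignores spaces and hyphens and puts the lower-case form first when two entries are otherwise identical. The function name `basic_sort_key` is hypothetical, and the key compares plain code points, so it does not handle accented letters or multi-letter consonants. It is not how LibreOffice or ICU implements sorting.

```python
def basic_sort_key(text):
    """Sketch of a sort key: ignore spaces/hyphens, prefer lower case on ties."""
    # Drop the characters that must not influence the order.
    stripped = "".join(ch for ch in text if ch not in (" ", "-"))
    # Primary level: case-insensitive comparison of the remaining characters.
    primary = stripped.casefold()
    # Tie-breaker: False sorts before True, so lower-case letters come first.
    case_level = [ch.isupper() for ch in stripped]
    return (primary, case_level)


words = ["Sas", "sas", "kis-kutya", "kiskacsa", "kis malac"]
ordered = sorted(words, key=basic_sort_key)
print(ordered)
```

A plain `sorted(words)` would instead put "Sas" first and order the three "kis" entries by the space and hyphen characters.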
Expansion of double consonants (e.g. sorting “ccs”, the long “cs”, as “cscs”) is
still not perfect, but in my analysis it cuts the number of bad sorting
positions to about one fifth of that obtained without expansion (3843 vs. 19425
in 4 million word forms).
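
The expansion step can be sketched in Python as a simple table-driven substitution. The table below lists the simplified doubles of the Hungarian multi-letter consonants; the names `EXPANSIONS` and `expand_double_consonants` are hypothetical, and the function works on lower-cased text only. It is deliberately naive: it also rewrites sequences that merely look like a simplified double, which is one way the approach stays imperfect.

```python
import re

# Simplified doubled multi-letter consonants and their full forms.
EXPANSIONS = {
    "ddzs": "dzsdzs",
    "ccs": "cscs",
    "ddz": "dzdz",
    "ggy": "gygy",
    "lly": "lyly",
    "nny": "nyny",
    "ssz": "szsz",
    "tty": "tyty",
    "zzs": "zszs",
}

# Longest alternatives first, so "ddzs" is tried before "ddz".
_PATTERN = re.compile("|".join(sorted(EXPANSIONS, key=len, reverse=True)))


def expand_double_consonants(word):
    """Return the lower-cased word with simplified doubles written out."""
    return _PATTERN.sub(lambda m: EXPANSIONS[m.group(0)], word.lower())


print(expand_double_consonants("loccsan"))    # ccs  -> cscs
print(expand_double_consonants("hosszú"))     # ssz  -> szsz
print(expand_double_consonants("briddzsel"))  # ddzs -> dzsdzs
# Naive rule misfires on a prefix boundary: "meggyőz" is "meg" + "győz".
print(expand_double_consonants("meggyőz"))
```

The last call shows why some preprocessing (manual or dictionary-based) is still needed: the "g" + "gy" at the boundary of "meg" and "győz" is not a long "gy", yet the substitution expands it anyway.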
A more important advantage is that full expansion makes it possible to automate
Hungarian sorting with manual (or, in the future, Hunspell-based) preprocessing.
(Unfortunately, the ICU collation algorithm alone is not yet enough for
Hungarian.) Inserting soft hyphens is a quick workaround here, too (as it is for
the similar problem of single consonants, e.g. “igazság” -> igazság
(igaz[U+AD]ság), which is then correctly sorted before “igaztalan”).
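
The effect of the soft hyphen can be illustrated with a small Python sketch that splits a word into Hungarian letters (greedily matching "dzs" and the digraphs) and treats U+00AD as a boundary that a multi-letter consonant cannot cross. The names `letters` and `letter_key` are hypothetical. Ranking each letter by its position in the 44-letter alphabet is a simplification that does not model how short and long vowels are treated, and this is not the ICU algorithm.

```python
SOFT_HYPHEN = "\u00ad"

ALPHABET = [
    "a", "á", "b", "c", "cs", "d", "dz", "dzs", "e", "é", "f", "g", "gy",
    "h", "i", "í", "j", "k", "l", "ly", "m", "n", "ny", "o", "ó", "ö", "ő",
    "p", "q", "r", "s", "sz", "t", "ty", "u", "ú", "ü", "ű", "v", "w", "x",
    "y", "z", "zs",
]
RANK = {letter: index for index, letter in enumerate(ALPHABET)}


def letters(word):
    """Split a word into Hungarian letters; soft hyphens break digraphs."""
    result = []
    for chunk in word.lower().split(SOFT_HYPHEN):
        i = 0
        while i < len(chunk):
            # Try the trigraph, then digraphs, then single letters.
            for size in (3, 2, 1):
                piece = chunk[i:i + size]
                if piece in RANK:
                    break
            else:
                piece = chunk[i]  # unknown character, kept as-is
            result.append(piece)
            i += len(piece)
    return result


def letter_key(word):
    """Sort key: alphabet positions; unknown characters sort last."""
    return [RANK.get(letter, len(ALPHABET)) for letter in letters(word)]


plain = sorted(["igazság", "igaztalan"], key=letter_key)
hinted = sorted(["igaz\u00adság", "igaztalan"], key=letter_key)
print(plain)   # "zs" read as one letter, so "igazság" lands after "igaztalan"
print(hinted)  # soft hyphen separates "z" and "s", giving the correct order
```

Without the soft hyphen the tokenizer reads "zs" as a single letter that sorts after "z"; with it, the fifth letter is "s", which precedes "t".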