<html>
    <head>
      <base href="https://bugs.documentfoundation.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_UNCONFIRMED "
   title="UNCONFIRMED - Fix Hungarian sorting"
   href="https://bugs.documentfoundation.org/show_bug.cgi?id=116666">116666</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Fix Hungarian sorting
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>LibreOffice
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>Inherited From OOo
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>UNCONFIRMED
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>medium
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Localization
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>libreoffice-bugs@lists.freedesktop.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>nemeth@numbertext.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Hungarian orthography rules contain the following extra requirements for
sorting words and sentences:

– expand simplified double consonants;

– ignore spaces and hyphens;

– prefer lower case homonyms.

(Source: <a href="http://helyesírás.mta.hu/helyesiras/default/akh12#F2_4">http://helyesírás.mta.hu/helyesiras/default/akh12#F2_4</a>)
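
The last two requirements can be sketched as a sort key. This is only an
illustration, not the ICU or LibreOffice implementation: the function name is
hypothetical, only the space and the ASCII hyphen are stripped, and the
primary comparison is by plain code point rather than by the Hungarian
alphabet.

```python
def ignore_separators_key(text):
    """Hypothetical sort key: ignore spaces and hyphens, lower case first."""
    # Primary level: compare with spaces and hyphens removed, ignoring case.
    stripped = "".join(ch for ch in text if ch not in " -")
    # Tie-break: False sorts before True, so lower case homonyms come first.
    return (stripped.lower(), [ch.isupper() for ch in stripped])


words = ["Nagy", "nagy ház", "nagy", "nagy-e", "nagybácsi"]
print(sorted(words, key=ignore_separators_key))
```

With this key, "nagy" is placed before "Nagy", and "nagy-e" and "nagy ház"
are ordered as if they were written "nagye" and "nagyház".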

Expansion of double consonants (e.g. sorting “ccs” (long “cs”) as “cscs”) is
still not perfect, but in my analysis it reduces the number of bad sorting
positions to about one fifth of those produced by ordering without expansion
(3843 vs. 19425 in 4 million word forms).
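
A minimal sketch of such an expansion, assuming a purely pattern-based
approach: the table and the function name are illustrative, not taken from
any existing patch, and the code makes no attempt to detect compound
boundaries, which is one reason a naive expansion stays imperfect.

```python
import re

# Simplified double forms of the Hungarian multi-letter consonants and
# their expanded spellings. "ddzs" is listed before "ddz" so that the
# longer pattern is tried first.
DOUBLED = {
    "ddzs": "dzsdzs",
    "ccs": "cscs",
    "ddz": "dzdz",
    "ggy": "gygy",
    "lly": "lyly",
    "nny": "nyny",
    "ssz": "szsz",
    "tty": "tyty",
    "zzs": "zszs",
}
PATTERN = re.compile("|".join(DOUBLED))  # dict order is preserved


def expand_double_consonants(word):
    """Return the lower-cased word with simplified doubles expanded."""
    return PATTERN.sub(lambda m: DOUBLED[m.group(0)], word.lower())


print(expand_double_consonants("loccsan"))  # locscsan
print(expand_double_consonants("hosszú"))   # hoszszú
```

The expanded form is meant to be used only as a sort key; the displayed text
stays unchanged.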

A more important advantage: with full expansion it is possible to automate
Hungarian sorting with manual (or, in the future, Hunspell-based)
preprocessing. (Unfortunately, the ICU collation algorithm alone is not yet
enough for Hungarian.) Inserting soft hyphens is a quick workaround here, too,
as for the similar problem of single consonants, e.g. “igazság” -> igaz­ság
(igaz[U+AD]ság) is then correctly sorted before “igaztalan”.</pre>
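<pre>The effect of the soft hyphen can be illustrated with a toy collation key.
This is a stand-in written for this report, not ICU: all names are
hypothetical, long vowels are folded onto short ones as a simplification, and
characters outside the alphabet are skipped. The soft hyphen splits the word
before the letters are recognized, so "z" + "s" is no longer read as "zs".

```python
SOFT_HYPHEN = "\u00ad"

# Hungarian letters in alphabetical order (long vowels folded, see FOLD).
ALPHABET = ["a", "b", "c", "cs", "d", "dz", "dzs", "e", "f", "g", "gy", "h",
            "i", "j", "k", "l", "ly", "m", "n", "ny", "o", "ö", "p", "q", "r",
            "s", "sz", "t", "ty", "u", "ü", "v", "w", "x", "y", "z", "zs"]
RANK = {letter: rank for rank, letter in enumerate(ALPHABET)}
FOLD = str.maketrans("áéíóőúű", "aeioöuü")


def letters(word):
    """Split a word into Hungarian letters, longest match first."""
    found = []
    # Letters are never recognized across a soft hyphen.
    for chunk in word.lower().translate(FOLD).split(SOFT_HYPHEN):
        i = 0
        while i != len(chunk):
            for size in (3, 2, 1):
                piece = chunk[i:i + size]
                if piece in RANK:
                    found.append(piece)
                    i += len(piece)
                    break
            else:
                i += 1  # skip characters outside the alphabet
    return found


def sort_key(word):
    """Hypothetical primary-level key: the rank of each letter."""
    return [RANK[letter] for letter in letters(word)]


print(sorted(["igaztalan", "igazság"], key=sort_key))
print(sorted(["igaztalan", "igaz\u00adság"], key=sort_key))
```

Without the soft hyphen the key reads "zs" as one letter and puts "igazság"
after "igaztalan"; with it, "s" is compared with "t" and the order is correct.</pre>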
        </div>
      </p>


    </body>
</html>