<html>
    <head>
      <base href="https://bugs.documentfoundation.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_UNCONFIRMED "
   title="UNCONFIRMED - SPELL: af_ZA.aff quietly broken"
   href="https://bugs.documentfoundation.org/show_bug.cgi?id=126311">126311</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>SPELL: af_ZA.aff quietly broken
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>LibreOffice
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>6.3.0.1 rc
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>UNCONFIRMED
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>medium
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Linguistic
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>libreoffice-bugs@lists.freedesktop.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>elmar.braun@sh-p.de
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>sophi@libreoffice.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Description:
The file af_ZA.aff from the Afrikaans dictionary contains lines such as:

SFX J   0  etjie   ^.{1,3}[aeiouyëê]ng

<a href="https://cgit.freedesktop.org/libreoffice/dictionaries/tree/af_ZA/af_ZA.aff#n144">https://cgit.freedesktop.org/libreoffice/dictionaries/tree/af_ZA/af_ZA.aff#n144</a>

As far as I can gather from hunspell's documentation, the last field (the
condition) permits a regex-*like* format, but not a full regex. Specifically,
the "^" anchor appears to be unsupported, and the "^" character is only
recognized for negated classes such as "[^abc]".
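
A rough way to list every rule of this shape (a Python sketch, not part of
hunspell; the function names are made up, and it assumes that the condition is
the fifth whitespace-separated field of an SFX/PFX rule line and that "^" is
only valid directly after "["):

```python
def caret_outside_class(condition):
    """Return True if "^" occurs anywhere except right after "["."""
    in_class = False
    previous = ""
    for ch in condition:
        if ch == "[":
            in_class = True
        elif ch == "]":
            in_class = False
        elif ch == "^" and not (in_class and previous == "["):
            return True
        previous = ch
    return False


def suspicious_rules(aff_lines):
    """Collect SFX/PFX rule lines whose condition has a stray "^"."""
    found = []
    for line in aff_lines:
        fields = line.split()
        # Assumption: header lines have four fields, rule lines at least
        # five, with the condition in the fifth.
        if len(fields) >= 5 and fields[0] in ("SFX", "PFX"):
            if caret_outside_class(fields[4]):
                found.append(line)
    return found
```

Feeding it the lines of af_ZA.aff should list the rule quoted above, along
with any others of the same shape.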

I've tried loading that dictionary with hunspell 1.7.0, compiled with MSVC
2015.3, with STL iterator debugging enabled. The iterator debugging asserts on
line 4360 of hunspell's affixmgr.cxx while processing the above SFX statement.

<a href="https://github.com/hunspell/hunspell/blob/v1.7.0/src/hunspell/affixmgr.cxx#L4360">https://github.com/hunspell/hunspell/blob/v1.7.0/src/hunspell/affixmgr.cxx#L4360</a>

Hunspell here uses a reverse_iterator to iterate over an already reversed copy
of the string "^.{1,3}[aeiouyëê]ng", and attempts to inspect the character
preceding the "^", which would dereference the invalid iterator
string.rbegin()-1.
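
To illustrate the index arithmetic (a Python sketch of my reading of that
loop, not hunspell's actual code; the function name is made up): for a reverse
iterator k, "k - 1" is the element visited just before k, which in terms of
plain indices is index + 1. For the very first element of the walk that index
equals the string length.

```python
def out_of_range_lookbacks(condition):
    """Return the indices in the reversed copy where looking one step back
    during a reverse walk would leave the string."""
    reversed_copy = condition[::-1]   # suffix conditions get reversed first
    size = len(reversed_copy)
    hits = []
    # Reverse walk over the reversed copy: from rbegin() toward rend().
    for index in range(size - 1, -1, -1):
        if reversed_copy[index] == "^":
            previous = index + 1      # what "k - 1" refers to
            if previous == size:      # same position as rbegin() - 1
                hits.append(index)
    return hits
```

For "^.{1,3}[aeiouyëê]ng" the result is non-empty, because the "^" ends up as
the last character of the reversed copy and is therefore the first one the
walk visits. For "[^abc]ng" the result is empty.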

A release build would of course perform the out-of-bounds access silently. I
wasn't able to provoke any visible misbehavior in 6.3.0.1 (which, unlike 6.2.5,
contains the broken dictionary). But I don't speak Afrikaans, so I can't judge
whether the dictionary is actually doing what it's supposed to do.

Steps to Reproduce:
1. Build hunspell 1.7.0 with STL iterator debugging enabled
2. Load the af_ZA dictionary

Actual Results:
iterator debugging reports out-of-bounds access

Expected Results:
loading dictionary succeeds


Reproducible: Always

User Profile Reset: No</pre>
        </div>
      </p>


    </body>
</html>