[Libreoffice-bugs] [Bug 48729] New: autocorrect limit. acor.dat with entry 65535: Loop and/or loss of acor data

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Sun Apr 15 13:19:50 CEST 2012


https://bugs.freedesktop.org/show_bug.cgi?id=48729

             Bug #: 48729
           Summary: autocorrect limit. acor.dat with entry 65535: Loop
                    and/or loss of acor data
    Classification: Unclassified
           Product: LibreOffice
           Version: LibO 3.3.0 Beta2
          Platform: All
        OS/Version: All
            Status: UNCONFIRMED
          Severity: normal
          Priority: medium
         Component: Linguistic component
        AssignedTo: libreoffice-bugs at lists.freedesktop.org
        ReportedBy: barta at quipo.it


------------------
INTRODUCTION
------------------
This bug report is the twin of OOo issue 87672 (
https://issues.apache.org/ooo/show_bug.cgi?id=87672 ), which I reported in March
2008 against OOo 2.4 (gosh!!! over 4 years!!! time runs fast!!!). For this reason I
suggest not changing the bug title, which is the same.

------------------
SUMMARY
------------------
Autocorrect databases can store only 65534 entries per language.
If you reach the 65535 limit, the file crashes and the database is erased.

So there are 2 problems:

1- the number of autocorrect entries is limited

2- data is lost when that limit is reached

------------------
BACKGROUND
------------------
Some background information:

1- the OOo/LibO autocorrect function is based on a database of entry pairs (bad
spelling --> correct spelling) collected in an .xml file called
DocumentList.xml, which is stored inside an acor_.dat file (basically a
.zip with a .dat extension) under this Windows path: ...User\LibreOffice
3\user\autocorr

2- there's one global .dat file (acor_.dat), which applies its autocorrect items in
any language, and a separate .dat file for each language and variant (e.g.
acor_en-GB.dat for British English, acor_en-US.dat for American English,
acor_it-IT.dat for Italian, etc.), which applies its autocorrect items only if
you are writing in that particular language

3- the DocumentList.xml file containing all the entries is based on a 16-bit
structure, so it has an upper limit of 2^16 = 65 536

4- for some reason, if you get close to that limit, with entry number 65 535,
the .xml file “implodes” and the whole database is erased, with loss of data.
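
To see how close a given file is to the limit, the layout described above can be inspected with a short script. This is only a sketch: it assumes the .dat file is an ordinary zip archive containing DocumentList.xml (as in point 1) and that every autocorrect entry is one direct child element of the XML root, which I have not checked against the source code. The names count_entries, wrap16 and ENTRY_LIMIT are made up for illustration.

```python
import xml.etree.ElementTree as ET
import zipfile

# Assumed ceiling: the report says the database breaks at entry 65535.
ENTRY_LIMIT = 2**16 - 1


def count_entries(dat_file):
    """Count autocorrect entries in an acor_*.dat file (path or file object).

    Assumption: each entry is a direct child of the root element of
    DocumentList.xml inside the zip archive.
    """
    with zipfile.ZipFile(dat_file) as archive:
        with archive.open("DocumentList.xml") as xml_file:
            root = ET.parse(xml_file).getroot()
    # len() of an Element is the number of its direct children.
    return len(root)


def wrap16(n):
    """Value of n after being stored in an unsigned 16-bit counter."""
    return n & 0xFFFF
```

ENTRY_LIMIT - count_entries("acor_it-IT.dat") would then give the number of entries left before the limit. wrap16(65535 + 1) evaluates to 0, which is one way a 16-bit counter could lose track of the entries; whether that is what actually happens in the code is not confirmed.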

------------------
POSSIBLE SOLUTIONS
------------------
1- raise the upper limit from 2^16 to a higher value... if I remember correctly,
Calc once had the same 2^16 limit on the number of rows, which was raised to 2^20 =
1 048 576. Could something similar be done for autocorrect as well?

2- allow the use of multiple .dat files for the same language (e.g.
acor_en-GB1.dat, acor_en-GB2.dat, acor_en-GB3.dat, etc.), each one with the
65K limit. Once one is full, you can move on to the next, just as already happens
with custom dictionaries: you can indeed have more than one of those.
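
As a rough illustration of solution 2, the entries of one language could be split into numbered chunks that each stay under the limit. This is a sketch of the idea only, not of how OOo/LibO stores anything: split_entries and chunk_filenames are hypothetical helpers, and the numbered file names simply follow the acor_en-GB1.dat pattern suggested above.

```python
def split_entries(entries, limit=65534):
    """Split a list of (wrong, right) pairs into chunks of at most `limit`."""
    return [entries[i:i + limit] for i in range(0, len(entries), limit)]


def chunk_filenames(lang, n_chunks):
    """Hypothetical numbered file names, e.g. acor_en-GB1.dat, acor_en-GB2.dat."""
    return [f"acor_{lang}{i}.dat" for i in range(1, n_chunks + 1)]
```

With 150 000 entries this yields three chunks (65534, 65534 and 18932 entries), so no single file would ever reach entry 65535.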

------------------
SOURCE CODE
------------------
The autocorrect function is coded in the svxacorr.cxx file:

http://docs.libreoffice.org/editeng/html/svxacorr_8cxx_source.html

------------------
TEST FILES
------------------
The .zip contains the acor_it-IT.dat and DocumentList.xml files that crash
OOo/LibO once you add another autocorrect entry:

https://issues.apache.org/ooo/attachment.cgi?id=52416
