Crash during Traditional Chinese <-> Simplified Chinese conversion

Matteo Casalin matteo.casalin at gmx.com
Fri Jan 4 05:58:02 PST 2013


On 01/04/2013 01:07 PM, Matteo Casalin wrote:
> Hi Michael,
>
> On 01/03/2013 10:52 PM, Michael Stahl wrote:
>> On 03/01/13 19:26, Matteo Casalin wrote:
>>> Hi all,
>>>       I've lately been struggling with a crash in the conversion from
>>> Traditional to Simplified Chinese in Writer. After some debugging, I
>>> tracked the problem to an access to released memory, but I don't know
>>> how to proceed to solve the issue, since it requires a deeper knowledge
>>> of Writer's internal structure than I have.
>>> I would really appreciate it if anybody could give me a hint on this.
>>> Here are the details:
>>> The conversion is handled by the editeng::HangulHanjaConversion class,
>>> which is used as a base class for SwHHCWrapper (and is also derived
>>> from, in a parallel manner, in editeng itself). Without digging into
>>> details (the flow is quite convoluted), the problem arises in
>>> SwTxtNode::Convert (sw/source/core/txtnode/txtedt.cxx) as follows:
>>> * line 1074: instantiate a SwLanguageIterator object, which builds a
>>> list of pointers to non-copyable SwTxtAttr;
>>> * line 1111: call SetLanguageAndFont, which destroys the original
>>> SwTxtAttr items which the iterator still points to;
>>> * line 1117: access the now-deleted items through the iterator.
>>
>> so SwTxtNode::SetLanguageAndFont calls InsertItemSet, with a
>> SvxFontItem, which will result in a RES_TXTATR_AUTOFMT hint... which may
>> be combined (in SwpHints::MergePortions) with an existing
>> RES_TXTATR_AUTOFMT that is adjacent to the insertion range (aCurPaM),
>> provided that the item set on the adjacent hint contains the same
>> attributes as the one on the insertion range.
>>
>> SwTxtNode::Convert appears clearly wrong to me in modifying the hints of
>> a text node while iterating over them.  (it is possible that this used
>> to work in 2005 or earlier; i don't know if equal text hints were
>> combined before the introduction of AUTOFMT, as i wasn't around back
>> then).
>>
>> perhaps the insertion could be delayed until after the loop?
>
> Thanks for the detailed reply. Unfortunately I have no experience with
> Writer and its internals yet, so I do not understand the implications of the
> AUTOFMT attribute. By looking at the code, my understanding of what
> needs to be done in order to postpone the insertion is:
> * iterate over attributes in order to find the portion with the desired
> language, without inserting new properties;
> * save the information about that portion;
> * if necessary, re-iterate over the attributes with SetLanguageAndFont

Obviously, this would not work, since we would hit the same issue in the
second loop. Instead, we can push the PaM information (the ranges on which
to call SetLanguageAndFont) onto a queue and then iterate over it after the
main loop has completed. I'll start working on a patch and see what happens
with the Chinese document I have available. If that works, I'll submit
the patch to gerrit, hoping that someone can also run broader tests
on the results (I don't have any knowledge of Chinese).
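
To make the idea concrete, here is a minimal, standalone sketch of the
"queue first, modify afterwards" pattern. It is not Writer code: Portion,
PendingChange, collectChanges and applyChanges are hypothetical stand-ins
for the text attributes, the saved PaM ranges, the main loop and the
delayed SetLanguageAndFont calls. The merge step at the end only imitates
the kind of combining of adjacent attributes that Michael described.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a text portion carrying a language attribute.
struct Portion {
    std::size_t start;
    std::size_t end;   // one past the last character
    std::string lang;
};

// Hypothetical stand-in for the saved PaM range plus the value to set on it.
struct PendingChange {
    std::size_t start;
    std::size_t end;
    std::string newLang;
};

// Phase 1: read-only pass. Nothing is modified here, so whatever the
// iteration refers to stays valid for the whole loop.
std::vector<PendingChange> collectChanges(const std::vector<Portion>& portions,
                                          const std::string& from,
                                          const std::string& to)
{
    std::vector<PendingChange> queue;
    for (const Portion& p : portions)
        if (p.lang == from)
            queue.push_back({p.start, p.end, to});
    return queue;
}

// Phase 2: apply the queued changes after the main loop has finished.
// The queue stores plain offsets, not pointers to portions, so merging or
// replacing portions cannot leave it dangling.
void applyChanges(std::vector<Portion>& portions,
                  const std::vector<PendingChange>& queue)
{
    for (const PendingChange& c : queue) {
        assert(c.start < c.end);
        for (Portion& p : portions)
            if (p.start >= c.start && p.end <= c.end)
                p.lang = c.newLang;
    }

    // Merge adjacent portions that now carry the same language.
    std::vector<Portion> merged;
    for (const Portion& p : portions) {
        if (!merged.empty() && merged.back().lang == p.lang
            && merged.back().end == p.start)
            merged.back().end = p.end;
        else
            merged.push_back(p);
    }
    portions.swap(merged);
}
```

The point of the design is that the queue holds values (offsets and the
new language) rather than references into the structure being changed.
Whether plain offsets are enough in the real code, or whether something
like a copied PaM is needed, is something I still have to check.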

> * return the previously saved information;
> This can be done only if the properties that are inserted neither
> influence the search for the language in the first iteration nor corrupt
> the saved information.
> Could this work? I really would like to fix the bug, but I am afraid of
> breaking something else.
>
> Thanks again!
> Cheers
> Matteo
>
>> _______________________________________________
>> LibreOffice mailing list
>> LibreOffice at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/libreoffice
>>
