[Libreoffice-bugs] [Bug 125298] New: FILESAVE DOCX Bookmark names and field references shortened in case they are 40 characters long and contain non ASCII characters
bugzilla-daemon at bugs.documentfoundation.org
Wed May 15 09:02:27 UTC 2019
https://bugs.documentfoundation.org/show_bug.cgi?id=125298
Bug ID: 125298
Summary: FILESAVE DOCX Bookmark names and field references
shortened in case they are 40 characters long and
contain non ASCII characters
Product: LibreOffice
Version: 6.3.0.0.alpha0+ Master
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: medium
Component: Writer
Assignee: libreoffice-bugs at lists.freedesktop.org
Reporter: libreoffice at nisz.hu
Description:
The OOXML standard limits bookmark names and field references (the value of
<w:instrText> tags) to a maximum of 40 characters.
LibreOffice has an encode/decode mechanism for non-ASCII characters in
bookmark names and field references, which expands each non-ASCII character
into several characters. For example, ő becomes %C5%91.
If truncation is applied to the encoded form (that is, before decoding), each
non-ASCII character is counted as more than one character, so bookmark names or
field references that contain non-ASCII characters can be truncated even though
their visible length is within the limit.
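
The length inflation can be illustrated with a short Python sketch. It assumes
the encoding is ordinary percent-encoding of UTF-8 bytes, which is consistent
with the %C5%91 example; encoded_length is a hypothetical helper, not a
LibreOffice function.

```python
from urllib.parse import quote

LIMIT = 40  # maximum length stated in the description


def encoded_length(name: str) -> int:
    """Length of the name after percent-encoding its non-ASCII characters.

    Assumption: the encoding is percent-encoding of UTF-8 bytes, as the
    %C5%91 example for ő suggests. Illustration only, not LibreOffice code.
    """
    return len(quote(name, safe=""))


name = "1é2á3ű4ő5ú6ö7ü8ó9í"
print(quote("ő", safe=""))              # %C5%91
print(len(name), encoded_length(name))  # 18 63
print(encoded_length(name) > LIMIT)     # True
```

Each of the nine accented letters grows from 1 to 6 characters, so an
18-character name already exceeds the limit once encoded.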
Steps to Reproduce:
1. Create some text
2. Select some section of the text
3. Open the Insert menu and select Bookmark
4. Give it a name that contains non-ASCII characters and is long enough (for
example árvíztűrő tükörfúrógép, or 1é2á3ű4ő5ú6ö7ü8ó9í)
5. Go somewhere else in the document, for example to the end of the
document, and create a new paragraph
6. Open the Insert menu and select Cross-reference
7. Select the value "Bookmark" in the "Type" listbox, then select the value
"Reference" in the "Insert reference to..." listbox
8. In the "Selection" listbox, double-click the previously named
bookmark
9. Save the file as DOCX and reload it
10. Rename the file from .docx to .zip, unzip it, and open
document.xml in the word folder
11. Look at these tags:
<w:bookmarkStart w:name="something" w:id="0"/>
<w:instrText> REF something \h </w:instrText>
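
Steps 10 and 11 can also be done without renaming and unzipping by hand. The
following Python sketch reads word/document.xml straight from the saved file
and lists the bookmark names and field instructions. The function name
bookmark_names_and_fields and the file name in the usage comment are
hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix in document.xml)
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def bookmark_names_and_fields(docx):
    """Return (bookmark names, field instruction texts) from a DOCX file.

    `docx` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    names = [
        el.get(f"{{{W_NS}}}name")
        for el in root.iter(f"{{{W_NS}}}bookmarkStart")
    ]
    fields = [
        (el.text or "").strip()
        for el in root.iter(f"{{{W_NS}}}instrText")
    ]
    return names, fields


# Usage (hypothetical file name):
# names, fields = bookmark_names_and_fields("saved.docx")
```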
Actual Results:
Some bookmark names that are no longer than 40 characters are truncated if
they contain non-ASCII characters.
For example: 1é2á3ű4ő5ú6ö7ü8ó9í as a bookmark name will be truncated to
1é2á3ű4ő5ú6%C3%
and árvíztűrő tükörfúrógép as a bookmark name will be truncated to
árvíztűrő_tük%C
The cross-references still work despite the truncation; this is only a
cosmetic problem.
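
The two truncated names above are consistent with cutting the encoded form at
40 characters and then decoding whatever is left. The Python sketch below
reproduces them under these assumptions: spaces are replaced by underscores
(as seen in the second result), non-ASCII characters are percent-encoded as
UTF-8 bytes, and incomplete escapes are left as literal text. All function
names are hypothetical; this is not LibreOffice's actual code.

```python
import re
from urllib.parse import quote

LIMIT = 40
_ESCAPES = re.compile(r"(?:%[0-9A-Fa-f]{2})+")  # runs of complete %XX escapes


def encode_name(name: str) -> str:
    # Assumption: spaces become underscores and non-ASCII characters are
    # percent-encoded as UTF-8 bytes.
    return quote(name.replace(" ", "_"), safe="")


def decode_name(encoded: str) -> str:
    # Decode %XX runs as UTF-8; bytes that do not form a valid sequence
    # (for example a lone %C3 left by truncation) stay as literal %XX text.
    def repl(match):
        raw = bytes.fromhex(match.group(0).replace("%", ""))
        text = raw.decode("utf-8", errors="surrogateescape")
        return "".join(
            f"%{ord(ch) - 0xDC00:02X}" if 0xDC80 <= ord(ch) <= 0xDCFF else ch
            for ch in text
        )
    return _ESCAPES.sub(repl, encoded)


def truncate_encoded(name: str) -> str:
    # Order matching the reported results: cut the encoded form, then decode.
    return decode_name(encode_name(name)[:LIMIT])


def truncate_decoded(name: str) -> str:
    # Alternative order: count visible characters instead of encoded ones.
    return decode_name(encode_name(name))[:LIMIT]


print(truncate_encoded("1é2á3ű4ő5ú6ö7ü8ó9í"))      # 1é2á3ű4ő5ú6%C3%
print(truncate_encoded("árvíztűrő tükörfúrógép"))  # árvíztűrő_tük%C
print(truncate_decoded("árvíztűrő tükörfúrógép"))  # árvíztűrő_tükörfúrógép
```

Applying the limit to the decoded name leaves both example names intact,
because their visible lengths (18 and 22 characters) are well below 40.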
Expected Results:
In MS Word, a bookmark name that contains non-ASCII characters and is at most
40 characters long is not truncated. We should emulate this behavior.
Reproducible: Always
User Profile Reset: No
Additional Info:
See also: 113483