[Libreoffice-bugs] [Bug 49645] FILEOPEN particular MSWORD2008 .docx: misinterprets letters from Symbol font

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Tue May 8 19:44:34 CEST 2012


https://bugs.freedesktop.org/show_bug.cgi?id=49645

--- Comment #9 from Roman Eisele <bugs at eikota.de> 2012-05-08 10:44:34 PDT ---
@Rainer Bielefeld:
thank you for testing! So I know that I am not the only one seeing this issue
;-)

It's no surprise that the wrong glyphs you see ("two strange hooks") differ
from the ones visible in my screenshot. I think that LibreOffice just takes the
two glyphs for U+F061 and U+F062 from some font which contains glyphs for these
Unicode code points (maybe from the first font in the list of installed fonts
which contains such glyphs), and this font will vary from installation to
installation, depending on the installed fonts.

Update:
Regarding the Unicode code points U+F061 and U+F062 and the corresponding XML
fragments from the .docx file (w:char="F061" and w:char="F062"), I have found
that Microsoft's TrueType version of the 'Symbol' font, at least version 1.60
(2005) delivered with Windows XP, actually contains alpha and beta at U+F061
and U+F062. Of course, *all* glyphs in this font have Private Use Area
code points (U+F020 to U+F0FE); even the space character has the Unicode value
U+F020 instead of U+0020.
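
As a quick check of the Private Use Area claim, here is a minimal Python
sketch. The helper name is_private_use is my own and purely illustrative; it
relies only on the standard unicodedata module and has nothing to do with the
LibreOffice code:

```python
import unicodedata


def is_private_use(code_point: int) -> bool:
    """Return True if the code point has general category 'Co' (private use)."""
    return unicodedata.category(chr(code_point)) == "Co"


# Code points as they appear in the w:char attributes of the .docx file,
# plus the ordinary space and the real Greek alpha for comparison.
for cp in (0xF020, 0xF061, 0xF062, 0x0020, 0x03B1):
    print(f"U+{cp:04X} private use: {is_private_use(cp)}")
```

The first three values are reported as private use; U+0020 and U+03B1 are not.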

I don't know why Microsoft did not use the correct Unicode values instead (for
most symbols in the 'Symbol' font there are corresponding Unicode code points),
but this is not our problem. What matters here is just:

* MS Office and LibreOffice 3.4 on Windows take the alpha and beta from the
Windows 'Symbol' font, using the glyphs from the Private Use Area. This works
fine, and that is no surprise.

* LibreOffice 3.4 and 3.5 on MacOS cannot display the alpha and the beta
correctly because Apple's 'Symbol' font (at least version 6.1d7e3, dated
2009-05-12) does not contain those Private Use Area glyphs; it uses the correct
Unicode code points for most symbols instead. This is no surprise either.

* LibreOffice 3.5 on Windows no longer takes the alpha and the beta from the
'Symbol' font; instead it takes them from the first installed font it can find
which contains glyphs for the Private Use Area code points U+F061 and U+F062.
This may be just a consequence of the fact that it no longer switches to the
'Symbol' font for the two symbols, but uses the main text font ('Cambria')
instead, which does not contain alpha and beta at these code points.

So, as far as LibreOffice 3.5 for Windows is concerned, it may be enough to
switch to the Symbol font again, as in LibO 3.4.x. For MacOS, the solution is
a bit more complicated. When we encounter a <w:sym> tag like

<w:sym w:font="Symbol" w:char="F061"/>

and when w:font is 'Symbol' and w:char is >= F020, this index must be mapped to
the correct Unicode value using a replacement table like

F020 -> U+0020 # Space character
...
F061 -> U+03B1 # alpha
F062 -> U+03B2 # beta
...
F0C2 -> U+211C # Real part (of a complex number)
...

etc. I don't know about Linux, of course ...
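
To make the idea concrete, here is a minimal Python sketch of such a
replacement step. It is only an illustration, not how LibreOffice does or
should implement it: the names SYMBOL_PUA_TO_UNICODE, map_sym and sym_to_text
are hypothetical, and the table contains only the four entries listed above (a
real table would have to cover the whole range F020 to F0FE). Values missing
from the table are passed through unchanged.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace bound to the "w:" prefix in .docx files.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical, deliberately incomplete table: only the entries quoted above.
SYMBOL_PUA_TO_UNICODE = {
    0xF020: 0x0020,  # Space character
    0xF061: 0x03B1,  # alpha
    0xF062: 0x03B2,  # beta
    0xF0C2: 0x211C,  # Real part (of a complex number)
}


def map_sym(font: str, char_hex: str) -> str:
    """Map a w:sym font/char pair to a character, remapping Symbol PUA values."""
    code = int(char_hex, 16)
    if font == "Symbol" and code >= 0xF020:
        # Fall back to the original value if the table has no entry.
        code = SYMBOL_PUA_TO_UNICODE.get(code, code)
    return chr(code)


def sym_to_text(xml_fragment: str) -> str:
    """Parse a single <w:sym> element and return the mapped character."""
    elem = ET.fromstring(xml_fragment)
    font = elem.get(f"{{{W_NS}}}font")
    char_hex = elem.get(f"{{{W_NS}}}char")
    return map_sym(font, char_hex)


# The fragment from the .docx, with the namespace declared so it parses alone.
fragment = f'<w:sym xmlns:w="{W_NS}" w:font="Symbol" w:char="F061"/>'
print(sym_to_text(fragment))
```

With the fragment above this prints the Greek letter alpha (U+03B1). For any
font other than 'Symbol', the w:char value is left as it is.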

Sorry for all these words, but I hope that they shed some light on this
issue.

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.


