[Libreoffice-bugs] [Bug 119944] Writer does not resolve some/most HTML entities.
bugzilla-daemon at bugs.documentfoundation.org
Fri Sep 28 00:21:13 UTC 2018
https://bugs.documentfoundation.org/show_bug.cgi?id=119944
--- Comment #2 from Jens Troeger <jens.troeger at light-speed.de> ---
I’ve updated the HTML file: the new one is generated from the W3C reference
webpage and should include all HTML entities in their text/hex/dec encodings.
The Python script I used to generate that file is included as a comment in that
same file. Note, however, that Python’s html5 entity lookup is also incomplete,
so some entities come out as a "???" string rather than the proper text.
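
For readers without the attachment, here is a minimal sketch of how such a test
file could be produced. It is not the script from the attachment: it assumes the
lookup in question is the standard library's html.entities.html5 table, and the
helper names entity_rows and build_html are made up for illustration. Entities
missing from that table simply do not appear in the output here.

```python
from html.entities import html5


def entity_rows():
    """Yield (name, named_ref, hex_ref, dec_ref) for single-code-point entities."""
    for name, text in sorted(html5.items()):
        # The table also lists legacy names without the trailing semicolon;
        # skip those so every entity appears only once.
        if not name.endswith(";"):
            continue
        # A few entities expand to two code points; this sketch leaves them out.
        if len(text) != 1:
            continue
        code_point = ord(text)
        yield name, f"&{name}", f"&#x{code_point:X};", f"&#{code_point};"


def build_html(rows):
    """Return an HTML table with one row per entity and one column per encoding."""
    lines = ["<!DOCTYPE html>", "<html><body><table>"]
    for name, named_ref, hex_ref, dec_ref in rows:
        # The first cell escapes the ampersand so the entity name stays readable
        # even when the consumer resolves the other three cells correctly.
        lines.append(
            f"<tr><td>&amp;{name}</td><td>{named_ref}</td>"
            f"<td>{hex_ref}</td><td>{dec_ref}</td></tr>"
        )
    lines.append("</table></body></html>")
    return "\n".join(lines)
```

Writing build_html(entity_rows()) to a file and opening it in Writer should show
the same character three times per row wherever the entity is resolved.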
I poked around a bit here:
https://github.com/LibreOffice/core/blob/master/svtools/source/svhtml/parhtml.cxx#L394-L622
and it seems that the entity-aware string object is what messes things up. The
entity parser itself looks OK to me.
--
You are receiving this mail because:
You are the assignee for the bug.