[Libreoffice-bugs] [Bug 119944] Writer does not resolve some/most HTML entities.

bugzilla-daemon at bugs.documentfoundation.org
Fri Sep 28 00:21:13 UTC 2018


https://bugs.documentfoundation.org/show_bug.cgi?id=119944

--- Comment #2 from Jens Troeger <jens.troeger at light-speed.de> ---
I’ve updated the HTML file: the new one is generated from the W3C reference
web page and should include all HTML entities in their named, hexadecimal, and
decimal encodings.

The Python script I used to generate that file is included as a comment in
that same file. Note, however, that Python’s html5 entity lookup is also
incomplete, so entities it does not know come out as a "???" string rather
than the proper text.
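
For readers without the attachment, here is a minimal sketch of how such a
test file could be produced. It is not the script embedded in the attached
file; the helper names entity_forms and render_row are made up for
illustration, and it assumes the lookup goes through the standard library's
html.entities.html5 table, falling back to "???" when a name is missing.

```python
from html.entities import html5


def entity_forms(name):
    """Return (named, decimal, hex) references for a name such as 'amp'.

    Returns None when the name is not in Python's html5 table.
    """
    # Keys in html5 carry the trailing semicolon, e.g. 'amp;'.
    text = html5.get(name + ";")
    if text is None:
        return None
    # A few entities expand to more than one code point, so encode each one.
    dec = "".join(f"&#{ord(ch)};" for ch in text)
    hexa = "".join(f"&#x{ord(ch):X};" for ch in text)
    return f"&{name};", dec, hexa


def render_row(name):
    """Build one HTML table row: name, named, decimal, and hex encodings."""
    forms = entity_forms(name)
    if forms is None:
        # Hypothetical fallback mirroring the "???" marker mentioned above.
        return f"<tr><td>{name}</td><td>???</td></tr>"
    named, dec, hexa = forms
    return f"<tr><td>{name}</td><td>{named}</td><td>{dec}</td><td>{hexa}</td></tr>"
```

Calling render_row for every name on the reference page and wrapping the rows
in a table would give one document in which each entity appears three times,
which makes it easy to see which of the encodings Writer fails to resolve.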

I poked around a bit here:

https://github.com/LibreOffice/core/blob/master/svtools/source/svhtml/parhtml.cxx#L394-L622

The entity parser itself looks OK to me; it seems to be the entity-aware
string object that messes things up.
