<html>
    <head>
      <base href="https://bugs.documentfoundation.org/">
    </head>
    <body>
      <p>
        <div>
            <b><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Writer does not resolve some/most HTML entities."
   href="https://bugs.documentfoundation.org/show_bug.cgi?id=119944#c2">Comment # 2</a>
              on <a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Writer does not resolve some/most HTML entities."
   href="https://bugs.documentfoundation.org/show_bug.cgi?id=119944">bug 119944</a>
              from <span class="vcard"><a class="email" href="mailto:jens.troeger@light-speed.de" title="Jens Troeger <jens.troeger@light-speed.de>"> <span class="fn">Jens Troeger</span></a>
</span></b>
        <pre>I’ve updated the HTML file: the new one is generated from the w3 reference
webpage and should include all HTML entities in their text/hex/dec encodings.

The Python script I used to generate that file is included as a comment in
that same file. Note, however, that Python’s html5 entity lookup is also
incomplete, so some entries show a "???" string rather than the proper text.
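
For illustration, here is a minimal sketch of how such test rows could be
produced. It is not the script from the attachment: the names AMP, entity_row
and build_rows are made up for this example, and it assumes the expected text
comes from the html5 table in Python's standard html.entities module, with
"???" standing in whenever the lookup fails.

```python
from html.entities import html5

# The ampersand, built with chr() so this listing survives HTML quoting.
AMP = chr(38)


def entity_row(name):
    """Return (named, hex, dec, expected) for one entity name (hypothetical helper)."""
    named = f"{AMP}{name};"
    # The html5 table is keyed by the name including the trailing semicolon.
    expected = html5.get(name + ";")
    if expected is None:
        # Lookup failed: keep the named reference, mark the rest as unknown.
        return named, "???", "???", "???"
    # Some entities expand to more than one code point, so emit one
    # numeric reference per code point.
    hex_refs = "".join(f"{AMP}#x{ord(ch):X};" for ch in expected)
    dec_refs = "".join(f"{AMP}#{ord(ch)};" for ch in expected)
    return named, hex_refs, dec_refs, expected


def build_rows(names):
    """Build one row per entity name, in the order given."""
    return [entity_row(name) for name in names]
```

Calling entity_row("nbsp") gives the named, hex and decimal references for the
no-break space plus the character itself; a name missing from the table gives
the named reference followed by three "???" placeholders.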

I poked around a bit here:

<a href="https://github.com/LibreOffice/core/blob/master/svtools/source/svhtml/parhtml.cxx#L394-L622">https://github.com/LibreOffice/core/blob/master/svtools/source/svhtml/parhtml.cxx#L394-L622</a>

but it seems that the entity-aware string object is what messes things up. The
entity parser itself looks OK to me.</pre>
        </div>
      </p>


    </body>
</html>