<html>
    <head>
      <base href="https://bugs.documentfoundation.org/">
    </head>
    <body><span class="vcard"><a class="email" href="mailto:xiscofauli@libreoffice.org" title="Xisco Faulí <xiscofauli@libreoffice.org>"> <span class="fn">Xisco Faulí</span></a>
</span> changed
          <a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Unicode pictographs are no longer shown"
   href="https://bugs.documentfoundation.org/show_bug.cgi?id=116731">bug 116731</a>
          <br>
             <table border="1" cellspacing="0" cellpadding="8">
          <tr>
            <th>What</th>
            <th>Removed</th>
            <th>Added</th>
          </tr>

         <tr>
           <td style="text-align:right;">Keywords</td>
           <td>possibleRegression
           </td>
           <td>bibisected, bisected, regression
           </td>
         </tr>

         <tr>
           <td style="text-align:right;">Status</td>
           <td>UNCONFIRMED
           </td>
           <td>NEW
           </td>
         </tr>

         <tr>
           <td style="text-align:right;">CC</td>
           <td>
                
           </td>
           <td>mst.lo@arcor.de, xiscofauli@libreoffice.org
           </td>
         </tr>

         <tr>
           <td style="text-align:right;">Ever confirmed</td>
           <td>
                
           </td>
           <td>1
           </td>
         </tr></table>
      <p>
        <div>
            <b><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Unicode pictographs are no longer shown"
   href="https://bugs.documentfoundation.org/show_bug.cgi?id=116731#c3">Comment # 3</a>
              on <a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Unicode pictographs are no longer shown"
   href="https://bugs.documentfoundation.org/show_bug.cgi?id=116731">bug 116731</a>
              from <span class="vcard"><a class="email" href="mailto:xiscofauli@libreoffice.org" title="Xisco Faulí <xiscofauli@libreoffice.org>"> <span class="fn">Xisco Faulí</span></a>
</span></b>
        <pre>It seems to happen only on Linux.

Regression introduced by:

author  Michael Stahl &lt;<a href="mailto:mstahl@redhat.com">mstahl@redhat.com</a>&gt;       2017-09-07 23:01:26 +0200
committer       Michael Stahl &lt;<a href="mailto:mstahl@redhat.com">mstahl@redhat.com</a>&gt;       2017-09-07 23:22:11 +0200
commit  fc670f637d4271246691904fd649358ce2e7be59 (patch)
tree    0eee10cd701f0479d4ed8ca7287defefef6af29e
parent  554a79d793ee9546f71802643b79001749c3c695 (diff)
svtools: HTML import: don't put lone surrogates in OUString
The bytes "ed b3 b5" in fdo67610-1.doc (which, as the name indicates,
is an HTML file) are converted to the lone UTF-16 surrogate "dcf5",
which is inserted into SwTextNode and causes asserts later on.

The actual encoding of the HTML document is probably GBK (at least
VIM doesn't display any missing characters with that), but
because it doesn't contain any indication of its encoding
it's apparently imported as UTF-8; the ImplConvertUtf8ToUnicode()
thinking a surrogate code point is valid even if the JSON-compatible
mode RTL_TEXTENCODING_JAVA_UTF8 is not specified is a bit of a
surprise.

Bisected with: bibisect-linux64-6.0

Adding Cc: to Michael Stahl</pre>
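        <div>For illustration only, here is a minimal Python sketch of how the bytes "ed b3 b5" quoted in the commit message map to the lone surrogate "dcf5". It is not the LibreOffice code: the helper names (manual_code_point, decode_lenient, decode_strict, is_surrogate) are hypothetical, and Python's "surrogatepass" error handler only stands in for a decoder that accepts surrogate code points, while the default strict decoder stands in for one that rejects them.</div>

```python
# Bytes quoted in the commit message ("ed b3 b5").
RAW = bytes.fromhex("edb3b5")


def manual_code_point(data):
    """Decode one 3-byte UTF-8 sequence by hand, with no validity checks."""
    b0, b1, b2 = data
    # The lead byte carries 4 payload bits, each continuation byte carries 6.
    return (b0 % 16) * 4096 + (b1 % 64) * 64 + (b2 % 64)


def decode_lenient(data):
    """Decode as UTF-8 but let surrogate code points through (illustrative)."""
    return data.decode("utf-8", errors="surrogatepass")


def decode_strict(data):
    """Decode as strict UTF-8; return None when the input is rejected."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


def is_surrogate(ch):
    """True if ch lies in the UTF-16 surrogate range U+D800..U+DFFF."""
    return ord(ch) in range(0xD800, 0xE000)


if __name__ == "__main__":
    print(hex(manual_code_point(RAW)))           # 0xdcf5
    print(is_surrogate(decode_lenient(RAW)))     # True
    print(decode_strict(RAW))                    # None
```

        <div>A strict decoder rejects the sequence outright, so no lone surrogate can end up in the resulting string; a lenient one yields a single unpaired low surrogate.</div>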
        </div>
      </p>


    </body>
</html>