<html>
    <head>
      <base href="https://bugs.freedesktop.org/" />
    </head>
    <body><span class="vcard"><a class="email" href="mailto:albbas@gmail.com" title="albbas@gmail.com">albbas@gmail.com</a>
</span> changed
              <a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - pdftohtml produces wrongly nested tags"
   href="https://bugs.freedesktop.org/show_bug.cgi?id=89239">bug 89239</a>
          <br>
             <table border="1" cellspacing="0" cellpadding="8">
          <tr>
            <th>What</th>
            <th>Removed</th>
            <th>Added</th>
          </tr>

         <tr>
           <td style="text-align:right;">Attachment #114252 is obsolete</td>
           <td>
                
           </td>
           <td>1
           </td>
         </tr></table>
      <p>
        <div>
            <b><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - pdftohtml produces wrongly nested tags"
   href="https://bugs.freedesktop.org/show_bug.cgi?id=89239#c8">Comment # 8</a>
              on <a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - pdftohtml produces wrongly nested tags"
   href="https://bugs.freedesktop.org/show_bug.cgi?id=89239">bug 89239</a>
              from <span class="vcard"><a class="email" href="mailto:albbas@gmail.com" title="albbas@gmail.com">albbas@gmail.com</a>
</span></b>
        <pre>Created <span class=""><a href="attachment.cgi?id=114284" name="attach_114284" title="Fix opening and ending tag mismatch when resulting xml document contains invalid xml chars">attachment 114284</a> <a href="attachment.cgi?id=114284&action=edit" title="Fix opening and ending tag mismatch when resulting xml document contains invalid xml chars">[details]</a></span>
Fix opening and ending tag mismatch when resulting xml document contains
invalid xml chars

The difference between this patch and patch 114242 is this:
diff --git a/utils/HtmlOutputDev.cc b/utils/HtmlOutputDev.cc
index d725578..4915030 100644
--- a/utils/HtmlOutputDev.cc
+++ b/utils/HtmlOutputDev.cc
@@ -480,14 +480,16 @@ static bool tag_exists( std::list<std::string> tags, std::string tag )

 static void CloseTag(GooString *htext, std::list<std::string> &tags, std::string tag)
 {
+    size_t index = strlen(htext->getCString());
     while( !tags.empty() && tags.back() != tag ) {
         std::string current_tag = tags.back();
-        htext->append(current_tag.c_str(), current_tag.length());
+        htext->insert(index, current_tag.c_str());
+        index += current_tag.length();
         tags.pop_back();
     }
     if( !tags.empty()) {
       std::string current_tag = tags.back();
-      htext->append(current_tag.c_str(), current_tag.length());
+      htext->insert(index, current_tag.c_str());
       tags.pop_back();
     }
 }

When htext contains the characters that trigger the "PCDATA invalid Char value"
errors in xmllint, append does not add the closing tags as expected.

To force the ending tags to be appended to the GooString, the insert function
is used instead.
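
For reference, a minimal standalone sketch of the tag-closing logic in the diff
above. It assumes std::string in place of GooString so that it compiles without
poppler, and the name close_tag is hypothetical. As in the patch, the list is
assumed to hold ready-to-emit closing tags, innermost last. It only illustrates
the intended order of the closing tags, not the append/insert difference.

```cpp
#include <list>
#include <string>

// Hypothetical stand-in for CloseTag: std::string replaces GooString.
// `tags` is assumed to hold closing tags such as "</b>", innermost last.
void close_tag(std::string &text, std::list<std::string> &tags,
               const std::string &tag)
{
    // Emit every tag nested inside `tag`, innermost first.
    while (!tags.empty() && tags.back() != tag) {
        text += tags.back();
        tags.pop_back();
    }
    // Emit `tag` itself if it was found on the stack.
    if (!tags.empty()) {
        text += tags.back();
        tags.pop_back();
    }
}
```

Closing "</b>" while "</i>" is still open emits "</i>" first and then "</b>",
leaving the list empty.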

This produces output that does not have the "opening and ending tag mismatch" and
"premature end of data in tag" errors when run through xmllint.</pre>
        </div>
      </p>
    </body>
</html>