<html>
    <head>
      <base href="https://bugs.freedesktop.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Fix HtmlFont::HtmlFilter to not lose tabs"
   href="https://bugs.freedesktop.org/show_bug.cgi?id=107317">107317</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Fix HtmlFont::HtmlFilter to not lose tabs
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>poppler
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>unspecified
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>x86-64 (AMD64)
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux (All)
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>medium
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>utils
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>poppler-bugs@lists.freedesktop.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>ulatekh@yahoo.com
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Created <span class=""><a href="attachment.cgi?id=140749" name="attach_140749" title="Patch to fix bug">attachment 140749</a> <a href="attachment.cgi?id=140749&action=edit" title="Patch to fix bug">[details]</a></span> <a href='page.cgi?id=splinter.html&bug=107317&attachment=140749'>[review]</a>
Patch to fix bug

I'm about to use pdftohtml to extract information from PDFs and organize the
results into a database, so I had a chance to dig through the code.

I've had a long-standing problem with qpdfview (which uses poppler) sometimes
copying text out of PDFs incorrectly -- the text copies, but all of the spaces
are missing. After reproducing it with a PDF, I tracked the problem down to the
PDF using tabs where it probably should have used spaces. The patch changes
HtmlFont::HtmlFilter() to convert incoming tabs to spaces instead of dropping
the whitespace completely.
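
A minimal sketch of the idea, written in Python for brevity rather than in
poppler's C++. The function name, the flag, and the assumption that tabs were
being lost together with other control characters are illustrative only and
are not taken from the attached patch:

```python
# Hypothetical, simplified stand-in for the filtering step; not poppler's code.
CONTROL_CHARS = {chr(code) for code in range(32)}  # U+0000 through U+001F


def filter_text(text, keep_tabs_as_spaces=True):
    """Drop control characters, optionally turning tabs into spaces first."""
    out = []
    for ch in text:
        if ch == "\t" and keep_tabs_as_spaces:
            out.append(" ")  # patched behaviour: the word gap survives
        elif ch in CONTROL_CHARS:
            continue  # assumed old behaviour: control characters vanish
        else:
            out.append(ch)
    return "".join(out)


print(filter_text("lose\ttabs", keep_tabs_as_spaces=False))  # tab dropped
print(filter_text("lose\ttabs"))  # tab replaced by a space
```

With the flag off, the two words run together, which matches the symptom
described above. With it on, the tab becomes an ordinary space.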

The same fix could probably be applied in other places in the code, e.g. when
copying text in qpdfview.</pre>
        </div>
      </p>


      <hr>
      <span>You are receiving this mail because:</span>

      <ul>
          <li>You are the assignee for the bug.</li>
      </ul>
    </body>
</html>