[Poppler-bugs] [Bug 107317] New: Fix HtmlFont::HtmlFilter to not lose tabs
bugzilla-daemon at freedesktop.org
Sat Jul 21 03:46:43 UTC 2018
https://bugs.freedesktop.org/show_bug.cgi?id=107317
Bug ID: 107317
Summary: Fix HtmlFont::HtmlFilter to not lose tabs
Product: poppler
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: normal
Priority: medium
Component: utils
Assignee: poppler-bugs at lists.freedesktop.org
Reporter: ulatekh at yahoo.com
Created attachment 140749
--> https://bugs.freedesktop.org/attachment.cgi?id=140749&action=edit
Patch to fix bug
I'm about to use pdftohtml to extract information from PDFs and organize the
results into a database, so I had a chance to dig through the code.
I've had a long-standing problem with qpdfview (which uses poppler) sometimes
copying text out of PDFs incorrectly -- the text copies, but all of the spaces
are missing. After reproducing it with a PDF, I tracked the problem down to the
PDF using tabs where it probably should have used spaces. The patch fixes
HtmlFont::HtmlFilter() to convert incoming tabs to spaces, instead of removing
the whitespace completely.
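The idea behind the fix can be sketched as follows. This is not the attached patch and not poppler's actual HtmlFont::HtmlFilter(); the function name filterForHtml and its std::string interface are hypothetical, chosen only to show a tab being mapped to a space rather than dropped while the usual HTML escaping is applied.

```cpp
#include <string>

// Hypothetical stand-in for an HTML text filter (NOT poppler's real API).
// Assumption: the input is plain byte-oriented text; the real code works
// on Unicode code points and handles more cases than shown here.
std::string filterForHtml(const std::string &text)
{
    std::string out;
    out.reserve(text.size());
    for (char c : text) {
        switch (c) {
        case '\t':
            out += ' ';        // keep the word boundary instead of dropping it
            break;
        case '&':
            out += "&amp;";    // escape characters that are special in HTML
            break;
        case '<':
            out += "&lt;";
            break;
        case '>':
            out += "&gt;";
            break;
        default:
            out += c;          // everything else passes through unchanged
            break;
        }
    }
    return out;
}
```

With this sketch, an input such as "foo\tbar" comes out as "foo bar", so the two words stay separated in the extracted text.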
There are probably other places in the code where the fix in this patch could
be applied, e.g. when copying text in qpdfview.