[Poppler-bugs] [Bug 97276] New: Can't extract text/html from PDF
bugzilla-daemon at freedesktop.org
Wed Aug 10 09:52:33 UTC 2016
https://bugs.freedesktop.org/show_bug.cgi?id=97276
Bug ID: 97276
Summary: Can't extract text/html from PDF
Product: poppler
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: major
Priority: medium
Component: pdftohtml
Assignee: poppler-bugs at lists.freedesktop.org
Reporter: clark at electrobeat.dk
pdftohtml does not extract the footer of this PDF:
http://docdro.id/ms8RyMC
Command used:
pdftohtml -s -i input.pdf /output
None of the small-font text at the bottom of the page, below the thick black
horizontal line, is extracted.
The last text that is extracted is:
Forfaldsdato . . . . . . . . . . . . . . . . . . . . . . . . . . . . :
10/08-2016
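The reproduction above could be scripted roughly as follows. This is only a sketch: the helper names build_command and missing_phrases are hypothetical, the footer phrases to look for have to be supplied by whoever runs it, and the path of the generated HTML file is passed in explicitly because it is an assumption, not something stated in this report.

```python
import html
import re
import subprocess
import sys


def build_command(pdf_path, output_base):
    """Return the pdftohtml invocation used in the report."""
    # -s: generate a single HTML document, -i: ignore images.
    return ["pdftohtml", "-s", "-i", pdf_path, output_base]


def missing_phrases(html_text, phrases):
    """Return the phrases that do not occur in the visible text of html_text."""
    # Strip tags, decode entities, and collapse whitespace before searching.
    text = re.sub(r"<[^>]+>", " ", html_text)
    text = html.unescape(text)
    text = re.sub(r"\s+", " ", text)
    return [p for p in phrases if p not in text]


# Usage (assumed): script.py input.pdf /output <generated-html> <phrase>...
if __name__ == "__main__" and len(sys.argv) >= 4:
    pdf_path, output_base, html_path = sys.argv[1:4]
    subprocess.run(build_command(pdf_path, output_base), check=True)
    with open(html_path, encoding="utf-8", errors="replace") as fh:
        # Prints the expected phrases that pdftohtml failed to extract.
        print(missing_phrases(fh.read(), sys.argv[4:]))
```

Passing a phrase from the footer together with "10/08-2016" should list only the footer phrase as missing if the behaviour described here is reproduced.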
--
You are receiving this mail because:
You are the assignee for the bug.