[Poppler-bugs] [Bug 96932] New: Improper text extraction from this pdf
bugzilla-daemon at freedesktop.org
Thu Jul 14 15:02:20 UTC 2016
https://bugs.freedesktop.org/show_bug.cgi?id=96932
Bug ID: 96932
Summary: Improper text extraction from this pdf
Product: poppler
Version: unspecified
Hardware: All
OS: All
Status: NEW
Severity: blocker
Priority: medium
Component: utils
Assignee: poppler-bugs at lists.freedesktop.org
Reporter: mingodad at gmail.com
Created attachment 125069
--> https://bugs.freedesktop.org/attachment.cgi?id=125069&action=edit
A pdf with tables
Hello !
I'm testing pdftotext with PDFs from
http://www.docidadesp.imprensaoficial.com.br and several of them
seem to have mixed encodings (I guess): pdftotext outputs garbage for some of
their content (PDFxStream does the same).
See the attached pdf for test.
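
To check a batch of files for this kind of problem, something like the sketch below might help. It is only a rough first pass and rests on assumptions: that `pdftotext` is on the PATH, that the garbage shows up as replacement, control, private-use or unassigned characters, and that the helper names `extract_text` and `suspicious_ratio` (which are made up for this example) suit your workflow. Text that is mis-mapped to ordinary printable characters will not be flagged by it.

```python
import subprocess
import unicodedata


def extract_text(pdf_path: str) -> str:
    """Run poppler's pdftotext and return the text ("-" sends output to stdout)."""
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],
        check=True,
        capture_output=True,
    )
    # Decode leniently so that invalid bytes become U+FFFD instead of raising.
    return result.stdout.decode("utf-8", errors="replace")


def suspicious_ratio(text: str) -> float:
    """Fraction of non-whitespace characters that look like extraction garbage.

    Heuristic (an assumption, not a poppler feature): count U+FFFD plus
    control (Cc), private-use (Co) and unassigned (Cn) code points.
    """
    visible = [ch for ch in text if not ch.isspace()]
    if not visible:
        return 0.0
    bad = sum(
        1
        for ch in visible
        if ch == "\ufffd" or unicodedata.category(ch) in ("Cc", "Co", "Cn")
    )
    return bad / len(visible)
```

For example, `suspicious_ratio(extract_text("some.pdf"))` gives a number between 0 and 1 for a hypothetical file `some.pdf`; files with a noticeably higher value than the rest are the ones worth opening by hand.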
I hope the attached example can help improve poppler.
Cheers !