[Poppler-bugs] [Bug 35468] New: pdftotext cannot extract text from specific pdf

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Sun Mar 20 08:47:49 PDT 2011


           Summary: pdftotext cannot extract text from specific pdf
           Product: poppler
           Version: unspecified
          Platform: x86-64 (AMD64)
        OS/Version: Linux (All)
            Status: NEW
          Severity: major
          Priority: medium
         Component: general
        AssignedTo: poppler-bugs at lists.freedesktop.org
        ReportedBy: ulrich.leodolter at obvsg.at


pdftotext fails to extract text from a specific PDF (see attachment).
The exit status is 0 and no warnings or errors are reported.
The output file contains only 99 page break (form feed) characters (0x0c).
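
For anyone trying to reproduce this, the symptom can be checked
mechanically. The following is only a minimal sketch, assuming Python 3
and a pdftotext binary on the PATH; the function names summarize_output
and run_pdftotext are hypothetical helpers for illustration and are not
part of poppler.

```python
import subprocess


def summarize_output(data: bytes) -> dict:
    """Count form feeds and report whether any visible text is present."""
    form_feeds = data.count(b"\x0c")
    # bytes.strip() without arguments removes ASCII whitespace,
    # which includes the form feed character 0x0c.
    has_text = len(data.strip()) > 0
    return {"form_feeds": form_feeds, "has_text": has_text}


def run_pdftotext(pdf_path: str) -> bytes:
    """Run pdftotext on pdf_path and return the extracted bytes.

    Assumption: pdftotext is installed; "-" as the output file name
    asks it to write the text to stdout instead of a file.
    """
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        check=True,  # raises CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

Calling summarize_output(run_pdftotext(path)) on the attached PDF would
be expected to report one form feed per page and has_text set to False
if the behaviour described above is reproduced.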

I am sure the PDF contains text, because when I save the document
as text using Acrobat Reader, plenty of text is extracted and saved.

I can also view the document on Linux (CentOS 5.5 and Fedora Core 14)
using Evince without problems.

I tried the following versions; all gave the same result.

poppler 0.5.4   centos 5.5   x86_64
poppler 0.14.5  fedora fc14  x86_64
poppler git     fedora fc14  x86_64

Best regards

