[Poppler-bugs] [Bug 48012] New: cannot extract text

bugzilla-daemon at freedesktop.org
Wed Mar 28 13:38:00 PDT 2012


https://bugs.freedesktop.org/show_bug.cgi?id=48012

             Bug #: 48012
           Summary: cannot extract text
    Classification: Unclassified
           Product: poppler
           Version: unspecified
          Platform: Other
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: general
        AssignedTo: poppler-bugs at lists.freedesktop.org
        ReportedBy: jwilk at jwilk.net


Created attachment 59174
  --> https://bugs.freedesktop.org/attachment.cgi?id=59174
the test case

pdftotext correctly extracts the Cyrillic part of the attached PDF; however, it
outputs garbage in place of the Latin part.

I can search through the Latin text in Adobe Reader, so the PDF itself is OK
(or at least not hopelessly bad).
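
One rough way to quantify the symptom is to tally the extracted characters by
Unicode script. The sketch below is only an illustration, not part of the
original report: the helper names (script_histogram, extract_text) are
hypothetical, and it assumes pdftotext is on PATH and that the PDF path is
given on the command line. If the Latin text were extracted properly, one
would expect a sizeable LATIN count next to CYRILLIC; garbage would more likely
show up under other buckets such as UNNAMED.

```python
import subprocess
import sys
import unicodedata
from collections import Counter


def script_histogram(text):
    """Count non-whitespace characters by the first word of their Unicode name."""
    counts = Counter()
    for ch in text:
        if ch.isspace():
            continue
        # unicodedata.name() returns e.g. "CYRILLIC SMALL LETTER A";
        # characters without a name (such as private-use code points)
        # are grouped under "UNNAMED".
        name = unicodedata.name(ch, "")
        counts[name.split(" ", 1)[0] if name else "UNNAMED"] += 1
    return counts


def extract_text(pdf_path):
    """Run pdftotext on pdf_path; the trailing '-' sends the text to stdout."""
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],
        check=True,
        capture_output=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python script_hist.py testcase.pdf
    print(script_histogram(extract_text(sys.argv[1])))
```

The histogram only describes what pdftotext emitted; it does not say why the
Latin glyphs were mapped incorrectly.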

$ pdftotext -v
pdftotext version 0.18.4
Copyright 2005-2011 The Poppler Developers - http://poppler.freedesktop.org
Copyright 1996-2004 Glyph & Cog, LLC


