[Poppler-bugs] [Bug 54268] New: problem copy/pasting CID? / Identity-H? text
bugzilla-daemon at freedesktop.org
Thu Aug 30 07:50:29 PDT 2012
https://bugs.freedesktop.org/show_bug.cgi?id=54268
Bug #: 54268
Summary: problem copy/pasting CID? / Identity-H? text
Classification: Unclassified
Product: poppler
Version: unspecified
Platform: Other
OS/Version: All
Status: NEW
Severity: normal
Priority: medium
Component: general
AssignedTo: poppler-bugs at lists.freedesktop.org
ReportedBy: fpeters at 0d.be
I have a large number of PDF files from which poppler fails to extract text
correctly (example at
<http://people.gnome.org/~fpeters/pdf-identity-h-bug.pdf>).
The first page is fine, but a second page has been attached to it, containing a
single word in a monospace font (according to the document properties in
poppler, it is "FreeMono, Truetype (CID), encoded as Identity-H"). That word is
displayed correctly but is converted to something entirely different when
copy/pasting from evince, or when using the pdftotext or pdftohtml utilities.
The displayed word is "tapiraient" while the word extracted as text is
"WDSLUDLHQW". In the series of documents I have, other examples give:
DQJRLVVHUD -> angoissera
HQDPRXUHU -> enamourer
FRQWUHFDUUDLW -> contrecarrait
It looks like the mapping is always the same, and alphabetical order is
preserved (e.g. D->a, E->?, F->c, G->?, H->e...); I checked poppler-data and
there is a CMap/Identity-H file, but I couldn't figure out whether it is used
or relevant.
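
To make the "always the same mapping" observation concrete, here is a small
Python sketch that checks the four word pairs quoted above. It only does
arithmetic on the code points of the strings in this report; the helper names
(`offsets`, `decode`) are made up for illustration and are not part of poppler.
For these samples the difference between the displayed and the extracted
character is a constant 29. That would be consistent with glyph indices being
emitted in place of Unicode values, but this is only a guess and nothing in the
report confirms it.

```python
# Word pairs copied from the report: (extracted text, displayed text).
PAIRS = [
    ("WDSLUDLHQW", "tapiraient"),
    ("DQJRLVVHUD", "angoissera"),
    ("HQDPRXUHU", "enamourer"),
    ("FRQWUHFDUUDLW", "contrecarrait"),
]


def offsets(extracted, displayed):
    """Return the set of code point differences (displayed - extracted)."""
    return {ord(d) - ord(e) for e, d in zip(extracted, displayed)}


def decode(extracted, shift):
    """Shift every character of the extracted text by a fixed amount."""
    return "".join(chr(ord(c) + shift) for c in extracted)


# Collect the differences over all samples; a single value means the
# mapping is a plain constant shift for these words.
all_offsets = set()
for extracted, displayed in PAIRS:
    assert len(extracted) == len(displayed)
    all_offsets |= offsets(extracted, displayed)

print(all_offsets)

# Hypothetical workaround for these samples only: undo the shift.
shift = next(iter(all_offsets))
for extracted, _ in PAIRS:
    print(extracted, "->", decode(extracted, shift))
```

Under the same shift, the two letters left unknown above would come out as
E->b and G->d, although no sample in this report actually contains them.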
--
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.