[poppler] Fwd: [Bug 690722] Problem with the font embedding logic

Thu Aug 20 03:57:40 PDT 2009

Hi,
users of pdftex & luatex frequently get the message

Error: Illegal entry in bfchar block in ToUnicode CMap

Apparently some (mostly harmless) buglet in xpdf...

Best
   Martin
---------- Forwarded message ----------
From:  <bugs.ghostscript.com-bugzilla-daemon at ghostscript.com>
Date: Thu, Aug 20, 2009 at 12:29 PM
Subject: [Bug 690722] Problem with the font embedding logic
To: luigi.scarso at gmail.com

http://bugs.ghostscript.com/show_bug.cgi?id=690722

ken.sharp at artifex.com changed:

          What    |Removed                     |Added
----------------------------------------------------------------------------
            Status|UNCONFIRMED                 |RESOLVED
        Resolution|                            |INVALID

------- Additional Comments From ken.sharp at artifex.com  2009-08-20 03:29 -------
This appears to be an error with xpdf, not Ghostscript. The original file
includes a ToUnicode CMap which uses the beginbfchar/endbfchar operators:

0 beginbfrange
endbfrange
5 beginbfchar
<0004> <0021>
<002B> <0048>
<0048> <0065>
<004F> <006C>
<0052> <006F>
endbfchar

The output from pdfwrite encodes the same information but uses the
beginbfrange/endbfrange operators instead:

5 beginbfrange
<04><04><0021>
<2b><2b><0048>
<48><48><0065>
<4f><4f><006c>
<52><52><006f>
endbfrange

Now this is nominally slightly less efficient, since each entry needs both a
start and an end code, but the spec (Technical note 5014, Adobe CMap and CIDFont
files specification, p72 and 73) does not say that these cannot be the same.
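To illustrate the point about single-code ranges, here is a minimal Python
sketch. It assumes the simple <lo> <hi> <dst> form of a bfrange entry, with the
destination incremented once per source code. The name expand_bfrange is made
up for this example; it is not taken from xpdf, poppler or Ghostscript.

```python
def expand_bfrange(lo_hex, hi_hex, dst_hex):
    """Expand one bfrange entry into a {source code: Unicode value} dict.

    Hypothetical helper for illustration only. It treats the hex strings
    as plain integers and ignores the array form of the destination.
    """
    lo = int(lo_hex, 16)
    hi = int(hi_hex, 16)
    dst = int(dst_hex, 16)
    if hi < lo:
        raise ValueError("srcCodeHi is lower than srcCodeLo")
    # Each code in [lo, hi] maps to dst plus its offset from lo.
    return {code: dst + (code - lo) for code in range(lo, hi + 1)}


# A range whose start and end are equal yields exactly one mapping,
# which is the same information a bfchar entry would carry.
single = expand_bfrange("04", "04", "0021")
```

With lo equal to hi the loop runs once, so the entry <04><04><0021> carries the
same mapping as the bfchar entry <0004> <0021>.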

In fact it appears that xpdf is assuming that the start and end codes will be 4
hex digits (2 bytes) long. If I modify the entries thus:

5 beginbfrange
<0004><0004><0021>
<002b><002b><0048>
<0048><0048><0065>
<004f><004f><006c>
<0052><0052><006f>
endbfrange

The problem disappears. The spec says:

"Values for srcCodeLo and srcCodeHi must be in hexadecimal notation. "

There is no apparent requirement for zero padding, so this requirement by xpdf
would seem to be incorrect. Further, Adobe Acrobat is capable of opening both
files *and* searching for the text in both cases. Since Acrobat requires the
ToUnicode table to search for text in a CIDFont it would seem that Adobe do not
require this either.
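As a rough sketch of what a length-tolerant reader could look like, the Python
below parses the bfrange triples without assuming any particular number of hex
digits. The names TRIPLE and parse_bfrange_block are invented for this example
and this is not how xpdf is implemented. It is also a simplification: codes are
compared as integers, so the code length (which a full CMap reader would check
against the codespace range) is ignored, as is the array form of the
destination.

```python
import re

# One <lo><hi><dst> triple; each hex string may have any number of digits.
TRIPLE = re.compile(
    r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>"
)


def parse_bfrange_block(text):
    """Return {source code: Unicode value} for the triples found in text.

    Hypothetical helper: hex strings are read as integers, so <04> and
    <0004> are treated as the same source code.
    """
    mapping = {}
    for lo_hex, hi_hex, dst_hex in TRIPLE.findall(text):
        lo, hi, dst = int(lo_hex, 16), int(hi_hex, 16), int(dst_hex, 16)
        for code in range(lo, hi + 1):
            mapping[code] = dst + (code - lo)
    return mapping


# The pdfwrite output quoted above (unpadded source codes).
UNPADDED = """5 beginbfrange
<04><04><0021>
<2b><2b><0048>
<48><48><0065>
<4f><4f><006c>
<52><52><006f>
endbfrange"""

# The hand-modified version with 4-digit source codes.
PADDED = """5 beginbfrange
<0004><0004><0021>
<002b><002b><0048>
<0048><0048><0065>
<004f><004f><006c>
<0052><0052><006f>
endbfrange"""

unpadded_map = parse_bfrange_block(UNPADDED)
padded_map = parse_bfrange_block(PADDED)
```

Under these assumptions both blocks produce the same five mappings, which is
the behaviour Acrobat appears to have.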
