[poppler] lack of CJK feature

Koji Otani sho at bbr.jp
Mon Dec 10 04:30:30 PST 2007


Hi all,

I'm Koji Otani. I posted CJK patches before (see Bug #11413).
I have found more CJK problems and filed a patch as Bug #13582.

The Adobe-Japan1-6 character set includes characters outside the
Unicode BMP.
But poppler cannot display these, nor some other characters, with
TrueType fonts.
The current poppler has the following problems in this area.
(1) The CMap data is old.
 The current data (poppler-data-0.1.1.tar.gz) only covers
 Adobe-Japan1-4.
 It should be updated to a newer version. (Ghostscript 8.60 already
 ships the new CMap data.)
(2) poppler doesn't look up the format 12 cmap table of TrueType fonts.
 Only the format 12 cmap table supports codes outside the Unicode BMP.
(3) poppler looks up only the UCS2 CMaps when building the
 unicodeToGID map.
 UCS2 CMaps only cover codes inside the Unicode BMP.
(4) CID conflicts in the CMap are not handled.
 A CMap can map multiple Unicode code points to the same CID,
 so one CID can correspond to several Unicode code points.
 Currently poppler uses only the first one.
 If that code does not exist in the cmap of the TrueType font,
 the character is not displayed.
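
To illustrate point (4), here is a minimal sketch in Python of the
fallback idea: collect every Unicode code point that maps to a CID and
use the first one the font actually has. This is not the code from the
patch. The function name, the dict-based font cmap, and the sample
values are all made up for illustration.

```python
def build_cid_to_gid(unicode_to_cid, font_cmap):
    """Map each CID to a glyph ID, trying every Unicode candidate in order.

    unicode_to_cid: iterable of (unicode, cid) pairs, in CMap order
                    (hypothetical input shape).
    font_cmap:      dict of unicode -> glyph ID taken from the font;
                    a missing key or 0 means "no glyph".
    """
    # Group all Unicode code points that share a CID, keeping CMap order.
    candidates = {}
    for code_point, cid in unicode_to_cid:
        candidates.setdefault(cid, []).append(code_point)

    cid_to_gid = {}
    for cid, code_points in candidates.items():
        for code_point in code_points:
            gid = font_cmap.get(code_point, 0)
            if gid != 0:
                # First candidate the font really contains wins.
                cid_to_gid[cid] = gid
                break
    return cid_to_gid


# Made-up sample data: two code points share CID 1200, and the font
# only has a glyph for the second one.
sample_pairs = [(0x2F00, 1200), (0x4E00, 1200), (0x20B9F, 13000)]
sample_font = {0x4E00: 7, 0x20B9F: 99}
sample_result = build_cid_to_gid(sample_pairs, sample_font)
```

Using only the first candidate (0x2F00) would leave CID 1200 without
a glyph in this sample, which is the failure described in (4).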

My proposed patch addresses (2), (3), and (4).
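
For reference on point (2), a format 12 cmap subtable is a 16-byte
header followed by sorted groups of (startCharCode, endCharCode,
startGlyphID), all 32-bit big-endian. The lookup can be sketched in
Python as below. This is only an illustration of the table layout, not
poppler code. The function names are hypothetical, and the builder
exists only to make test data.

```python
import struct

HEADER_SIZE = 16  # format, reserved, length, language, numGroups
GROUP_SIZE = 12   # startCharCode, endCharCode, startGlyphID


def lookup_format12(subtable, code_point):
    """Return the glyph ID for code_point, or 0 if it is not mapped."""
    fmt, _reserved, _length, _language, num_groups = struct.unpack_from(
        ">HHIII", subtable, 0)
    if fmt != 12:
        raise ValueError("not a format 12 cmap subtable")

    # Groups are sorted by startCharCode, so binary search works.
    lo, hi = 0, num_groups - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        start, end, start_gid = struct.unpack_from(
            ">III", subtable, HEADER_SIZE + GROUP_SIZE * mid)
        if code_point < start:
            hi = mid - 1
        elif code_point > end:
            lo = mid + 1
        else:
            return start_gid + (code_point - start)
    return 0


def build_format12(groups):
    """Build a format 12 subtable from sorted (start, end, gid) tuples.

    Helper for test data only.
    """
    body = b"".join(struct.pack(">III", *g) for g in groups)
    header = struct.pack(
        ">HHIII", 12, 0, HEADER_SIZE + len(body), 0, len(groups))
    return header + body


# Made-up table: A-Z in the BMP plus a small range outside the BMP.
sample_table = build_format12([(0x41, 0x5A, 10), (0x20000, 0x2000F, 500)])
```

Because the character codes are 32-bit, a code such as 0x20003 can be
looked up directly, which a 16-bit format 4 table cannot express.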

Please take a look if you are interested.
I would appreciate it if you accepted this patch.

Regards,

Koji Otani

