[poppler] Encoding of font names
Albert Astals Cid
aacid at kde.org
Mon Aug 29 11:15:15 PDT 2011
On Tuesday, 30 August 2011, suzuki toshiya wrote:
> Hi,
Hi
>
> I appreciate your interest & effort about non-Unicode font names!
>
> Albert Astals Cid wrote:
> > Today I've been working on trying to fix the names reported by pdffonts
> > for non-Latin-1 fonts. I did not find anything very clear while reading
> > the spec, but I understood that the BaseFont string is encoded using
> > the /Encoding encoding. This has worked fine for some files but not for
> > all of them, such as one that says
> > /BaseFont /#CB#CE#CC#E5
> > /Encoding /UniGB-UCS2-H
> > If I try to map that to Unicode I get nothing, whereas Adobe Reader
> > properly maps that to 宋体
>
> Although I've not tested comprehensively yet, I guess that
> Adobe's implementation has some heuristic workaround for
> font names encoded by legacy localization mechanisms.
>
> 0xCB 0xCE 0xCC 0xE5 is the GB-2312 encoding of 宋体.
Yeah, I know.
>
> # you can check as:
> # perl -le '{printf("%c%c%c%c\n", 0xCB, 0xCE, 0xCC, 0xE5);}' | iconv -f gbk -t utf-8
>
> I guess Adobe's implementation proceeds as follows:
>
> 1) check whether the font name is in hexadecimal syntax "/#xx#xx#xx..."
> 2) if its encoding is one of the predefined CJK CMaps,
>    try to decode the font name by
>      Adobe-CNS1 -> Big5
>      Adobe-GB1 -> GB-2312 (or GBK)
>      Adobe-Japan1 or Adobe-Japan2 -> Shift_JIS (or Windows-31J)
>      Adobe-Korea1 -> Wansung
>
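The two guessed steps above can be sketched roughly as follows. This is only an illustration in Python, not poppler code and not a description of what Adobe Reader really does. The function names, both lookup tables, and the choice of Python codec names (for example `euc_kr` for Wansung) are assumptions made for the example. The CMap table is deliberately partial.

```python
import re

# Assumed mapping: CID collection -> legacy codecs to try, in order.
LEGACY_CODECS = {
    "Adobe-CNS1": ["big5"],
    "Adobe-GB1": ["gb2312", "gbk"],
    "Adobe-Japan1": ["shift_jis", "cp932"],
    "Adobe-Japan2": ["shift_jis", "cp932"],
    "Adobe-Korea1": ["euc_kr", "cp949"],
}

# Assumed, partial mapping: predefined CMap name -> CID collection.
CMAP_COLLECTION = {
    "UniGB-UCS2-H": "Adobe-GB1",
    "GBK-EUC-H": "Adobe-GB1",
    "UniCNS-UCS2-H": "Adobe-CNS1",
    "UniJIS-UCS2-H": "Adobe-Japan1",
    "90ms-RKSJ-H": "Adobe-Japan1",
    "UniKS-UCS2-H": "Adobe-Korea1",
}


def strip_slash(name: str) -> str:
    """Drop the leading '/' of a PDF name object, if present."""
    return name[1:] if name.startswith("/") else name


def unescape_pdf_name(raw: str) -> bytes:
    """Turn a PDF name such as '/#CB#CE#CC#E5' into its raw bytes."""
    data = strip_slash(raw).encode("latin-1")
    return re.sub(
        rb"#([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        data,
    )


def guess_font_name(base_font: str, encoding: str) -> str:
    """Guess a Unicode font name from /BaseFont and /Encoding."""
    data = unescape_pdf_name(base_font)
    # Step 1 (approximated): only names with non-ASCII bytes need help.
    if all(b < 0x80 for b in data):
        return data.decode("ascii")
    # Step 2: pick legacy codecs from the CMap's CID collection.
    collection = CMAP_COLLECTION.get(strip_slash(encoding))
    for codec in LEGACY_CODECS.get(collection, []):
        try:
            return data.decode(codec)
        except UnicodeDecodeError:
            continue
    # Nothing matched: fall back to a lossless byte-per-character view.
    return data.decode("latin-1")


print(guess_font_name("/#CB#CE#CC#E5", "/UniGB-UCS2-H"))
```

With the BaseFont and Encoding values from the file quoted above, this decodes the name through the GB-2312 codec. Names under a CMap that is missing from the table fall through to the Latin-1 fallback unchanged.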
> Fortunately, the core parts of these legacy localizations are
> almost the same in MS Windows and Mac OS, so the range of possible
> legacy encodings is not so wide.
>
> > Any idea what is the proper manipulation one has to do over BaseFont to
> > get the Unicode value?
>
> I think if we can require iconv for users who are interested
> in non-Unicode or non-ASCII font names, the conversion is not so
> difficult.
Using iconv from the code seems like a bit of a huge hack to me.
> One of my concerns is that I don't know how to handle non-CJK
> (or CJK-but-not-predefined) localized font names, such as
> Adobe-Vietnam1, etc.
>
> Is this an urgent issue?
Not at all. I just stumbled upon it today and worked on it, but it is not
urgent since it has been broken forever :D
Albert
> If not, I will try to write some workaround
> for this issue.
>
> Regards,
> mpsuzuki