[poppler] Support for CJK fonts in postscript / on Windows platforms

Thomas Freitag Thomas.Freitag at kabelmail.de
Fri Mar 23 14:28:27 PDT 2012


On 23.03.2012 14:06, mpsuzuki at hiroshima-u.ac.jp wrote:
> On Fri, 23 Mar 2012 08:42:01 +0100
> Thomas Freitag<Thomas.Freitag at kabelmail.de>  wrote:
>> On 23.03.2012 08:21, suzuki toshiya wrote:
>>> Excuse me, this is the 3rd issue in your first post (adding support
>>> for the cidfmap generated by ghostscript), which is not what I care
>>> about. What I care about is the hardwired MS-Mincho fallback.
>> It's only hardwired for the case where a CID font is expected and no
>> appropriate substitute font is found. Probably a better idea is to use
>> arialuni.ttf instead of MS Mincho, but when I started with it, I only
>> knew that MS Mincho is always installed and has some CJK chars.
> One important problem with using a single CJK font (e.g.
> MS Mincho) as a generic fallback is that the character
> coverage of CJK fonts depends heavily on the target
> market.
>
> For example, the fonts designed for China mainland, Taiwan
> and Japan are usually missing Hangul. They should not be
> used for Adobe-Korea1 fallback. In addition, the fonts
> designed for Japan, Taiwan, Korea are usually missing the
> simplified characters currently used in China mainland.
> Also, the latest version of Adobe-GB1 includes Yi script
> (U+A000 - U+A4BF), but the fonts for Taiwan, Japan and Korea
> are usually missing them.
>
> Needless to say, for Japanese customers, using MS Mincho or
> MS Gothic as the generic fallback would be better than using
> SimSun (for China mainland), MingLiU (for Taiwan) or Batang
> (for Korea) as the generic fallback, but it is an unfair solution.
>
> Using Arial Unicode as the generic fallback would be neutral,
> although its typeface quality for CJK scripts is often
> criticized. In addition, its vertical writing mode support
> is insufficient.
>
> I attached 1 PDF and 2 pictures; in one picture the fallback
> is MS Mincho, in the other it is Arial Unicode.
>
> Thus, I will propose a patch that prepares per-collection
> fallback fonts (for Adobe-CNS1, Adobe-GB1, Adobe-Japan1,
> Adobe-Japan2, Adobe-Korea1) and finally falls back to Arial
> Unicode when no appropriate one is found.
In my opinion: sounds great. I implemented my "poor" patch because I 
sometimes debug problems with CJK fonts under Windows and see nothing. 
So a better implementation is always welcome for me.
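
For what it's worth, the per-collection fallback proposed above could be
sketched roughly like this. It is only an illustration, not poppler code:
the function name fallbackFontFor and the table are hypothetical, the font
names are simply the ones mentioned in this thread, and the Adobe-Japan2
entry is my assumption.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical table: CID character collection -> fallback font.
// The font names are the examples from this thread, not poppler values.
static const std::map<std::string, std::string> kCollectionFallback = {
    {"Adobe-GB1", "SimSun"},        // China mainland
    {"Adobe-CNS1", "MingLiU"},      // Taiwan
    {"Adobe-Japan1", "MS Mincho"},  // Japan
    {"Adobe-Japan2", "MS Mincho"},  // assumption: same font as Adobe-Japan1
    {"Adobe-Korea1", "Batang"},     // Korea
};

// Generic last resort when the collection is unknown (assumed font name).
static const std::string kGenericFallback = "Arial Unicode MS";

// Return the fallback font for a collection, or the generic fallback
// when no per-collection entry exists.
std::string fallbackFontFor(const std::string &collection) {
    const auto it = kCollectionFallback.find(collection);
    if (it == kCollectionFallback.end()) {
        return kGenericFallback;
    }
    assert(!it->second.empty());  // table entries are never empty
    return it->second;
}
```

An unknown collection falls through to the generic font, which corresponds
to the "finally fall back to Arial Unicode" part of the proposal.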
>
>>>> I'm not really sure where the poppler data dir is expected on MinGW;
>>>> it should be /usr/local/share/poppler. Otherwise You can patch the
>>>> code where the GlobalParams constructor is called. I normally do this
>>>> under Windows:
>>>>
>>>> globalParams = new
>>>> GlobalParams("E:\\Downloads\\poppler\\poppler-data-0.4.5");
>>>>
>>>> and copy cidfmap to that directory.
>>>> If You don't do this (and only then), all CJK fonts fall back to
>>>> MS Mincho.
>>> Yes, that (the case without cidfmap) is what I care about. I think
>>> "all CJK fonts fall back to MS Mincho" is worse than a fallback to
>>> Helvetica, as shown by my 2 sample pictures. BTW, in your environment
>>> with the cidfmap generated by ghostscript, is my sample PDF (which
>>> refers to CJK CID-keyed fonts) processed correctly?
>> We can't fall back to Helvetica if a CID font is expected. In this case
>> locateFont returns a NULL pointer!
> I think the caller of locateFont should handle the case where
> no appropriate substitute font is found (if a NULL pointer
> is not an appropriate way to indicate this case, some error
> should be raised).
That's what I did :-). Perhaps the error message is not clear, but that
wasn't mine :-)
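
A minimal sketch of what handling that case in the caller might look like.
The names lookupSubstitute, chooseFont and FontChoice are hypothetical
stand-ins, not poppler's actual locateFont interface, and the rule "non-CID
fonts get Helvetica, CID fonts get nothing" is simplified from this thread.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a font lookup that, like the case described
// above, yields a null pointer when a CID font is expected and no
// substitute is known.
const char *lookupSubstitute(const std::string & /*fontName*/, bool isCIDFont) {
    return isCIDFont ? nullptr : "Helvetica";
}

// Result handed back to the caller: whether a font was found, its name,
// and the message that would be reported.
struct FontChoice {
    bool found;
    std::string name;
    std::string message;
};

// The caller checks for the null pointer explicitly instead of using it.
FontChoice chooseFont(const std::string &fontName, bool isCIDFont) {
    assert(!fontName.empty());
    const char *subst = lookupSubstitute(fontName, isCIDFont);
    if (subst == nullptr) {
        return {false, std::string(),
                "Couldn't find a font for '" + fontName + "'"};
    }
    return {true, subst,
            "Couldn't find a font for '" + fontName + "', subst is '" +
                subst + "'"};
}
```

The two message shapes mirror the two kinds of "Couldn't find a font"
lines in the log quoted further down.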
>
>> No, I have to insert additional lines in cidfmap, which I attach:
>> mkcidfm.ps doesn't search for Pr-fonts. I added lines for only 4 fonts;
>> of course I could do it for the others too. I just want to show how
>> easy it is.
> Good to know. I had thought that the current Ghostscript CJK font
> handling was not very sophisticated, so I expected that generating
> configuration data with Ghostscript would not be a one-stop
> solution.
As I already mentioned, it would be nice if You gave us a better solution.

Thanks in advance,
Thomas
>
> Regards,
> mpsuzuki
>
>
>> I attach my cidfmap (be careful, my Windows home directory is
>> f:/windows). With these additional lines I got the attached result, and
>> these warnings:
>>
>> Syntax Error: Couldn't find a font for 'MS-PMincho', subst is 'MS-Mincho'
>> Syntax Error: Couldn't find a font for 'MS-Gothic', subst is 'MS-Mincho'
>> Syntax Error: Couldn't find a font for 'MS-PGothic', subst is 'MS-Mincho'
>> Syntax Error: Couldn't find a font for 'MS-UIGothic', subst is 'MS-Mincho'
>> Syntax Error: Couldn't find a font for 'RyuminPr6-Light-Identity-H',
>> subst is 'ArialUnicodeMS-JP;'
>> Syntax Error: Couldn't find a font for 'RyuminPr6-Light-Identity-H'
>> Syntax Error: Couldn't find a font for 'RyuminPr6-Light-Identity-H',
>> subst is 'ArialUnicodeMS-JP;'
>> Syntax Error: Couldn't find a font for 'RyuminPr6-Light-Identity-H'
>> Syntax Error: Couldn't find a font for 'GothicBBBPr6-Medium-Identity-H',
>> subst is 'ArialUnicodeMS-JP'
>> Syntax Error: Couldn't find a font for 'HiraMinPro-W3-Identity-H', subst
>> is 'ArialUnicodeMS-JP'
>> Syntax Error: Couldn't find a font for 'HiraKakuStd-W3-Identity-H',
>> subst is 'ArialUnicodeMS-JP'
>>
>>>>> Thus, I'm afraid more effort is needed for the hardwired CID-keyed
>>>>> font fallback. At least, using MS-Mincho is not a good idea, and
>>>>> an appropriate warning should be printed. Of course, I'm willing to
>>>>> work on this issue.
>>>> Isn't
>>>> error(-1, "Couldn't find a font for '%s', subst is '%s'",
>>>> fontName->getCString(), substFontName);
>>>>
>>>> an appropriate warning???
>>> I think it's slightly insufficient; the substitution of a CID-keyed
>>> font by a non-CID-keyed one should be warned about in more detail
>>> (Adobe-Japan1 font blah blah blah is substituted by non-CID-keyed
>>> blah blah blah).
>> That's not completely true: a non-CID-keyed font is still substituted by
>> Helvetica; only a CID-keyed font is replaced by MS Mincho. But if You
>> want another warning, just feel free to change the code.
>>
>> Cheers,
>> Thomas
>>>>> Thomas Freitag wrote:
>>>>>> On 03.03.2012 17:40, suzuki toshiya wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I'm quite sorry that no CJK helpers have gotten involved in this
>>>>>>> issue... Is the required help a rewrite of your patch to fit the
>>>>>>> poppler coding convention, for the maintainers working with Unix
>>>>>>> systems? If it is possible to do without Visual Studio, I will
>>>>>>> try.
>>>>>> Hopefully done now. As far as I remember it was Your patch (bug
>>>>>> 11413) that I just applied to PSOutputDev.cc
>>>>>>> BTW, I have not yet checked your patch in detail, but is your
>>>>>>> patch trying to replace all missing (non-embedded) CID-keyed CJK
>>>>>>> fonts with MS Mincho? I think that is not a good idea for the
>>>>>>> users of Adobe-GB1 (PRC, Singapore), Adobe-CNS1 (Taiwan, HongKong),
>>>>>>> Adobe-Korea1 (ROK). I'm not sure if Ghostscript does so, but
>>>>>>> even if Ghostscript does so, poppler should not follow it.
>>>>>>> In fact, the coverage of CJK ideographs is designed
>>>>>>> differently to fit each market.
>>>>>> No, it was not my goal to substitute all CID keyed fonts by MS Mincho.
>>>>>> The problem under Windows is just, that if there is no font with the
>>>>>> used name installed, poppler tried to replace it with Helvetica, but
>>>>>> because this is not a CID font no characters at all will be shown. So I
>>>>>> thought that MS Mincho is at least for this case a better idea as a
>>>>>> default CID font.
>>>>>> But if You copy the cidfmap produced by mkcidfm.ps from ghostscript in
>>>>>> the poppler data dir, that substitute font table will be used if (and
>>>>>> only if) the font is not embedded and not installed under windows. Hope
>>>>>> that fits for CJK users, I thought it was better to use an existing
>>>>>> substitution algorithm than do nothing. And as far as I understand
>>>>>> mkcidfm.ps it will also try to find suitable fonts for GB1, CNS1 and the
>>>>>> others, but I'm no expert for CJK fonts. The cidfmap I produced on my
>>>>>> system would use arialuni.ttf for all CJK fonts, but I have just the
>>>>>> Microsoft default fonts.
>>>>>>
>>>>>> Regards,
>>>>>> Thomas
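
The lookup order described above (embedded font, then a font installed
under Windows, then the cidfmap table, then the hardwired default) could be
sketched as follows. This is an illustration under my own assumptions, not
the code from the patch: FontQuery, FontSource and resolveSource are
hypothetical names, and the cidfmap is modelled as a plain name-to-file map.

```cpp
#include <cassert>
#include <map>
#include <string>

// Where the font finally comes from (hypothetical classification).
enum class FontSource { Embedded, Installed, CidfmapSubstitute, HardwiredDefault };

// Minimal description of a requested font (hypothetical).
struct FontQuery {
    std::string name;
    bool embedded;   // font program is embedded in the PDF
    bool installed;  // a font with this name is installed on the system
};

// The cidfmap table is consulted if, and only if, the font is neither
// embedded nor installed; without a matching entry the hardwired default
// is used.
FontSource resolveSource(const FontQuery &q,
                         const std::map<std::string, std::string> &cidfmap) {
    assert(!q.name.empty());
    if (q.embedded) {
        return FontSource::Embedded;
    }
    if (q.installed) {
        return FontSource::Installed;
    }
    if (cidfmap.count(q.name) > 0) {
        return FontSource::CidfmapSubstitute;
    }
    return FontSource::HardwiredDefault;
}
```

With an empty table every non-embedded, non-installed CID font ends up at
the hardwired default, which is the MS Mincho case discussed in this thread.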
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> poppler mailing list
>>>>>> poppler at lists.freedesktop.org
>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler