[poppler] Support for CJK fonts in postscript / on Windows platforms

mpsuzuki at hiroshima-u.ac.jp mpsuzuki at hiroshima-u.ac.jp
Fri Mar 23 06:06:37 PDT 2012


On Fri, 23 Mar 2012 08:42:01 +0100
Thomas Freitag <Thomas.Freitag at kabelmail.de> wrote:
>Am 23.03.2012 08:21, schrieb suzuki toshiya:
>> Excuse me, this is the 3rd issue in your first post (adding support
>> for the cidfmap generated by Ghostscript), which is not my concern.
>> What I care about is the hardwired MS-Mincho fallback.
>
>It's only hardwired for the case where a CID font is expected and no
>appropriate substitute font is found. Probably a better idea is to use
>arialuni.ttf instead of MS Mincho, but when I started with it, I only
>knew that MS Mincho is always installed and has some CJK chars.

One important problem with using a single CJK font (e.g.
MS Mincho) as a generic fallback is that the character
coverage of a CJK font depends heavily on its target
market.

For example, fonts designed for mainland China, Taiwan
and Japan usually lack Hangul, so they should not be
used as an Adobe-Korea1 fallback. In addition, fonts
designed for Japan, Taiwan and Korea usually lack the
simplified characters currently used in mainland China.
Also, the latest version of Adobe-GB1 includes the Yi script
(U+A000 - U+A4BF), but fonts for Taiwan, Japan and Korea
usually lack it.

Needless to say, for Japanese users, using MS Mincho or
MS Gothic as the generic fallback would be better than using
SimSun (for mainland China), MingLiU (for Taiwan) or Batang
(for Korea), but it is unfair to users in the other markets.

Using Arial Unicode as the generic fallback would be neutral,
although its typeface quality for CJK scripts is often
criticized. In addition, its support for vertical writing mode
is insufficient.

I attached 1 PDF and 2 pictures: one picture shows the fallback
to MS Mincho, the other the fallback to Arial Unicode.

Thus, I will propose a patch that provides per-collection
fallback fonts (for Adobe-CNS1, Adobe-GB1, Adobe-Japan1,
Adobe-Japan2, Adobe-Korea1) and finally falls back to Arial
Unicode when no appropriate one is found.
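
To make the idea concrete, here is a minimal sketch of such a
per-collection table. It is not the patch itself: the function name
cjkFallbackFont is hypothetical, the font names are simply the Windows
fonts mentioned in this mail, "ArialUnicodeMS" is a placeholder name,
and reusing MS-Mincho for Adobe-Japan2 is my assumption.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper, not part of poppler: choose a fallback font name
// for a missing CID-keyed font from its character collection.
// A real patch would also have to check that the font is installed.
std::string cjkFallbackFont(const std::string &collection)
{
    static const std::map<std::string, std::string> table = {
        { "Adobe-CNS1",   "MingLiU"   },  // Taiwan
        { "Adobe-GB1",    "SimSun"    },  // mainland China
        { "Adobe-Japan1", "MS-Mincho" },  // Japan
        { "Adobe-Japan2", "MS-Mincho" },  // Japan (assumed: same font)
        { "Adobe-Korea1", "Batang"    },  // Korea
    };
    const auto it = table.find(collection);
    if (it == table.end()) {
        // Unknown collection: use the market-neutral last resort.
        return "ArialUnicodeMS";
    }
    assert(!it->second.empty());  // table entries are never empty
    return it->second;
}
```

With this table, a missing Adobe-Korea1 font would be substituted by
Batang rather than MS Mincho, and only an unknown collection would
reach the Arial Unicode entry.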

>>> I'm not really sure where the poppler data dir is expected on MinGW;
>>> it should be /usr/local/share/poppler. Otherwise you can patch the code
>>> where the GlobalParams constructor is called. I normally do this under
>>> Windows:
>>>
>>> globalParams = new 
>>> GlobalParams("E:\\Downloads\\poppler\\poppler-data-0.4.5");
>>>
>>> and copy cidfmap to that directory.
>>> If you don't do this (and only then), all CJK fonts fall back to MS Mincho.

>> Yes, that case (without cidfmap) is what I care about. I think "all CJK
>> fonts fall back to MS Mincho" is worse than falling back to Helvetica,
>> as shown by my 2 sample pictures. BTW, in your environment with the
>> cidfmap generated by Ghostscript, is my sample PDF (referring to CJK
>> CID-keyed fonts) processed correctly?
>
>We can't fall back to Helvetica if a CID font is expected. In this case
>locateFont returns a NULL pointer!

I think the caller of locateFont should be prepared for the case
where no appropriate substitute font is found (if a NULL pointer
is not an appropriate way to indicate that case, some error should
be raised instead).
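
As a rough illustration of what I mean by a prepared caller, consider
the sketch below. Both functions are hypothetical stand-ins written for
this mail: locateFontSketch only models the "returns NULL when nothing
is found" behaviour and is not poppler's real locateFont, and the file
name "msmincho.ttc" is a placeholder.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the lookup: returns nullptr when no
// substitute is known for a CID-keyed font.
const char *locateFontSketch(const std::string &fontName, bool isCID)
{
    if (!isCID) {
        return "Helvetica";     // non-CID fonts always get a substitute
    }
    if (fontName == "MS-Mincho") {
        return "msmincho.ttc";  // placeholder path
    }
    return nullptr;             // no appropriate substitute found
}

// Caller side: treat nullptr as "no substitute", print a warning and
// report failure instead of using the pointer.
bool setupFontSketch(const std::string &fontName, bool isCID,
                     std::string *path)
{
    assert(path != nullptr);
    const char *found = locateFontSketch(fontName, isCID);
    if (found == nullptr) {
        std::fprintf(stderr,
                     "Couldn't find a font for '%s', no substitute\n",
                     fontName.c_str());
        return false;
    }
    *path = found;
    return true;
}
```

The point is only that the "not found" result is handled explicitly by
the caller, so the lookup itself does not need a hardwired font.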

>No, I have to insert additional lines in cidfmap; I attach it.
>mkcidfm.ps doesn't search for Pr-fonts. I added lines for only 4 fonts;
>of course I could do it for the others as well. I just want to show how
>easy it is.

Good to know. I had thought that Ghostscript's current CJK font
handling is not so intelligent, so I expected that having
Ghostscript generate the configuration data would not be a
one-stop solution.

Regards,
mpsuzuki


>I attach my cidfmap (be careful, my Windows home directory is
>f:/windows). With these additional lines I got the attached result, and
>these warnings:
>
>Syntax Error: Couldn't find a font for 'MS-PMincho', subst is 'MS-Mincho'
>Syntax Error: Couldn't find a font for 'MS-Gothic', subst is 'MS-Mincho'
>Syntax Error: Couldn't find a font for 'MS-PGothic', subst is 'MS-Mincho'
>Syntax Error: Couldn't find a font for 'MS-UIGothic', subst is 'MS-Mincho'
>Syntax Error: Couldn't find a font for 'RyuminPr6-Light-Identity-H',
>subst is 'ArialUnicodeMS-JP;'
>Syntax Error: Couldn't find a font for 'RyuminPr6-Light-Identity-H'
>Syntax Error: Couldn't find a font for 'RyuminPr6-Light-Identity-H',
>subst is 'ArialUnicodeMS-JP;'
>Syntax Error: Couldn't find a font for 'RyuminPr6-Light-Identity-H'
>Syntax Error: Couldn't find a font for 'GothicBBBPr6-Medium-Identity-H',
>subst is 'ArialUnicodeMS-JP'
>Syntax Error: Couldn't find a font for 'HiraMinPro-W3-Identity-H', subst
>is 'ArialUnicodeMS-JP'
>Syntax Error: Couldn't find a font for 'HiraKakuStd-W3-Identity-H',
>subst is 'ArialUnicodeMS-JP'
>
>>
>>>> Thus, I'm afraid more effort is needed on the hardwired CID-keyed
>>>> font fallback. At least, using MS-Mincho is not a good idea, and
>>>> an appropriate warning should be printed. Of course, I'm willing to
>>>> work on this issue.
>>> Isn't
>>> error(-1, "Couldn't find a font for '%s', subst is '%s'", 
>>> fontName->getCString(), substFontName);
>>>
>>> an appropriate warning???
>> I think it's slightly insufficient; the substitution of a CID-keyed
>> font by a non-CID-keyed one should be warned about in more detail
>> (Adobe-Japan1 font blah blah blah is substituted by non-CID-keyed
>> blah blah blah).
>You're not completely right: a non-CID-keyed font is still substituted by
>Helvetica, only a CID-keyed font is replaced by MS Mincho. But if you
>want another warning, just feel free to change the code.
>
>Cheers,
>Thomas
>>
>>>> Thomas Freitag wrote:
>>>>> Am 03.03.2012 17:40, schrieb suzuki toshiya:
>>>>>> Hi,
>>>>>>
>>>>>> I'm quite sorry that no CJK helper has got involved in this issue...
>>>>>> Is the required help a rewrite of your patch to fit the poppler
>>>>>> coding convention, for the maintainers working with Unix
>>>>>> systems? If it is possible to do without Visual Studio, I will
>>>>>> try.
>>>>> Hopefully done now. As far as I remember it was your patch (bug 11413)
>>>>> that I just applied to PSOutputDev.cc
>>>>>> BTW, I've not yet checked your patch in detail, but is it
>>>>>> trying to replace all missing (non-embedded) CID-keyed CJK fonts
>>>>>> with MS Mincho? I think that is not a good idea for the users of
>>>>>> Adobe-GB1 (PRC, Singapore), Adobe-CNS1 (Taiwan, Hong Kong),
>>>>>> Adobe-Korea1 (ROK). I'm not sure if Ghostscript does so, but
>>>>>> even if Ghostscript does so, poppler should not follow it.
>>>>>> In fact, the coverage of CJK Ideographs is designed
>>>>>> differently to fit each market.
>>>>> No, it was not my goal to substitute all CID-keyed fonts by MS Mincho.
>>>>> The problem under Windows is just that, if no font with the
>>>>> used name is installed, poppler tried to replace it with Helvetica, but
>>>>> because this is not a CID font no characters at all will be shown. So I
>>>>> thought that MS Mincho is, at least for this case, a better idea as a
>>>>> default CID font.
>>>>> But if you copy the cidfmap produced by mkcidfm.ps from Ghostscript into
>>>>> the poppler data dir, that substitute font table will be used if (and
>>>>> only if) the font is not embedded and not installed under Windows. I hope
>>>>> that works for CJK users; I thought it was better to use an existing
>>>>> substitution algorithm than to do nothing. And as far as I understand
>>>>> mkcidfm.ps, it will also try to find suitable fonts for GB1, CNS1 and the
>>>>> others, but I'm no expert on CJK fonts. The cidfmap I produced on my
>>>>> system would use arialuni.ttf for all CJK fonts, but I have just the
>>>>> Microsoft default fonts.
>>>>>
>>>>> Regards,
>>>>> Thomas