[poppler] [PATCH] per-collection fallback for missing CID-keyed fonts on Win32

Thomas Freitag Thomas.Freitag at kabelmail.de
Sun Apr 1 02:32:43 PDT 2012


On 28.03.2012 20:04, Albert Astals Cid wrote:
> On Tuesday, 27 March 2012, at 21:01:43, suzuki toshiya wrote:
>> Dear Albert,
>>
>> I'm sorry for the late reply to your question. However, the
>> latest patch by Thomas has already invalidated my patch,
>> so please let me reconsider.
> You mean you don't want this committed?
>
> Albert
>
>> Albert Astals Cid wrote:
>>>> * Adobe-CNS1 (Taiwan) ->  fallback to MingLiU
>>>> * Adobe-GB1 (China mainland) ->  fallback to SimSun
>>>> * Adobe-Japan1 (Japan) ->  fallback to MS-Mincho
>>>> * Adobe-Japan2 (Japan) ->  fallback to MS-Mincho
>>>> * Adobe-Korea1 (Republic of Korea) ->  fallback to Batang
>>> Does windows ship with those fonts?
>> Yes, of course, at least, after Windows 2000.
>> * MingLiU is available since Microsoft Windows 95 for Traditional Chinese,
>> * SimSun is available since Microsoft Windows 2000 (at least).
>> * MS-Mincho is available since Microsoft Windows 3.1 for Japanese,
>> * Batang is available since Microsoft Windows 2000 (at least).
>>
>> You may want to see a list showing which versions of Microsoft Windows
>> (or which versions of Microsoft Office) ship which fonts. So would I;
>> please give me more time to check. I checked
>> 	http://www.microsoft.com/typography/fonts/family.aspx ,
>> but it does not list the history before Windows 2000.
I'm still working with Windows XP, and you're right: when I look at 
the link, click on "Find fonts" and select Windows XP, the fonts should 
be bundled. But when I run pdftoppm, it says:

Syntax Error: No display font for 'MingLiU'
Syntax Error: No display font for 'SimSun'
Syntax Error: No display font for 'MS-Mincho'
Syntax Error: No display font for 'Batang'

SimSun, MingLiU and Batang are really not in my Windows font directory, 
and MS-Mincho is not found because its extension is ".ttf" and NOT ".ttc". 
Would it be an idea to ignore non-existent CJK fonts and fall back to 
ArialUnicode in this case?
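
A minimal sketch of that idea, not code from the patch: look up the
font for the collection, check that the file really exists (trying
both ".ttc" and ".ttf"), and otherwise fall back to Arial Unicode.
All names here (FileExists, CollectionFallback, findFontFile,
pickCidFallback) are hypothetical and not poppler API, and the font
file base names other than arialuni are assumptions about a typical
Windows font directory.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical types for illustration only; none of this is poppler API.
// The predicate answers "does this file exist in the font directory?".
using FileExists = std::function<bool(const std::string &)>;

struct CollectionFallback {
    const char *collection; // Adobe character collection name
    const char *baseName;   // ASSUMED font file base name, without extension
};

// Try "<base>.ttc" first, then "<base>.ttf"; return "" if neither exists.
std::string findFontFile(const std::string &base, const FileExists &exists)
{
    static const char *const extensions[] = { ".ttc", ".ttf" };
    for (const char *ext : extensions) {
        const std::string candidate = base + ext;
        if (exists(candidate)) {
            return candidate;
        }
    }
    return "";
}

// Pick a fallback font file for a missing CID-keyed font.
// Returns "" when nothing usable is installed, so the caller must
// still handle the "no font at all" case.
std::string pickCidFallback(const std::string &collection, const FileExists &exists)
{
    // Collection-to-font pairs as proposed in this thread.
    static const std::vector<CollectionFallback> table = {
        { "Adobe-CNS1", "mingliu" },    // MingLiU
        { "Adobe-GB1", "simsun" },      // SimSun
        { "Adobe-Japan1", "msmincho" }, // MS-Mincho
        { "Adobe-Japan2", "msmincho" }, // MS-Mincho
        { "Adobe-Korea1", "batang" },   // Batang
    };
    for (const CollectionFallback &entry : table) {
        if (collection == entry.collection) {
            const std::string file = findFontFile(entry.baseName, exists);
            if (!file.empty()) {
                return file;
            }
            break; // known collection, but its font is not installed
        }
    }
    // Neutral last resort for known and unknown collections alike.
    return findFontFile("arialuni", exists);
}
```

Passing the existence check in as a predicate keeps the lookup
independent of how the Windows font directory is actually found; a
real caller would wrap a file-system test there.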

Sorry for the late test; I was very busy over the last few days,
Thomas
>>
>> Regards,
>> mpsuzuki
>>
>> Albert Astals Cid wrote:
>>> On Tuesday, 27 March 2012, at 00:19:24, mpsuzuki at hiroshima-u.ac.jp
>>> wrote:
>>>> Hi all,
>>> Hi
>>>
>>>> Considering the forthcoming deadline for 0.20 feature
>>>> freeze, here I propose a small patch as a first step
>>>> to better fallback for missing CID-keyed CJK fonts.
>>>>
>>>> As I discussed with Thomas, current poppler always
>>>> tries to use a serif typeface for the Japanese market
>>>> (MS-Mincho) if the user does not provide a special font
>>>> fallback definition table. The attached patch is a small
>>>> enhancement of Thomas's work; it checks the collection
>>>> of the missing CID-keyed font, and if it is a known
>>>> Adobe collection (Adobe-CNS1, -GB1, -Japan1, -Japan2,
>>>> -Korea1), the fallback TrueType font defined for that
>>>> collection is used.
>>>>
>>>> * Adobe-CNS1 (Taiwan) ->  fallback to MingLiU
>>>> * Adobe-GB1 (China mainland) ->  fallback to SimSun
>>>> * Adobe-Japan1 (Japan) ->  fallback to MS-Mincho
>>>> * Adobe-Japan2 (Japan) ->  fallback to MS-Mincho
>>>> * Adobe-Korea1 (Republic of Korea) ->  fallback to Batang
>>> Does windows ship with those fonts?
>>>
>>> Albert
>>>
>>>> I'm working on a further enhancement (I think a missing
>>>> Sans Serif CJK typeface should fall back to another
>>>> Sans Serif CJK typeface, as far as anything is available),
>>>> but the investigation of historical typeface availability
>>>> on Microsoft Windows will need some time.
>>>>
>>>> Regards,
>>>> mpsuzuki
>>>>
>>>>
>>>> On Fri, 23 Mar 2012 22:28:27 +0100
>>>>
>>>> Thomas Freitag<Thomas.Freitag at kabelmail.de>  wrote:
>>>>> On 23.03.2012 14:06, mpsuzuki at hiroshima-u.ac.jp wrote:
>>>>>> On Fri, 23 Mar 2012 08:42:01 +0100
>>>>>>
>>>>>> Thomas Freitag<Thomas.Freitag at kabelmail.de>   wrote:
>>>>>>> On 23.03.2012 08:21, suzuki toshiya wrote:
>>>>>>>> Excuse me, this is the 3rd issue in your first post (adding support
>>>>>>>> for the cidfmap generated by Ghostscript), which is not what I
>>>>>>>> care about.
>>>>>>>> What I care about is the hardwired MS-Mincho fallback.
>>>>>>> It's only hardwired if a CID font is expected and no appropriate
>>>>>>> substitute font is found. Probably a better idea is to use
>>>>>>> arialuni.ttf instead of MS Mincho, but when I started with it, I
>>>>>>> only knew that MS Mincho is always installed and has some CJK chars.
>>>>>> One important problem with using a single CJK font (e.g.
>>>>>> MS Mincho) as a generic fallback is that the character
>>>>>> coverage of CJK fonts depends heavily on the intended
>>>>>> market.
>>>>>>
>>>>>> For example, the fonts designed for China mainland, Taiwan
>>>>>> and Japan are usually missing Hangul. They should not be
>>>>>> used for Adobe-Korea1 fallback. In addition, the fonts
>>>>>> designed for Japan, Taiwan, Korea are usually missing the
>>>>>> simplified characters currently used in China mainland.
>>>>>> Also, the latest version of Adobe-GB1 includes Yi script
>>>>>> (U+A000 - U+A4BF), but the fonts for Taiwan, Japan and Korea
>>>>>> are usually missing them.
>>>>>>
>>>>>> Needless to say, for Japanese customers, using MS Mincho or
>>>>>> MS Gothic as the generic fallback would be better than using
>>>>>> SimSun (for China mainland), MingLiU (for Taiwan) or Batang
>>>>>> (for Korea) as the generic fallback, but it is an unfair solution.
>>>>>>
>>>>>> Using Arial Unicode as the generic fallback would be neutral,
>>>>>> although its typeface quality for CJK scripts is often
>>>>>> criticized. In addition, its vertical writing mode support
>>>>>> is insufficient.
>>>>>>
>>>>>> I attached 1 PDF and 2 pictures; one picture shows the fallback
>>>>>> to MS Mincho, the other the fallback to Arial Unicode.
>>>>>>
>>>>>> Thus, I will propose a patch that provides per-collection
>>>>>> fallback fonts (for Adobe-CNS1, Adobe-GB1, Adobe-Japan1,
>>>>>> Adobe-Japan2, Adobe-Korea1) and finally falls back to Arial
>>>>>> Unicode when no appropriate one is found.
>>>>> In my opinion: sounds great. I implemented my "poor" patch because I
>>>>> sometimes debug problems with CJK fonts under Windows and see nothing.
>>>>> So a better implementation is always welcome for me.
>>>>>
>>>>>>>>> I'm not really sure where the poppler data dir is expected on
>>>>>>>>> MinGW; it should be /usr/local/share/poppler. Otherwise you can
>>>>>>>>> patch the code where the GlobalParams constructor is called, as I
>>>>>>>>> normally do under Windows:
>>>>>>>>>
>>>>>>>>> globalParams = new GlobalParams("E:\\Downloads\\poppler\\poppler-data-0.4.5");
>>>>>>>>>
>>>>>>>>> and copy cidfmap to that directory.
>>>>>>>>> If you don't do this (and only then), all CJK fonts fall back to MS
>>>>>>>>> Mincho.
>>>>>>>> Yes, that (the case without cidfmap) is what I care about. I think
>>>>>>>> "all CJK fonts fall back to MS Mincho" is worse than a fallback to
>>>>>>>> Helvetica, as shown by my 2 sample pictures. BTW, in your
>>>>>>>> environment with the cidfmap generated by Ghostscript, is my sample
>>>>>>>> PDF (referring to CJK CID-keyed fonts) processed correctly?
>>>>>>> We can't fall back to Helvetica if a CID font is expected. In this
>>>>>>> case locateFont returns a NULL pointer!
>>>>>> I think the caller of locateFont should handle the case that
>>>>>> no appropriate substitute font is found (if a NULL pointer
>>>>>> is not appropriate to indicate such a case, some error should
>>>>>> be raised).
>>>>> That's what I did :-). Perhaps the error message is not clear, but that
>>>>> wasn't mine :-)
>>>>>
>>>>>>> No, I have to insert additional lines in cidfmap, which I attach:
>>>>>>> mkcidfm.ps doesn't search for Pr-fonts. I added lines for only 4
>>>>>>> fonts; of course I could also do it for the others. I just want to
>>>>>>> show how easy it is.
>>>>>> Good to know. I was thinking the current Ghostscript CJK font
>>>>>> handling is not so intelligent, so I expected that having
>>>>>> Ghostscript generate some configuration data would not be a
>>>>>> one-stop solution.
>>>>> As I already mentioned, it would be nice if you gave us a better solution.
>>>>>
>>>>> Thanks in advance,
>>>>> Thomas
>>>>>
>>>>>> Regards,
>>>>>> mpsuzuki
>>>>>>
>>>>>>> I attach my cidfmap (be careful, my Windows home directory is
>>>>>>> f:/windows). With these additional lines I got the attached result,
>>>>>>> and these warnings:
>>>>>>>
>>>>>>> Syntax Error: Couldn't find a font for 'MS-PMincho', subst is 'MS-Mincho'
>>>>>>> Syntax Error: Couldn't find a font for 'MS-Gothic', subst is 'MS-Mincho'
>>>>>>> Syntax Error: Couldn't find a font for 'MS-PGothic', subst is 'MS-Mincho'
>>>>>>> Syntax Error: Couldn't find a font for 'MS-UIGothic', subst is 'MS-Mincho'
>>>>>>> Syntax Error: Couldn't find a font for 'RyuminPr6-Light-Identity-H', subst is 'ArialUnicodeMS-JP;'
>>>>>>> Syntax Error: Couldn't find a font for 'RyuminPr6-Light-Identity-H'
>>>>>>> Syntax Error: Couldn't find a font for 'RyuminPr6-Light-Identity-H', subst is 'ArialUnicodeMS-JP;'
>>>>>>> Syntax Error: Couldn't find a font for 'RyuminPr6-Light-Identity-H'
>>>>>>> Syntax Error: Couldn't find a font for 'GothicBBBPr6-Medium-Identity-H', subst is 'ArialUnicodeMS-JP'
>>>>>>> Syntax Error: Couldn't find a font for 'HiraMinPro-W3-Identity-H', subst is 'ArialUnicodeMS-JP'
>>>>>>> Syntax Error: Couldn't find a font for 'HiraKakuStd-W3-Identity-H', subst is 'ArialUnicodeMS-JP'
>>>>>>>
>>>>>>>>>> Thus, I'm afraid more effort is needed for the hardwired CID-keyed
>>>>>>>>>> font fallback. At least, using MS-Mincho is not a good idea, and
>>>>>>>>>> an appropriate warning should be printed. Of course, I'm willing
>>>>>>>>>> to work on this issue.
>>>>>>>>> Isn't
>>>>>>>>> error(-1, "Couldn't find a font for '%s', subst is '%s'",
>>>>>>>>> fontName->getCString(), substFontName);
>>>>>>>>>
>>>>>>>>> an appropriate warning???
>>>>>>>> I think it's slightly insufficient; the substitution of a CID-keyed
>>>>>>>> font by a non-CID-keyed one is warned about in more detail
>>>>>>>> (Adobe-Japan1 font blah blah blah is substituted by non-CID-keyed
>>>>>>>> blah blah blah).
>>>>>>> That's not completely right: a non-CID-keyed font is still
>>>>>>> substituted by Helvetica; only a CID-keyed font is replaced by MS
>>>>>>> Mincho. But if you want another warning, just feel free to change
>>>>>>> the code.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Thomas
>>>>>>>
>>>>>>>>>> Thomas Freitag wrote:
>>>>>>>>>>> On 03.03.2012 17:40, suzuki toshiya wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I'm quite sorry that no CJK helpers got involved in this issue.
>>>>>>>>>>>> Is the required help a rewrite of your patch to fit the poppler
>>>>>>>>>>>> coding convention, for the maintainers working with Unix
>>>>>>>>>>>> systems? If it is possible to do without Visual Studio, I will
>>>>>>>>>>>> try.
>>>>>>>>>>> Hopefully done now. As far as I remember it was your patch (bug
>>>>>>>>>>> 11413) that I just applied to PSOutputDev.cc
>>>>>>>>>>>
>>>>>>>>>>>> BTW, I have not yet checked your patch in detail, but is it
>>>>>>>>>>>> trying to replace all missing (non-embedded) CID-keyed CJK fonts
>>>>>>>>>>>> with MS Mincho? I think that is not a good idea for the users of
>>>>>>>>>>>> Adobe-GB1 (PRC, Singapore), Adobe-CNS1 (Taiwan, HongKong),
>>>>>>>>>>>> Adobe-Korea1 (ROK). I'm not sure if Ghostscript does so, but
>>>>>>>>>>>> even if Ghostscript does so, poppler should not follow it.
>>>>>>>>>>>> In fact, the coverage of CJK Ideographs is designed
>>>>>>>>>>>> differently to fit each market.
>>>>>>>>>>> No, it was not my goal to substitute all CID-keyed fonts by MS Mincho.
>>>>>>>>>>> The problem under Windows is just that if no font with the used name
>>>>>>>>>>> is installed, poppler tried to replace it with Helvetica, but because
>>>>>>>>>>> this is not a CID font, no characters at all will be shown. So I
>>>>>>>>>>> thought that MS Mincho is, at least for this case, a better idea as a
>>>>>>>>>>> default CID font.
>>>>>>>>>>> But if you copy the cidfmap produced by mkcidfm.ps from Ghostscript
>>>>>>>>>>> into the poppler data dir, that substitute font table will be used if
>>>>>>>>>>> (and only if) the font is not embedded and not installed under
>>>>>>>>>>> Windows. I hope that fits for CJK users; I thought it was better to
>>>>>>>>>>> use an existing substitution algorithm than to do nothing. And as far
>>>>>>>>>>> as I understand mkcidfm.ps, it will also try to find suitable fonts
>>>>>>>>>>> for GB1, CNS1 and the others, but I'm no expert on CJK fonts. The
>>>>>>>>>>> cidfmap I produced on my system would use arialuni.ttf for all CJK
>>>>>>>>>>> fonts, but I have just the Microsoft default fonts.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Thomas
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> poppler mailing list
>>>>>>>>>>> poppler at lists.freedesktop.org
>>>>>>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler



