[poppler] [PATCH] per-collection fallback for missing CID-keyed fonts on Win32

Albert Astals Cid aacid at kde.org
Thu Mar 29 10:34:56 PDT 2012


On Thursday, 29 March 2012, at 20:34:01, suzuki toshiya wrote:
> Dear Albert,
> 
> I'm very sorry for the confused status of this issue.
> I have just checked my patch against the latest Git revision
> 	631224dc0c721119c91984f1940c9e51edf17eca
> and it still applies and works.
> So please commit my patch if you find no problem.

Pushed.

Albert

> 
> Regards,
> mpsuzuki
> 
> Albert Astals Cid wrote:
> > On Tuesday, 27 March 2012, at 21:01:43, suzuki toshiya wrote:
> >> Dear Albert,
> >> 
> >> I'm sorry for the late reply to your question. However, the
> >> latest patch by Thomas has already invalidated my patch,
> >> so please let me reconsider.
> > 
> > You mean you don't want this committed?
> > 
> > Albert
> > 
> >> Albert Astals Cid wrote:
> >>>> * Adobe-CNS1 (Taiwan) -> fallback to MingLiU
> >>>> * Adobe-GB1 (China mainland) -> fallback to SimSun
> >>>> * Adobe-Japan1 (Japan) -> fallback to MS-Mincho
> >>>> * Adobe-Japan2 (Japan) -> fallback to MS-Mincho
> >>>> * Adobe-Korea1 (Republic of Korea) -> fallback to Batang
> >>> 
> >>> Does Windows ship with those fonts?
> >> 
> >> Yes, of course, at least since Windows 2000.
> >> * MingLiU has been available since Microsoft Windows 95 for
> >> Traditional Chinese.
> >> * SimSun has been available since Microsoft Windows 2000 (at least).
> >> * MS-Mincho has been available since Microsoft Windows 3.1 for Japanese.
> >> * Batang has been available since Microsoft Windows 2000 (at least).
> >> 
> >> You may want to see a list showing which versions of Microsoft Windows
> >> (or which versions of Microsoft Office) ship which fonts. So would I;
> >> please give me more time to check. I checked
> >> 
> >> 	http://www.microsoft.com/typography/fonts/family.aspx ,
> >> 
> >> but it does not list the history before Windows 2000.
> >> 
> >> Regards,
> >> mpsuzuki
> >> 
> >> Albert Astals Cid wrote:
> >>> On Tuesday, 27 March 2012, at 00:19:24,
> >>> mpsuzuki at hiroshima-u.ac.jp wrote:
> >>>> Hi all,
> >>> 
> >>> Hi
> >>> 
> >>>> Considering the forthcoming deadline for the 0.20 feature
> >>>> freeze, I propose a small patch as a first step toward
> >>>> better fallback for missing CID-keyed CJK fonts.
> >>>> 
> >>>> As I discussed with Thomas, current poppler always
> >>>> tries to use a serif typeface for the Japanese market
> >>>> (MS-Mincho) unless the user provides a special font
> >>>> fallback definition table. The attached patch is a small
> >>>> enhancement of Thomas's work: it checks the collection
> >>>> of the missing CID-keyed font, and if it is a known
> >>>> Adobe collection (Adobe-CNS1, -GB1, -Japan1, -Japan2,
> >>>> -Korea1), the fallback TrueType font defined for that
> >>>> collection is used.
> >>>> 
> >>>> * Adobe-CNS1 (Taiwan) -> fallback to MingLiU
> >>>> * Adobe-GB1 (China mainland) -> fallback to SimSun
> >>>> * Adobe-Japan1 (Japan) -> fallback to MS-Mincho
> >>>> * Adobe-Japan2 (Japan) -> fallback to MS-Mincho
> >>>> * Adobe-Korea1 (Republic of Korea) -> fallback to Batang
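
A minimal sketch of the per-collection lookup described above, assuming a
plain static table. Only the collection-to-font pairs come from the list;
the struct and function names are hypothetical and are not taken from the
patch.

```cpp
#include <cstring>

// Hypothetical table entry: Adobe collection name -> Windows TrueType
// fallback. The names here are illustrative, not from the patch.
struct CollectionFallback {
    const char *collection;
    const char *fontName;
};

// Pairs copied from the list in the proposal.
static const CollectionFallback kFallbacks[] = {
    { "Adobe-CNS1",   "MingLiU"   },
    { "Adobe-GB1",    "SimSun"    },
    { "Adobe-Japan1", "MS-Mincho" },
    { "Adobe-Japan2", "MS-Mincho" },
    { "Adobe-Korea1", "Batang"    },
};

// Returns the fallback font for a known collection, or nullptr when the
// collection is unknown (the caller then decides on a generic fallback).
const char *fallbackForCollection(const char *collection) {
    if (collection == nullptr) {
        return nullptr;
    }
    for (const CollectionFallback &entry : kFallbacks) {
        if (std::strcmp(entry.collection, collection) == 0) {
            return entry.fontName;
        }
    }
    return nullptr;
}
```

With this shape, an unknown collection is reported to the caller rather than
silently mapped to a Japanese typeface.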
> >>> 
> >>> Does Windows ship with those fonts?
> >>> 
> >>> Albert
> >>> 
> >>>> I'm working on a further enhancement (I think a missing
> >>>> sans-serif CJK typeface should fall back to another
> >>>> sans-serif CJK typeface, as long as one is available),
> >>>> but investigating the historical availability of typefaces
> >>>> on Microsoft Windows will take some time.
> >>>> 
> >>>> Regards,
> >>>> mpsuzuki
> >>>> 
> >>>> 
> >>>> On Fri, 23 Mar 2012 22:28:27 +0100
> >>>> 
> >>>> Thomas Freitag <Thomas.Freitag at kabelmail.de> wrote:
> >>>>> On 23.03.2012 14:06, mpsuzuki at hiroshima-u.ac.jp wrote:
> >>>>>> On Fri, 23 Mar 2012 08:42:01 +0100
> >>>>>> 
> >>>>>> Thomas Freitag<Thomas.Freitag at kabelmail.de>  wrote:
> >>>>>>> On 23.03.2012 08:21, suzuki toshiya wrote:
> >>>>>>>> Excuse me, this is the 3rd issue in your first post (adding
> >>>>>>>> support for the cidfmap generated by Ghostscript), which is not
> >>>>>>>> what I care about.
> >>>>>>>> What I care about is the hardwired MS-Mincho fallback.
> >>>>>>> 
> >>>>>>> It's only hardwired if a CID font is expected and no appropriate
> >>>>>>> substitute font is found. Probably a better idea is to use
> >>>>>>> arialuni.ttf instead of MS Mincho, but when I started with it, I
> >>>>>>> only knew that MS Mincho is always installed and has some CJK
> >>>>>>> chars.
> >>>>>> 
> >>>>>> One important problem with using a single CJK font (e.g.
> >>>>>> MS Mincho) as the generic fallback is that the character
> >>>>>> coverage of CJK fonts depends heavily on the intended
> >>>>>> market.
> >>>>>> 
> >>>>>> For example, the fonts designed for China mainland, Taiwan
> >>>>>> and Japan are usually missing Hangul. They should not be
> >>>>>> used for Adobe-Korea1 fallback. In addition, the fonts
> >>>>>> designed for Japan, Taiwan, Korea are usually missing the
> >>>>>> simplified characters currently used in China mainland.
> >>>>>> Also, the latest version of Adobe-GB1 includes Yi script
> >>>>>> (U+A000 - U+A4BF), but the fonts for Taiwan, Japan and Korea
> >>>>>> are usually missing them.
> >>>>>> 
> >>>>>> Needless to say, for Japanese customers, using MS Mincho or
> >>>>>> MS Gothic as the generic fallback would be better than using
> >>>>>> SimSun (for China mainland), MingLiU (for Taiwan) or Batang
> >>>>>> (for Korea), but it is an unfair solution.
> >>>>>> 
> >>>>>> Using Arial Unicode as the generic fallback would be neutral,
> >>>>>> although its typeface quality for CJK scripts is often
> >>>>>> criticized. In addition, its vertical writing mode support
> >>>>>> is insufficient.
> >>>>>> 
> >>>>>> I attached 1 PDF and 2 pictures; one picture shows the
> >>>>>> fallback to MS Mincho, the other the fallback to Arial Unicode.
> >>>>>> 
> >>>>>> Thus, I will propose a patch to prepare per-collection
> >>>>>> fallback fonts (for Adobe-CNS1, Adobe-GB1, Adobe-Japan1,
> >>>>>> Adobe-Japan2, Adobe-Korea1) and finally fall back to Arial
> >>>>>> Unicode when no appropriate one is found.
> >>>>> 
> >>>>> In my opinion: sounds great. I implemented my "poor" patch because I
> >>>>> sometimes debug problems with CJK fonts under Windows and see nothing.
> >>>>> So a better implementation is always welcome for me.
> >>>>> 
> >>>>>>>>> I'm not really sure where the poppler data dir is expected on
> >>>>>>>>> MinGW; it should be /usr/local/share/poppler. Otherwise you can
> >>>>>>>>> patch the code where the GlobalParams constructor is called. I
> >>>>>>>>> normally do this under Windows:
> >>>>>>>>> 
> >>>>>>>>> globalParams = new
> >>>>>>>>> GlobalParams("E:\\Downloads\\poppler\\poppler-data-0.4.5");
> >>>>>>>>> 
> >>>>>>>>> and copy cidfmap to that directory.
> >>>>>>>>> If you don't do this (and only then), all CJK fonts fall back to
> >>>>>>>>> MS Mincho.
> >>>>>>>> 
> >>>>>>>> Yes, that (the case without cidfmap) is what I care about. I think
> >>>>>>>> "all CJK fonts fall back to MS Mincho" is worse than falling back
> >>>>>>>> to Helvetica, as shown by my 2 sample pictures. BTW, in your
> >>>>>>>> environment with the cidfmap generated by Ghostscript, is my
> >>>>>>>> sample PDF (referring to CJK CID-keyed fonts) processed correctly?
> >>>>>>> 
> >>>>>>> We can't fall back to Helvetica if a CID font is expected. In this
> >>>>>>> case locateFont returns a NULL pointer!
> >>>>>> 
> >>>>>> I think the caller of locateFont should handle the case where
> >>>>>> no appropriate substitute font is found (if a NULL pointer
> >>>>>> is not appropriate to indicate such a case, some error should be
> >>>>>> caught).
> >>>>> 
> >>>>> That's what I did :-). Perhaps the error message is not clear, but
> >>>>> that wasn't mine :-)
> >>>>> 
> >>>>>>> No, I have to insert additional lines in cidfmap, which I attach:
> >>>>>>> mkcidfm.ps doesn't search for Pr-fonts. I added lines for only 4
> >>>>>>> fonts; of course I could also do it for the others. I just want to
> >>>>>>> show how easy it is.
> >>>>>> 
> >>>>>> Good to know. I thought the current Ghostscript CJK font
> >>>>>> handling was not very intelligent, so I expected that
> >>>>>> generating configuration data with Ghostscript would not
> >>>>>> be a one-stop solution.
> >>>>> 
> >>>>> As I already mentioned, it would be nice if you gave us a better
> >>>>> solution.
> >>>>> 
> >>>>> Thanks in advance,
> >>>>> Thomas
> >>>>> 
> >>>>>> Regards,
> >>>>>> mpsuzuki
> >>>>>> 
> >>>>>>> I attach my cidfmap (be careful, my Windows home directory is
> >>>>>>> f:/windows). With these additional lines I got the attached result
> >>>>>>> and these warnings:
> >>>>>>> 
> >>>>>>> Syntax Error: Couldn't find a font for 'MS-PMincho', subst is 'MS-Mincho'
> >>>>>>> Syntax Error: Couldn't find a font for 'MS-Gothic', subst is 'MS-Mincho'
> >>>>>>> Syntax Error: Couldn't find a font for 'MS-PGothic', subst is 'MS-Mincho'
> >>>>>>> Syntax Error: Couldn't find a font for 'MS-UIGothic', subst is 'MS-Mincho'
> >>>>>>> Syntax Error: Couldn't find a font for 'RyuminPr6-Light-Identity-H', subst is 'ArialUnicodeMS-JP;'
> >>>>>>> Syntax Error: Couldn't find a font for 'RyuminPr6-Light-Identity-H'
> >>>>>>> Syntax Error: Couldn't find a font for 'RyuminPr6-Light-Identity-H', subst is 'ArialUnicodeMS-JP;'
> >>>>>>> Syntax Error: Couldn't find a font for 'RyuminPr6-Light-Identity-H'
> >>>>>>> Syntax Error: Couldn't find a font for 'GothicBBBPr6-Medium-Identity-H', subst is 'ArialUnicodeMS-JP'
> >>>>>>> Syntax Error: Couldn't find a font for 'HiraMinPro-W3-Identity-H', subst is 'ArialUnicodeMS-JP'
> >>>>>>> Syntax Error: Couldn't find a font for 'HiraKakuStd-W3-Identity-H', subst is 'ArialUnicodeMS-JP'
> >>>>>>> 
> >>>>>>>>>> Thus, I'm afraid more effort is needed for the hardwired CID-keyed
> >>>>>>>>>> font fallback. At least, using MS-Mincho is not a good idea, and
> >>>>>>>>>> an appropriate warning should be printed. Of course, I'm willing to
> >>>>>>>>>> work on this issue.
> >>>>>>>>> 
> >>>>>>>>> Isn't
> >>>>>>>>> error(-1, "Couldn't find a font for '%s', subst is '%s'",
> >>>>>>>>> fontName->getCString(), substFontName);
> >>>>>>>>> 
> >>>>>>>>> an appropriate warning???
> >>>>>>>> 
> >>>>>>>> I think it's slightly insufficient; substitution of a CID-keyed
> >>>>>>>> font by a non-CID-keyed one should be warned about in more detail
> >>>>>>>> (Adobe-Japan1 font blah blah blah is substituted by non-CID-keyed
> >>>>>>>> blah blah blah).
> >>>>>>> 
> >>>>>>> That's not completely right: a non-CID-keyed font is still
> >>>>>>> substituted by Helvetica; only a CID-keyed font is replaced by
> >>>>>>> MS Mincho. But if you want another warning, feel free to change
> >>>>>>> the code.
> >>>>>>> 
> >>>>>>> Cheers,
> >>>>>>> Thomas
> >>>>>>> 
> >>>>>>>>>> Thomas Freitag wrote:
> >>>>>>>>>>> On 03.03.2012 17:40, suzuki toshiya wrote:
> >>>>>>>>>>>> Hi,
> >>>>>>>>>>>> 
> >>>>>>>>>>>> I'm very sorry that no CJK helpers have gotten involved in this
> >>>>>>>>>>>> issue... Is the required help a rewrite of your patch to fit the
> >>>>>>>>>>>> poppler coding conventions, for the maintainers working on Unix
> >>>>>>>>>>>> systems? If it is possible to do without Visual Studio, I will
> >>>>>>>>>>>> try.
> >>>>>>>>>>> 
> >>>>>>>>>>> Hopefully done now. As far as I remember it was your patch (bug
> >>>>>>>>>>> 11413) I just applied to PSOutputDev.cc
> >>>>>>>>>>> 
> >>>>>>>>>>>> BTW, I've not yet checked your patch in detail; is your patch
> >>>>>>>>>>>> trying to replace all missing (non-embedded) CID-keyed CJK
> >>>>>>>>>>>> fonts with MS Mincho? I think that is not a good idea for the
> >>>>>>>>>>>> users of Adobe-GB1 (PRC, Singapore), Adobe-CNS1 (Taiwan, Hong
> >>>>>>>>>>>> Kong), Adobe-Korea1 (ROK). I'm not sure if Ghostscript does so,
> >>>>>>>>>>>> but even if Ghostscript does so, poppler should not follow it.
> >>>>>>>>>>>> In fact, the coverage of CJK ideographs is designed
> >>>>>>>>>>>> differently to fit each market.
> >>>>>>>>>>> 
> >>>>>>>>>>> No, it was not my goal to substitute all CID-keyed fonts with MS
> >>>>>>>>>>> Mincho. The problem under Windows is just that if no font with
> >>>>>>>>>>> the used name is installed, poppler tried to replace it with
> >>>>>>>>>>> Helvetica, but because this is not a CID font, no characters at
> >>>>>>>>>>> all will be shown. So I thought that MS Mincho is, at least for
> >>>>>>>>>>> this case, a better idea as a default CID font.
> >>>>>>>>>>> But if you copy the cidfmap produced by mkcidfm.ps from
> >>>>>>>>>>> Ghostscript into the poppler data dir, that substitute font
> >>>>>>>>>>> table will be used if (and only if) the font is not embedded and
> >>>>>>>>>>> not installed under Windows. I hope that works for CJK users; I
> >>>>>>>>>>> thought it was better to use an existing substitution algorithm
> >>>>>>>>>>> than to do nothing. And as far as I understand mkcidfm.ps, it
> >>>>>>>>>>> will also try to find suitable fonts for GB1, CNS1 and the
> >>>>>>>>>>> others, but I'm no expert on CJK fonts. The cidfmap I produced
> >>>>>>>>>>> on my system would use arialuni.ttf for all CJK fonts, but I
> >>>>>>>>>>> have just the Microsoft default fonts.
> >>>>>>>>>>> 
> >>>>>>>>>>> Regards,
> >>>>>>>>>>> Thomas
> >>>>>>>>>>> 
> >>>>>>>>>>> 
> >>>>>>>>>>> _______________________________________________
> >>>>>>>>>>> poppler mailing list
> >>>>>>>>>>> poppler at lists.freedesktop.org
> >>>>>>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
