[poppler] [PATCH] per-collection fallback for missing CID-keyed fonts on Win32

Albert Astals Cid aacid at kde.org
Wed Mar 28 11:04:40 PDT 2012


On Tuesday, 27 March 2012, at 21:01:43, suzuki toshiya wrote:
> Dear Albert,
> 
> I'm sorry for the late reply to your question. However, the
> latest patch by Thomas has already invalidated my patch,
> so please let me reconsider.

You mean you don't want this committed?

Albert

> 
> Albert Astals Cid wrote:
> >> * Adobe-CNS1 (Taiwan) -> fallback to MingLiU
> >> * Adobe-GB1 (China mainland) -> fallback to SimSun
> >> * Adobe-Japan1 (Japan) -> fallback to MS-Mincho
> >> * Adobe-Japan2 (Japan) -> fallback to MS-Mincho
> >> * Adobe-Korea1 (Republic of Korea) -> fallback to Batang
> > 
> > Does windows ship with those fonts?
> 
> Yes, of course, at least, after Windows 2000.
> * MingLiU is available since Microsoft Windows 95 for Traditional Chinese,
> * SimSun is available since Microsoft Windows 2000 (at least).
> * MS-Mincho is available since Microsoft Windows 3.1 for Japanese,
> * Batang is available since Microsoft Windows 2000 (at least).
> 
> You may want to see a list showing which versions of Microsoft Windows
> (or which versions of Microsoft Office) ship which fonts. So would I; please
> give me more time to check. I checked
> 	http://www.microsoft.com/typography/fonts/family.aspx ,
> but it does not list the history before Windows 2000.
> 
> Regards,
> mpsuzuki
> 
> Albert Astals Cid wrote:
> > On Tuesday, 27 March 2012, at 00:19:24, mpsuzuki at hiroshima-u.ac.jp wrote:
> >> Hi all,
> > 
> > Hi
> > 
> >> Considering the forthcoming deadline for the 0.20 feature
> >> freeze, I propose a small patch as a first step
> >> toward better fallback for missing CID-keyed CJK fonts.
> >> 
> >> As I discussed with Thomas, current poppler always
> >> tries to use a serif typeface for the Japanese market
> >> (MS-Mincho) if the user does not define a special font
> >> fallback table. The attached patch is a small
> >> enhancement of Thomas's work; it checks the collection
> >> of the missing CID-keyed font, and if it is a known
> >> Adobe collection (Adobe-CNS1, -GB1, -Japan1, -Japan2,
> >> -Korea1), the fallback TrueType font defined for that
> >> collection is used.
> >> 
> >> * Adobe-CNS1 (Taiwan) -> fallback to MingLiU
> >> * Adobe-GB1 (China mainland) -> fallback to SimSun
> >> * Adobe-Japan1 (Japan) -> fallback to MS-Mincho
> >> * Adobe-Japan2 (Japan) -> fallback to MS-Mincho
> >> * Adobe-Korea1 (Republic of Korea) -> fallback to Batang
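
The per-collection lookup listed above could be sketched roughly as
follows. This is only an illustration under stated assumptions: the
function name fallbackFontForCollection and the use of std::map are
hypothetical and are not taken from the actual patch; only the
collection-to-font pairs come from the list above.

```cpp
#include <map>
#include <string>

// Hypothetical helper, not poppler's actual API: maps an Adobe character
// collection name to the Windows TrueType fallback proposed in this thread.
// Returns an empty string for unknown collections so that the caller can
// decide on a further fallback.
std::string fallbackFontForCollection(const std::string &collection)
{
    static const std::map<std::string, std::string> table = {
        { "Adobe-CNS1",   "MingLiU"   },
        { "Adobe-GB1",    "SimSun"    },
        { "Adobe-Japan1", "MS-Mincho" },
        { "Adobe-Japan2", "MS-Mincho" },
        { "Adobe-Korea1", "Batang"    },
    };
    const auto it = table.find(collection);
    return it != table.end() ? it->second : std::string();
}
```

Returning an empty string for an unknown collection keeps the decision
about a last-resort font with the caller.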
> > 
> > Does windows ship with those fonts?
> > 
> > Albert
> > 
> >> I'm working on a further enhancement (I think a missing
> >> Sans Serif CJK typeface should fall back to another
> >> Sans Serif CJK typeface, as long as one is available),
> >> but investigating the historical availability of typefaces
> >> on Microsoft Windows will need some time.
> >> 
> >> Regards,
> >> mpsuzuki
> >> 
> >> 
> >> On Fri, 23 Mar 2012 22:28:27 +0100
> >> 
> >> Thomas Freitag <Thomas.Freitag at kabelmail.de> wrote:
> >>> On 23.03.2012 14:06, mpsuzuki at hiroshima-u.ac.jp wrote:
> >>>> On Fri, 23 Mar 2012 08:42:01 +0100
> >>>> 
> >>>> Thomas Freitag<Thomas.Freitag at kabelmail.de>  wrote:
> >>>>> On 23.03.2012 08:21, suzuki toshiya wrote:
> >>>>>> Excuse me, that is the 3rd issue in your first post (adding support
> >>>>>> for the cidfmap generated by ghostscript), which is not what I
> >>>>>> care about.
> >>>>>> What I care about is the hardwired MS-Mincho fallback.
> >>>>> 
> >>>>> It's only hardwired if a CID font is expected and no appropriate
> >>>>> substitute font is found. Probably a better idea is to use
> >>>>> arialuni.ttf instead of MS Mincho, but when I started with it, I only
> >>>>> knew that MS Mincho is always installed and has some CJK chars.
> >>>> 
> >>>> One of the important problems in using a single CJK font (e.g.
> >>>> MS Mincho) as a generic fallback is that the character coverage
> >>>> of a CJK font is highly dependent on its intended
> >>>> market.
> >>>> 
> >>>> For example, the fonts designed for China mainland, Taiwan
> >>>> and Japan are usually missing Hangul. They should not be
> >>>> used for Adobe-Korea1 fallback. In addition, the fonts
> >>>> designed for Japan, Taiwan, Korea are usually missing the
> >>>> simplified characters currently used in China mainland.
> >>>> Also, the latest version of Adobe-GB1 includes Yi script
> >>>> (U+A000 - U+A4BF), but the fonts for Taiwan, Japan and Korea
> >>>> are usually missing them.
> >>>> 
> >>>> Needless to say, for Japanese customers, using MS Mincho or
> >>>> MS Gothic as the generic fallback would be better than using
> >>>> SimSun (for China mainland), MingLiU (for Taiwan) or Batang
> >>>> (for Korea), but it is an unfair solution.
> >>>> 
> >>>> Using Arial Unicode as the generic fallback would be neutral,
> >>>> although its typeface quality for CJK scripts is often
> >>>> criticized. In addition, its vertical writing mode support
> >>>> is insufficient.
> >>>> 
> >>>> I attached 1 PDF and 2 pictures; one picture shows the fallback
> >>>> to MS Mincho, the other the fallback to Arial Unicode.
> >>>> 
> >>>> Thus, I will propose a patch to prepare per-collection
> >>>> fallback fonts (for Adobe-CNS1, Adobe-GB1, Adobe-Japan1,
> >>>> Adobe-Japan2, Adobe-Korea1) and finally fall back to Arial
> >>>> Unicode when no appropriate one is found.
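
A rough sketch of that resolution order, for illustration only: try the
per-collection candidates in turn and use Arial Unicode as the last
resort. The function pickFallback and the idea of passing the installed
fonts as a std::set are assumptions made for this example, not how poppler
enumerates fonts. The sketch also does not verify that the last-resort
font itself is installed.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical sketch: return the first candidate that is present in
// `installed`, or `lastResort` when none of the candidates is available.
std::string pickFallback(const std::vector<std::string> &candidates,
                         const std::set<std::string> &installed,
                         const std::string &lastResort = "Arial Unicode MS")
{
    for (const std::string &name : candidates) {
        if (installed.count(name) != 0) {
            return name;
        }
    }
    return lastResort;
}
```

Keeping the candidates in an ordered list would also leave room for the
serif/sans-serif preference discussed elsewhere in this thread.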
> >>> 
> >>> In my opinion: sounds great. I implemented my "poor" patch because I
> >>> sometimes debug problems with CJK fonts under Windows and see nothing.
> >>> So a better implementation is always welcome for me.
> >>> 
> >>>>>>> I'm not really sure where the poppler data dir is expected on
> >>>>>>> MinGW; it should be /usr/local/share/poppler. Otherwise you can
> >>>>>>> patch the code where the GlobalParams constructor is called, as I
> >>>>>>> normally do under Windows:
> >>>>>>> 
> >>>>>>> globalParams = new
> >>>>>>> GlobalParams("E:\\Downloads\\poppler\\poppler-data-0.4.5");
> >>>>>>> 
> >>>>>>> and copy cidfmap to that directory.
> >>>>>>> If you don't do this (and only then), all CJK fonts fall back to
> >>>>>>> MS Mincho.
> >>>>>> 
> >>>>>> Yes, it (the case without cidfmap) is what I care about. I think
> >>>>>> "all CJK fonts fall back to MS Mincho" is worse than fallback to
> >>>>>> Helvetica, as shown by my 2 sample pictures. BTW, in your environment
> >>>>>> with the cidfmap generated by ghostscript, is my sample PDF (referring
> >>>>>> to CJK CID-keyed fonts) processed correctly?
> >>>>> 
> >>>>> We can't fall back to Helvetica if a CID font is expected. In this
> >>>>> case locateFont returns a NULL pointer!
> >>>> 
> >>>> I think the caller of locateFont should be prepared for the case
> >>>> that no appropriate substitute font is found (if a NULL pointer
> >>>> is not an appropriate way to indicate that case, some error should
> >>>> be raised).
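
To make the point about the caller concrete, here is a minimal sketch,
assuming a lookup that signals "nothing found" with a null pointer. All
names (FontLocation, locateFontSketch, loadFontSketch) and the file name
are hypothetical stand-ins for this example; they are not poppler's
locateFont or its return type.

```cpp
#include <iostream>
#include <memory>
#include <string>

// Hypothetical stand-in for a font lookup result.
struct FontLocation {
    std::string path;
};

// Hypothetical lookup: knows a single font and returns a null pointer
// for everything else.
std::unique_ptr<FontLocation> locateFontSketch(const std::string &name)
{
    if (name == "MS-Mincho") {
        // Illustrative file name only.
        return std::unique_ptr<FontLocation>(new FontLocation{"msmincho.ttc"});
    }
    return nullptr;
}

// The caller checks for the null pointer and reports the failure instead
// of dereferencing the result.
bool loadFontSketch(const std::string &name, std::string &pathOut)
{
    const std::unique_ptr<FontLocation> loc = locateFontSketch(name);
    if (!loc) {
        std::cerr << "Couldn't find a font for '" << name << "'\n";
        return false;
    }
    pathOut = loc->path;
    return true;
}
```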
> >>> 
> >>> That's what I did :-). Perhaps the error message is not clear, but that
> >>> wasn't mine :-)
> >>> 
> >>>>> No, I have to insert additional lines in cidfmap, which I attach:
> >>>>> mkcidfm.ps doesn't search for Pr-fonts. I added lines for only 4
> >>>>> fonts; of course I could also do it for the others. I just want to
> >>>>> show how easy it is.
> >>>> 
> >>>> Good to know. I was thinking that current Ghostscript CJK font
> >>>> handling is not so intelligent, so I expected that having
> >>>> Ghostscript generate some configuration data would not be a one-stop
> >>>> solution.
> >>> 
> >>> As I already mentioned, it would be nice if you gave us a better solution.
> >>> 
> >>> Thanks in advance,
> >>> Thomas
> >>> 
> >>>> Regards,
> >>>> mpsuzuki
> >>>> 
> >>>>> I attach my cidfmap (be careful, my Windows home directory is
> >>>>> f:/windows). With these additional lines I got the attached result
> >>>>> and these warnings:
> >>>>> 
> >>>>> Syntax Error: Couldn't find a font for 'MS-PMincho', subst is 'MS-Mincho'
> >>>>> Syntax Error: Couldn't find a font for 'MS-Gothic', subst is 'MS-Mincho'
> >>>>> Syntax Error: Couldn't find a font for 'MS-PGothic', subst is 'MS-Mincho'
> >>>>> Syntax Error: Couldn't find a font for 'MS-UIGothic', subst is 'MS-Mincho'
> >>>>> Syntax Error: Couldn't find a font for 'RyuminPr6-Light-Identity-H', subst is 'ArialUnicodeMS-JP;'
> >>>>> Syntax Error: Couldn't find a font for 'RyuminPr6-Light-Identity-H'
> >>>>> Syntax Error: Couldn't find a font for 'RyuminPr6-Light-Identity-H', subst is 'ArialUnicodeMS-JP;'
> >>>>> Syntax Error: Couldn't find a font for 'RyuminPr6-Light-Identity-H'
> >>>>> Syntax Error: Couldn't find a font for 'GothicBBBPr6-Medium-Identity-H', subst is 'ArialUnicodeMS-JP'
> >>>>> Syntax Error: Couldn't find a font for 'HiraMinPro-W3-Identity-H', subst is 'ArialUnicodeMS-JP'
> >>>>> Syntax Error: Couldn't find a font for 'HiraKakuStd-W3-Identity-H', subst is 'ArialUnicodeMS-JP'
> >>>>> 
> >>>>>>>> Thus, I'm afraid more effort is needed for the hardwired CID-keyed
> >>>>>>>> font fallback. At least, using MS-Mincho is not a good idea, and
> >>>>>>>> an appropriate warning should be printed. Of course, I'm willing to
> >>>>>>>> work on this issue.
> >>>>>>> 
> >>>>>>> Isn't
> >>>>>>> error(-1, "Couldn't find a font for '%s', subst is '%s'",
> >>>>>>> fontName->getCString(), substFontName);
> >>>>>>> 
> >>>>>>> an appropriate warning???
> >>>>>> 
> >>>>>> I think it's slightly insufficient; the substitution of a CID-keyed
> >>>>>> font by a non-CID-keyed one should be warned about in more detail
> >>>>>> (Adobe-Japan1 font blah blah blah is substituted by non-CID-keyed
> >>>>>> blah blah blah).
> >>>>> 
> >>>>> That's not completely right: a non-CID-keyed font is still
> >>>>> substituted by Helvetica; only a CID-keyed font is replaced by
> >>>>> MS Mincho. But if you want another warning, just feel free to
> >>>>> change the code.
> >>>>> 
> >>>>> Cheers,
> >>>>> Thomas
> >>>>> 
> >>>>>>>> Thomas Freitag wrote:
> >>>>>>>>> On 03.03.2012 17:40, suzuki toshiya wrote:
> >>>>>>>>>> Hi,
> >>>>>>>>>> 
> >>>>>>>>>> I'm quite sorry that no CJK helpers got involved in this issue...
> >>>>>>>>>> Is the required help a rewrite of your patch to fit the poppler
> >>>>>>>>>> coding convention, for the maintainers working with Unix
> >>>>>>>>>> systems? If it is possible to do without Visual Studio, I will
> >>>>>>>>>> try.
> >>>>>>>>> 
> >>>>>>>>> Hopefully done now. As far as I remember it was your patch (bug
> >>>>>>>>> 11413) that I just applied to PSOutputDev.cc
> >>>>>>>>> 
> >>>>>>>>>> BTW, I've not yet checked your patch in detail, but is your patch
> >>>>>>>>>> trying to substitute MS Mincho for all missing (non-embedded)
> >>>>>>>>>> CID-keyed CJK fonts? I think that is not a good idea for the users
> >>>>>>>>>> of Adobe-GB1 (PRC, Singapore), Adobe-CNS1 (Taiwan, HongKong),
> >>>>>>>>>> Adobe-Korea1 (ROK). I'm not sure if Ghostscript does so, but
> >>>>>>>>>> even if Ghostscript does so, poppler should not follow it.
> >>>>>>>>>> In fact, the coverage of CJK Ideographs is designed
> >>>>>>>>>> differently to fit each market.
> >>>>>>>>> 
> >>>>>>>>> No, it was not my goal to substitute all CID keyed fonts by MS
> >>>>>>>>> Mincho. The problem under Windows is just, that if there is no font
> >>>>>>>>> with the used name installed, poppler tried to replace it with
> >>>>>>>>> Helvetica, but because this is not a CID font no characters at all
> >>>>>>>>> will be shown. So I thought that MS Mincho is at least for this case
> >>>>>>>>> a better idea as a default CID font.
> >>>>>>>>> But if You copy the cidfmap produced by mkcidfm.ps from ghostscript
> >>>>>>>>> in the poppler data dir, that substitute font table will be used if
> >>>>>>>>> (and only if) the font is not embedded and not installed under
> >>>>>>>>> windows. Hope that fits for CJK users, I thought it was better to
> >>>>>>>>> use an existing substitution algorithm than do nothing. And as far
> >>>>>>>>> as I understand mkcidfm.ps it will also try to find suitable fonts
> >>>>>>>>> for GB1, CNS1 and the others, but I'm no expert for CJK fonts. The
> >>>>>>>>> cidfmap I produced on my system would use arialuni.ttf for all CJK
> >>>>>>>>> fonts, but I have just the Microsoft default fonts.
> >>>>>>>>> 
> >>>>>>>>> Regards,
> >>>>>>>>> Thomas
> >>>>>>>>> 
> >>>>>>>>> 
> >>>>>>>>> _______________________________________________
> >>>>>>>>> poppler mailing list
> >>>>>>>>> poppler at lists.freedesktop.org
> >>>>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler