[poppler] Support for CJK fonts in postscript / on Windows platforms

Thomas Freitag Thomas.Freitag at kabelmail.de
Fri Mar 23 00:00:01 PDT 2012


On 23.03.2012 06:08, suzuki toshiya wrote:
> Hi Thomas,
>
> I've tested latest git HEAD including your patch substituting missing
> CID-keyed fonts to MS-Mincho, by MinGW binary. I'm sorry to say such,
> but I wonder what is the advantage.
>
> I'm usually using GNU/Linux and other Unix-like systems, and
> I'm unfamiliar with how Win32 people configure & use poppler, so
> maybe I made some misunderstanding about your motivation. Please
> let me confirm about the background of MS-Mincho fallback.
>
> In your first post, you commented that substitution of unembedded
> CID-keyed by Helvetica causes a crash, so, the substitution by
> MS-Mincho (or other CJK fonts?) is expected.
> 	http://lists.freedesktop.org/archives/poppler/2012-February/008715.html
> The crashing problem is very critical, so even if the substitution
> by MS-Mincho cannot display the glyphs correctly, some fix is needed.
> I agree.
>
> But, for first, I could not reproduce the crashing problem by my
> binaries (pdffonts, pdftoppm and pdftops). Could you tell me more
> about your crashing problem (how to reproduce, sample PDF, etc)?
You cannot reproduce it anymore because I fixed that, too. When a CID
font is expected but Helvetica is taken, the call

fontLoc = font->locateFont(xref, gTrue);

returned NULL, which then caused a crash.
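
The kind of guard that avoids this crash can be sketched as follows. This
is only an illustration, not the actual poppler fix: FontLocSketch,
locateFontSketch and useFontSketch are hypothetical stand-ins, and the
rule that Helvetica cannot serve a CID font is hardcoded as an assumption.

```cpp
#include <cassert>
#include <cstdio>
#include <memory>
#include <string>

// Hypothetical stand-in for a located font file; not poppler's real class.
struct FontLocSketch {
  std::string path;
};

// Hypothetical lookup. Assumption: a non-CID font such as Helvetica cannot
// stand in for a CID-keyed font, so the lookup yields a null pointer.
std::unique_ptr<FontLocSketch> locateFontSketch(const std::string &name,
                                                bool cidExpected) {
  if (cidExpected && name == "Helvetica") {
    return nullptr;
  }
  return std::unique_ptr<FontLocSketch>(new FontLocSketch{"fonts/" + name});
}

// Returns true if the font can be used, false if the caller must skip it.
bool useFontSketch(const std::string &name, bool cidExpected) {
  std::unique_ptr<FontLocSketch> fontLoc = locateFontSketch(name, cidExpected);
  if (!fontLoc) {
    // Report and bail out instead of dereferencing a null pointer.
    std::fprintf(stderr, "Couldn't locate a usable font for '%s'\n",
                 name.c_str());
    return false;
  }
  assert(!fontLoc->path.empty());
  return true;
}
```

The point is only that the result of the lookup is checked before it is
used; what the caller does after a failed lookup is left open here.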
>
> Secondary, the substitution by MS-Mincho makes the problematic
> substitution unclear. I attached a PDF created by Microsoft Word
> and Adobe Acrobat; the smallest PDF that no fonts are embedded
> (I think the combination of Microsoft Word + Adobe Acrobat is
> the one of the most widely used workflow for PDF).
>
> Please compare attached 2 PNG pictures converted by pdftoppm with
> or without MS-Mincho fallback.
> * A PNG image created without MS-Mincho fallback show many squares
> for the glyphs that cannot be shown by Helvetica. It is easy for
> the users to find something goes wrong. Also, ASCII characters are
> shown correctly.
> * A PNG image created with MS-Mincho fallback show nothing at the
> positions where some glyphs are shown originally. It is difficult
> for the users to find something goes wrong. Also, no ASCII characters
> are shown.
As I explained, you should install Ghostscript and then run

gswin32c -q -dBATCH -sFONTDIR=<windows font directory>
-sCIDFMAP=<poppler data dir>/cidfmap mkcidfm.ps

I'm not really sure where the poppler data dir is expected on MinGW; it
should be /usr/local/share/poppler. Otherwise you can patch the code
where the GlobalParams constructor is called, as I normally do under
Windows:

globalParams = new GlobalParams("E:\\Downloads\\poppler\\poppler-data-0.4.5");

and copy cidfmap to that directory.
If you don't do this (and only then), all CJK fonts fall back to MS Mincho.
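
The substitution order described above can be sketched roughly like this.
It is a simplified model, not the poppler code: chooseCIDFontSketch is a
hypothetical helper, an empty map stands for a missing cidfmap file, and
falling back to MS-Mincho when the table has no entry for a font is my
assumption for the sketch.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical helper modelling the lookup order for a non-embedded
// CID-keyed font:
//   1. a font of that name installed on the system,
//   2. an entry in the cidfmap table produced by mkcidfm.ps,
//   3. MS-Mincho as the last resort.
std::string chooseCIDFontSketch(
    const std::string &requested,
    const std::set<std::string> &installed,
    const std::map<std::string, std::string> &cidfmap) {
  assert(!requested.empty());
  if (installed.count(requested) > 0) {
    return requested;  // installed font wins, no substitution needed
  }
  auto it = cidfmap.find(requested);
  if (it != cidfmap.end()) {
    return it->second;  // substitute taken from the cidfmap table
  }
  return "MS-Mincho";  // no table (or no entry): hardwired fallback
}
```

For example, with an illustrative table mapping "Ryumin-Light" to
"ArialUnicodeMS", a request for "Ryumin-Light" yields "ArialUnicodeMS";
with an empty table the same request yields "MS-Mincho".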
>
> Thus, I'm afraid more efforts are needed for hardwired CID-keyed
> font fallback. At least, using MS-Mincho is not good idea, and,
> appropriate warning should be printed. Of course, I'm willing to
> work for this issue.
Isn't
error(-1, "Couldn't find a font for '%s', subst is '%s'", 
fontName->getCString(), substFontName);

an appropriate warning?

Regards,
Thomas
>
> Regards,
> mpsuzuki
>
>
> Thomas Freitag wrote:
>> On 03.03.2012 17:40, suzuki toshiya wrote:
>>> Hi,
>>>
>>> I'm quite sorry for that no CJK helpers involves this issue...
>>> The required help is a rewrite of your patch to fit the poppler
>>> coding convention, and for the maintainers working with Unix
>>> systems? If it is possible to do without Visual Studio, I will
>>> try.
>> Hopefully done now. As far as I remember, it was your patch (bug 11413)
>> that I just applied to PSOutputDev.cc
>>> BTW, yet I've not checked your patch in detail, your patch is
>>> trying to convert all missing (non-embedded) CID-keyed CJK fonts
>>> by MS Mincho? I think it is not good idea for the users of
>>> Adobe-GB1 (PRC, Singapore), Adobe-CNS1 (Taiwan, HongKong),
>>> Adobe-Korea1 (ROK). I'm not sure if Ghostscript does so, but
>>> even if Ghostscript does so, poppler should not follow it.
>>> In fact, the coverage of CJK Ideographs are differently
>>> designed to fit to each markets.
>> No, it was not my goal to substitute all CID keyed fonts by MS Mincho.
>> The problem under Windows is just, that if there is no font with the
>> used name installed, poppler tried to replace it with Helvetica, but
>> because this is not a CID font no characters at all will be shown. So I
>> thought that MS Mincho is at least for this case a better idea as a
>> default CID font.
>> But if you copy the cidfmap produced by mkcidfm.ps from Ghostscript into
>> the poppler data dir, that substitute font table will be used if (and
>> only if) the font is not embedded and not installed under Windows. Hope
>> that fits for CJK users, I thought it was better to use an existing
>> substitution algorithm than do nothing. And as far as I understand
>> mkcidfm.ps it will also try to find suitable fonts for GB1, CNS1 and the
>> others, but I'm no expert for CJK fonts. The cidfmap I produced on my
>> system would use arialuni.ttf for all CJK fonts, but I have just the
>> Microsoft default fonts.
>>
>> Regards,
>> Thomas
>>
>>
>> _______________________________________________
>> poppler mailing list
>> poppler at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/poppler
