[poppler] How to recognize the Japan Font.

Zhong, Steven Steven.Zhong at fil.com
Wed May 29 06:10:50 UTC 2019


Hi Suzuki-san,

We have installed the latest poppler and poppler-data, but the result is the same.

By the way, we also can't copy the content correctly from the PDF on Windows 10 with Ctrl+C and Ctrl+V. Thanks.

root at 08a02db0d267:/home/vcap/app/pop/bin# ./pdfinfo -v
pdfinfo version 0.77.0
Copyright 2005-2019 The Poppler Developers - http://poppler.freedesktop.org
Copyright 1996-2011 Glyph & Cog, LLC
root at 08a02db0d267:/home/vcap/app/pop/bin#



head: cannot open '10' for reading: No such file or directory
==> sss <==
㻝㻛㻥

䚷

タᐃ᪥䠖㻞㻜㻝㻡ᖺ㻝㻞᭶㻣᪥
ಙクᮇ㛫䠖㻞㻜㻝㻡ᖺ㻝㻞᭶㻣᪥䛛䜙㻞㻜㻟㻝ᖺ㻥᭶㻞㻡᪥䜎䛷
Ỵ⟬᪥䠖ཎ๎䛸䛧䛶ẖᖺ㻥᭶㻞㻡᪥䠄ఇᴗ᪥䛾ሙྜ䛿⩣Ⴀᴗ᪥䠅
䈜ᙜヱᐇ⦼䛿㐣ཤ䛾䜒䛾䛷䛒䜚䚸ᑗ᮶䛾㐠⏝ᡂᯝ➼䜢ಖド䛩䜛䜒䛾䛷䛿䛒䜚䜎䛫䜣䚹

䕔ᇶ‽౯㢠䞉⣧㈨⏘⥲㢠䛾᥎⛣
root at 08a02db0d267:/home/vcap/app/pop/bin#
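A rough way to check whether extracted text is really Japanese, instead of judging it by eye, is to count how many characters fall in the Unicode blocks that ordinary Japanese text uses. The sketch below is only a heuristic under my own assumptions: the name `japanese_ratio` and the choice of blocks are hypothetical and are not part of poppler.

```python
# Heuristic sketch (not part of poppler): estimate how much of a string
# lies in Unicode blocks commonly used by Japanese text.

# Assumed block list, chosen for illustration only.
JAPANESE_BLOCKS = (
    (0x3000, 0x303F),  # CJK symbols and punctuation
    (0x3040, 0x309F),  # hiragana
    (0x30A0, 0x30FF),  # katakana
    (0x4E00, 0x9FFF),  # CJK unified ideographs
    (0xFF00, 0xFFEF),  # halfwidth and fullwidth forms
)


def japanese_ratio(text):
    """Return the fraction of non-whitespace characters in JAPANESE_BLOCKS."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    hits = sum(
        1 for c in chars
        if any(lo <= ord(c) <= hi for lo, hi in JAPANESE_BLOCKS)
    )
    return hits / len(chars)
```

The input could be the output of `pdftotext -enc UTF-8 file.pdf -`. For a mostly Japanese page I would expect a value near 1.0, and a much lower value for output like the lines above.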


-----Original Message-----
From: suzuki toshiya <mpsuzuki at hiroshima-u.ac.jp> 
Sent: May 29, 2019 12:25
To: 'poppler at lists.freedesktop.org' <poppler at lists.freedesktop.org>
Cc: Leonard Rosenthol <lrosenth at adobe.com>; Zhong, Steven <Steven.Zhong at fil.com>
Subject: Re: [poppler] How to recognize the Japan Font.

Hi Zhong,

As Leonard pointed out, the fonts are embedded in the document. I have three comments.

* You should probably install the poppler-data package, which includes the mapping tables from Adobe CID (please search Google or Baidu to understand what it is) to character encodings.
* However, your poppler 0.62.0 might be too old to find a matching poppler-data package.
* I suggest upgrading poppler and installing poppler-data.
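
One quick sanity check after installing poppler-data is to look at the encodings that `pdfinfo -listenc` reports. The sketch below parses such a listing and reports which Japanese output encodings are absent. The helper names and the `REQUIRED` set are assumptions made for illustration. It only checks the output encodings, not the CID-to-Unicode tables themselves, so a clean result does not prove that text extraction will work.

```python
# Sketch only: check a `pdfinfo -listenc` listing for Japanese output encodings.
# REQUIRED is an assumed set chosen for illustration.
REQUIRED = {"EUC-JP", "ISO-2022-JP", "Shift-JIS"}


def parse_encodings(listing):
    """Collect encoding names from `pdfinfo -listenc` style output."""
    names = set()
    for line in listing.splitlines():
        name = line.strip()
        # Skip blank lines and the "Available encodings are:" header.
        if not name or name.endswith(":"):
            continue
        names.add(name)
    return names


def missing_japanese(listing):
    """Return the REQUIRED encodings that the listing does not mention."""
    return REQUIRED - parse_encodings(listing)
```

Passing the captured text of `./pdfinfo -listenc` to `missing_japanese` returns an empty set when all three names are listed.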

Regards,
mpsuzuki

On 2019/05/29 13:01, Leonard Rosenthol wrote:
> The font is embedded in the PDF – but that is only for the purposes of rendering.
> 
> Leonard
> 
> From: poppler <poppler-bounces at lists.freedesktop.org> on behalf of 
> "Zhong, Steven" <Steven.Zhong at fil.com>
> Date: Wednesday, May 29, 2019 at 11:58 AM
> To: "poppler at lists.freedesktop.org" <poppler at lists.freedesktop.org>
> Subject: [poppler] How to recognize the Japan Font.
> 
> Hi All,
> 
> I want to convert the PDF at the following link to text:
> https://www.fidelity.jp/static/pdf/fund/5111893-FD30BA/Reports/Monthly/FD30BA-MF-201904.pdf
> 
> But I can't read it correctly. I find that the font is
> MS-PGothic-90ms-RKSJ-H and the encoding is Identity-H.
> 
> The text output looks like the line below. I guess a font is missing. How can I install the font so that the text is read correctly? Many thanks.
> ᅜෆ⥲⏕⏘䠄㻳㻰㻼䠅ᡂ㛗⋡䛜๓ᅄ༙ᮇ䛸ྠỈ‽䛻䛺䜚䚸୰ᅜᬒẼ䛻ᗏධ䜜ឤ䛜ฟጞ䜑䛯䛣䛸䜒㈙䛔Ᏻᚰឤ䛻䛴䛺
> 
> 
> vcap at e0779423-b47e-499c-4c1b-4ecd:~/app/pop/bin$ ./pdfinfo -v
> pdfinfo version 0.62.0
> Copyright 2005-2017 The Poppler Developers - http://poppler.freedesktop.org
> Copyright 1996-2011 Glyph & Cog, LLC
> 
> My poppler is 0.62.0
> vcap at e0779423-b47e-499c-4c1b-4ecd:~/app/pop/bin$ ./pdfinfo -listenc 
> Available encodings are:
> ASCII7
> Big5
> Big5ascii
> EUC-CN
> EUC-JP
> GBK
> ISO-2022-CN
> ISO-2022-JP
> ISO-2022-KR
> ISO-8859-6
> ISO-8859-7
> ISO-8859-8
> ISO-8859-9
> KOI8-R
> Latin1
> Latin2
> Shift-JIS
> Symbol
> TIS-620
> UTF-16
> UTF-8
> Windows-1255
> ZapfDingbats
> 


