[poppler] Font info not getting properly into html when using pdftohtml
Sushant Sinha
sushant354 at gmail.com
Tue Feb 8 07:28:45 PST 2011
I have a PDF document that is a mix of English and Hindi. The Hindi
text uses the Aryan2 font. When I run pdftohtml on this document, I do
not get any font information in the HTML file. When I use the "-xml" or
the "-c" option, the Aryan2 font is still output as Times. So there
seems to be some problem with embedded fonts.
I have attached the PDF for your analysis; pdffonts reports the following:
$ pdffonts 2211.pdf
name                                 type              emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
CFFEEL+TimesNewRoman                 TrueType          yes yes no    1852  0
CFFEGM+TimesNewRoman,Bold            TrueType          yes yes no    1854  0
CFFFEJ+TimesNewRoman,Italic          TrueType          yes yes no      93  0
CFFFHI+SymbolMT                      CID TrueType      yes yes yes     94  0
CFFGDG+Aryan2-Bold                   TrueType          yes yes no      95  0
CFFGEI+Aryan2-Normal                 TrueType          yes yes no      97  0
CFFGEH+Aryan2-Normal                 CID TrueType      yes yes yes     96  0
CFFGII+Tahoma,Bold                   TrueType          yes yes no      98  0
CFFGLJ+Tahoma                        TrueType          yes yes no      99  0
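For anyone who wants to reproduce the check on the "-xml" output, here is
a minimal Python sketch that lists the font families declared in the XML.
It assumes the XML contains <fontspec> elements with a "family" attribute.
The function name font_families and the SAMPLE string are made up for
illustration; SAMPLE is hand-written and is not the real output for
2211.pdf.

```python
import xml.etree.ElementTree as ET


def font_families(xml_text):
    """Return the sorted, de-duplicated font families found in <fontspec> elements.

    Assumption: the XML declares fonts as <fontspec ... family="..."/>.
    """
    root = ET.fromstring(xml_text)
    families = {
        spec.get("family")
        for spec in root.iter("fontspec")
        if spec.get("family")
    }
    return sorted(families)


# Hand-written sample for illustration only (not the actual 2211.pdf output).
SAMPLE = """<pdf2xml>
<page number="1" position="absolute" top="0" left="0" height="1188" width="918">
<fontspec id="0" size="16" family="Times" color="#000000"/>
<fontspec id="1" size="12" family="Times" color="#000000"/>
<text top="100" left="100" width="50" height="16" font="0">hello</text>
</page>
</pdf2xml>"""

if __name__ == "__main__":
    print(font_families(SAMPLE))
```

If a script like this reports only Times for a document whose pdffonts
listing shows Aryan2 and Tahoma, that matches the behaviour described
above.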
Can someone tell me why this is happening?
-Sushant.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 2211.pdf
Type: application/pdf
Size: 406452 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/poppler/attachments/20110208/d24d75be/attachment-0001.pdf>