[poppler] pdftohtml : enhancing it to use embedded fonts

Josh Richardson jric at chegg.com
Mon Jun 6 10:49:54 PDT 2011

I chose OpenType over TrueType based on the contents of this article:

In general, I am assuming that all modern browsers will support OpenType,
and given that it is more capable, it seems like the best starting point.
Do you have reasons we should default to TrueType instead?

I agree with you that the font-extraction process should be done
separately from Poppler, and FontForge is the solution I will try.  I will
set it up so that if the user has the required software, they can use it,
and if not, they will get the current mappings.
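
A minimal sketch of that fallback, assuming the external tool is
detected by looking for a `fontforge` executable on the PATH. The
function name and the two strategy labels are hypothetical, chosen
only for illustration; they are not part of pdftohtml.

```python
import shutil


def choose_font_strategy(tool="fontforge"):
    """Pick how fonts are handled, based on whether `tool` is installed.

    Returns "embedded" when the executable is found on PATH (fonts can be
    extracted and converted), otherwise "mapped" (keep the current font
    mappings). Both labels are illustrative assumptions.
    """
    return "embedded" if shutil.which(tool) else "mapped"


if __name__ == "__main__":
    print(choose_font_strategy())
```

Checking at run time rather than at build time means the same binary
degrades gracefully on machines without the extra software.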

Thanks everybody for the pointers.

Best, --josh

On 6/5/11 6:31 PM, "mpsuzuki at hiroshima-u.ac.jp"
<mpsuzuki at hiroshima-u.ac.jp> wrote:

>Although I don't have enough time to work on this feature,
>I remember there was a similar request a few months ago,
>so I would appreciate it if you would work on it (or could
>find a volunteer to work on it).
>On Sun, 5 Jun 2011 15:41:53 -0700
>Josh Richardson <jric at chegg.com> wrote:
>>This should give us an exact representation of the original font in the
>>PDF, though it will only work with modern browsers, since earlier
>>browsers don't support "@font-face".  For IE, I'll have to convert the
>>font to EOT, and for the others I'll probably use regular OpenType (not
>>TrueType) format.
>Excuse me, why "regular OpenType (not TrueType)"?
>I propose considering WOFF as the first milestone instead
>of EOT, although I think most people are still using
>browsers without WOFF support. The reasons are:
>* The W3C standardization of EOT has been almost stopped since
>  2008, while WOFF is in the process of becoming a W3C
>  Recommendation.
>* WOFF would be a cross-platform solution for newer HTML
>  browsers: IE, Firefox, Chrome, Safari, etc. EOT would
>  not be a cross-platform solution.
>* The patent issue with EOT is not clarified yet. EOT uses a
>  patent owned by Agfa (not Microsoft!) to compress the font,
>  and there is no explicit permission to use it without royalty.
>  It seems that the participants from Agfa were willing to
>  permit this(*), but the EOT standardization process is stopped
>  now, so no explicit permission has been documented yet.
>  (*) See http://www.w3.org/Fonts/Misc/minutes-2008-10 and
>  find the comment by Vladimir saying that Agfa's patents
>  were contributed on a royalty-free basis.
>  I heard that there is some GPLv2 software dealing with EOT,
>  but I'm not sure whether it implements only the free part of
>  the EOT spec or its authors think EOT is already a
>  royalty-free technology.
>>Does anyone see an issue with the approach, or have any advice?  For
>>instance, I'm not sure how much luck I'll have with converting
>>especially Type 3 fonts to OpenType/EOT.
>I don't have a good idea about Type 3 fonts either. A Type 3
>font in PDF is described by PDF instructions, so converting it
>would need something like a PDF renderer with PS Type 1 (or
>TrueType) outline instructions as its output device. FontForge
>can convert SVG graphics to Type 1 or TrueType fonts, but linking
>FontForge with poppler is overkill. Making some SVG pictures and
>a script that converts them to a font with an external FontForge
>could be the first step...
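
The "SVG pictures plus an external FontForge script" idea could be
sketched as below. This only generates the text of a FontForge Python
script; it does not run FontForge. It assumes one SVG file per glyph
has already been written, and the helper name, the code point to file
mapping, and the file names are all hypothetical.

```python
def build_fontforge_script(glyph_svgs, out_path):
    """Return the text of a FontForge Python script.

    glyph_svgs maps a Unicode code point (int) to the path of an SVG file
    holding that glyph's outline; out_path is the font file to generate.
    The emitted script uses FontForge's Python API: fontforge.font(),
    font.createChar(), glyph.importOutlines() and font.generate().
    """
    lines = ["import fontforge", "font = fontforge.font()"]
    for codepoint, svg_path in sorted(glyph_svgs.items()):
        # One glyph slot per code point, filled from its SVG outline.
        lines.append(f"glyph = font.createChar({codepoint})")
        lines.append(f"glyph.importOutlines({svg_path!r})")
    # FontForge chooses the output format from the file extension.
    lines.append(f"font.generate({out_path!r})")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_fontforge_script({65: "A.svg", 66: "B.svg"}, "out.otf"))
```

The generated script would then be handed to an external FontForge,
for example with `fontforge -lang=py -script convert.py`, assuming the
installed FontForge was built with Python scripting. That keeps
FontForge out of the poppler link line entirely.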
