[poppler] pdftohtml : enhancing it to use embedded fonts

Leonard Rosenthol lrosenth at adobe.com
Mon Jun 6 17:14:23 PDT 2011


Josh, you should do a bit more research.

OpenType is a wrapper technology.  There are two types of OpenType fonts - SFNT-based and CFF-based.  SFNT-based are basically TrueType fonts bundled inside of a .otf - in fact, that's what MSFT ships with Windows and they actually mix and match the file extensions (sometimes labeling an OTF as .ttf).  CFF-based OpenType fonts, however, are what Adobe ships in its font library, as they are considered higher quality by designers & publishers.

As such, not all OpenType rendering systems are created equal.  Some browsers (e.g. MSIE) tend to favor the SFNT-based fonts, causing them to look nicer/cleaner in your web pages, while Safari, for example, does a great job with both.  So you need to give this a MUCH deeper look.
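
As a rough illustration of the two flavors, the first four bytes of a font file are usually enough to tell them apart: files carrying TrueType outlines start with the version tag 0x00010000 (or the legacy Apple tag 'true'), while CFF-based OpenType files start with 'OTTO'. The sketch below is only an assumption about how a tool might check this; the function names and the label strings are made up for the example and are not part of poppler.

```python
# Map the 4-byte signature at the start of a font file to a flavor label.
# The labels are illustrative; only the signatures themselves are standard.
SIGNATURES = {
    b"\x00\x01\x00\x00": "OpenType with TrueType outlines",
    b"true": "TrueType (legacy Apple tag)",
    b"OTTO": "OpenType with CFF outlines",
    b"ttcf": "font collection",
    b"wOFF": "WOFF",
}


def font_flavor(header: bytes) -> str:
    """Classify a font from the first bytes of its file."""
    return SIGNATURES.get(header[:4], "unknown")


def font_flavor_of_file(path: str) -> str:
    """Read only the signature from disk and classify it."""
    with open(path, "rb") as f:
        return font_flavor(f.read(4))
```

Note that the file extension plays no part here, which matters because .ttf and .otf are sometimes used interchangeably.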

Leonard

-----Original Message-----
From: poppler-bounces+leonardr=adobe.com at lists.freedesktop.org [mailto:poppler-bounces+leonardr=adobe.com at lists.freedesktop.org] On Behalf Of Josh Richardson
Sent: Monday, June 06, 2011 1:50 PM
To: mpsuzuki at hiroshima-u.ac.jp
Cc: poppler at lists.freedesktop.org
Subject: Re: [poppler] pdftohtml : enhancing it to use embedded fonts

I chose OpenType over TrueType based on the contents of this article:
http://www.brighthub.com/multimedia/publishing/articles/80115.aspx

In general, I am assuming that all modern browsers will support OpenType, and given that it's more capable, it seems like the best starting point.  Do you have reasons we should default to TrueType instead?

I agree with you that the font-extraction process should be done separately from Poppler, and FontForge is the solution I will try.  I will set it up so that if the user has the required software, they can use it, and if not, they will get the current mappings.
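
A minimal sketch of that fallback, assuming the embedded font has already been written to a file by some other step: look for a fontforge executable on the PATH, run it if present, and otherwise report failure so the caller keeps the current mappings. The function name convert_font and its parameters are hypothetical; the command line uses FontForge's native scripting one-liner Open($1); Generate($2).

```python
import shutil
import subprocess


def convert_font(src, dst, which=shutil.which, run=subprocess.run):
    """Convert src to dst with an external FontForge, if one is installed.

    Returns True on success and False when FontForge is missing or fails,
    in which case the caller should fall back to the current font mappings.
    The `which` and `run` parameters exist only to make the sketch testable.
    """
    exe = which("fontforge")
    if exe is None:
        return False  # required software not installed
    # FontForge picks the output format from the extension of dst.
    cmd = [exe, "-lang=ff", "-c", "Open($1); Generate($2)", src, dst]
    result = run(cmd, capture_output=True)
    return result.returncode == 0
```

Keeping FontForge as a separate process means poppler gains no new build dependency, which is the point of doing the extraction outside the library.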

Thanks everybody for the pointers.

Best, --josh

On 6/5/11 6:31 PM, "mpsuzuki at hiroshima-u.ac.jp"
<mpsuzuki at hiroshima-u.ac.jp> wrote:

>Hi,
>
>Although I don't have sufficient time to work on this feature, I remember 
>there was a similar request a few months ago, so I would appreciate it if 
>you worked on it (or found a volunteer to do so).
>
>On Sun, 5 Jun 2011 15:41:53 -0700
>Josh Richardson <jric at chegg.com> wrote:
>>This should give us an exact representation of the original font in 
>>the PDF, though it will only work with modern browsers, since earlier 
>>browsers don't support "@font-face".  For IE, I'll have to convert the 
>>font to EOT, and for the others I'll probably use regular OpenType 
>>(not TrueType) format.
>
>Excuse me, why "regular OpenType (not TrueType)"?
>
>--
>
>I propose to consider WOFF as the first milestone instead of EOT, 
>although I think most people are still using HTML browsers 
>without WOFF support. The reasons are:
>
>* The W3C standardization of EOT has been almost stopped since 2008,
>  but WOFF is in the process of becoming a W3C recommendation.
>
>* WOFF would be a cross-platform solution for newer HTML
>  browsers: IE, Firefox, Chrome, Safari etc. But EOT
>  would not be a cross-platform solution.
>
>* The patent issue of EOT is not clarified yet. EOT uses a
>  patent to compress the font, owned by Agfa (not Microsoft!),
>  and there is no explicit permission to use it without royalty.
>  It seems that the participants from Agfa were willing to
>  permit it(*), but the EOT standardization process is stopped
>  now, so no explicit permission has been documented yet.
>
>  (*) See http://www.w3.org/Fonts/Misc/minutes-2008-10 and find the
>  comment by Vladimir saying that Agfa's patents were contributed on a
>  royalty-free basis.
>
>  I heard that there are some GPLv2 software packages dealing with EOT,
>  but I'm not sure if they implement only the free part of the EOT spec
>  or if they consider EOT to be a royalty-free technology already.
>
>>Does anyone see an issue with the approach, or have any advice?  For 
>>instance, I'm not sure how much luck I'll have with converting 
>>especially Type 3 fonts to OpenType/EOT.
>
>I don't have a good idea about Type 3 fonts either. A Type 3 font in PDF 
>is described by PDF instructions, so converting it would need something 
>like a PDF renderer having PS Type1 (or TrueType outline instructions) 
>as its output device. FontForge can convert SVG graphics to
>Type1 or TrueType fonts, but linking FontForge with poppler is 
>overkill. Making some SVG pictures and a script to convert them to a 
>font by an external FontForge could be the first step...
>
>Regards,
>mpsuzuki
>
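
To make the format discussion in the quoted message concrete, here is a small sketch of how a converter might emit an @font-face rule listing the same font in several formats, with WOFF first. The helper name font_face_rule and the file names in the usage note are hypothetical; the format() hints ("woff", "opentype", "truetype", "embedded-opentype") are the standard CSS ones. It does not attempt the special-case syntax that older IE versions need for EOT.

```python
import os

# File extension -> CSS format() hint.
FORMAT_HINTS = {
    ".woff": "woff",
    ".otf": "opentype",
    ".ttf": "truetype",
    ".eot": "embedded-opentype",
}


def font_face_rule(family, urls):
    """Build an @font-face rule; the browser uses the first source it supports."""
    sources = []
    for url in urls:
        ext = os.path.splitext(url)[1].lower()
        hint = FORMAT_HINTS[ext]  # KeyError for unknown extensions
        sources.append(f'url("{url}") format("{hint}")')
    joined = ",\n       ".join(sources)
    return f'@font-face {{\n  font-family: "{family}";\n  src: {joined};\n}}'
```

For example, font_face_rule("F1", ["f1.woff", "f1.otf"]) yields a rule whose src list offers the WOFF file first and the OpenType file as a fallback.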


