[poppler] pdftohtml : enhancing it to use embedded fonts

Leonard Rosenthol lrosenth at adobe.com
Mon Jun 6 06:44:16 PDT 2011

I can tell you very clearly that, at least in the United States, doing this is a CLEAR violation of font copyright law.  It's even more clear-cut if you do any format conversion: that's VERY MUCH a violation.

As is pointed out elsewhere, there is a flag in TTF/OTF font files that could be used; however, in many cases that flag doesn't survive conversion to PDF.
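
For anyone who wants to check that flag on an extracted font, here is a minimal sketch. It assumes the flag meant here is the fsType embedding-permissions field of the OS/2 table (the thread does not name it, so that is a guess), and that the extracted font is a complete sfnt file with a table directory. The names read_fs_type and describe_fs_type are hypothetical helpers, not part of poppler.

```python
import struct

# Bit meanings of OS/2.fsType as defined in the OpenType specification.
FS_TYPE_BITS = {
    0x0002: "restricted license embedding",
    0x0004: "preview & print embedding",
    0x0008: "editable embedding",
    0x0100: "no subsetting",
    0x0200: "bitmap embedding only",
}


def read_fs_type(data):
    """Return OS/2.fsType from sfnt (TTF/OTF) bytes, or None if it is absent."""
    if len(data) < 12:
        return None
    # numTables is the uint16 at byte offset 4 of the sfnt header.
    (num_tables,) = struct.unpack(">H", data[4:6])
    for i in range(num_tables):
        start = 12 + 16 * i
        record = data[start:start + 16]
        if len(record) < 16:
            return None
        tag, _checksum, offset, _length = struct.unpack(">4sIII", record)
        if tag == b"OS/2":
            # fsType is the fifth 16-bit field of the table, at byte offset 8.
            field = data[offset + 8:offset + 10]
            return struct.unpack(">H", field)[0] if len(field) == 2 else None
    return None


def describe_fs_type(fs_type):
    """Translate an fsType value into a list of human-readable permissions."""
    if fs_type == 0:
        return ["installable embedding"]
    return [name for bit, name in FS_TYPE_BITS.items() if fs_type & bit]
```

A None result only means the OS/2 table is not in the embedded data, which is the "doesn't survive conversion" case; it says nothing about what the font's license permits.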

AND even if you decide to tackle this after all, you still have a LOT of technical hurdles to overcome including (but not limited to):

- Subset fonts in PDF don't work in HTML (CID vs. GID mapping)
- CFF-based OTFs aren't supported by most browsers
- Type 1 fonts aren't supported by most browsers
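
To see which of these cases an extracted font program falls into, one could sniff its leading bytes. This is only a rough sketch: sniff_font_flavor is a hypothetical helper, and it assumes the font bytes have already been pulled out of the PDF and decompressed. Bare CFF data (as stored in some PDFs without an sfnt wrapper) has no magic number and is reported as "unknown" here.

```python
def sniff_font_flavor(data):
    """Guess the flavor of a font program from its first bytes."""
    head = data[:4]
    if head in (b"\x00\x01\x00\x00", b"true"):
        return "truetype"        # sfnt container with TrueType outlines
    if head == b"OTTO":
        return "opentype-cff"    # sfnt container with CFF outlines
    if data[:2] == b"\x80\x01":
        return "type1-pfb"       # binary Type 1, first segment marker
    if data.startswith(b"%!PS-AdobeFont") or data.startswith(b"%!FontType1"):
        return "type1-pfa"       # ASCII Type 1
    return "unknown"
```

Anything other than "truetype" would need conversion before a browser could use it, which is where the points above come in.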


From: poppler-bounces+leonardr=adobe.com at lists.freedesktop.org [mailto:poppler-bounces+leonardr=adobe.com at lists.freedesktop.org] On Behalf Of Josh Richardson
Sent: Sunday, June 05, 2011 3:42 PM
To: poppler at lists.freedesktop.org
Subject: [poppler] pdftohtml : enhancing it to use embedded fonts

The current system appears to try to use system-available fonts as approximations for whatever font is in the PDF.  For pdftohtml, I am considering adding a preferred behavior:

1.  Extract the original font from the PDF
2.  Create a font file for that font
3.  Reference the font file, using "@font-face" in the generated HTML.

This should give us an exact representation of the original font in the PDF, though it will only work with modern browsers, since earlier browsers don't support "@font-face".  For IE, I'll have to convert the font to EOT, and for the others I'll probably use regular OpenType (not TrueType) format.
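
As a rough sketch of step 3, the generated CSS could look like the output of the function below. font_face_rule, the family name, and the file names are all hypothetical, and the code assumes both an .eot and an .otf file have already been written next to the HTML. The "?#iefix" suffix is a commonly used workaround for older IE versions that mis-parse a src list with several entries.

```python
def font_face_rule(family, basename):
    """Build an @font-face rule offering EOT for IE and OpenType for others."""
    return (
        "@font-face {\n"
        f'  font-family: "{family}";\n'
        # Plain EOT source first, for IE versions that ignore format() hints.
        f'  src: url("{basename}.eot");\n'
        # Full list for other browsers; later src declarations override earlier ones.
        f'  src: url("{basename}.eot?#iefix") format("embedded-opentype"),\n'
        f'       url("{basename}.otf") format("opentype");\n'
        "}\n"
    )


if __name__ == "__main__":
    print(font_face_rule("pdf-font-1", "fonts/pdf-font-1"))
```

Text spans in the generated HTML would then refer to the font through the same font-family name.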

If I only use the extracted font to display the original document in its original form, and not to draw additional glyphs in any document, I believe I'll be in compliance with "fair use" and digital copyright rules for the font.

Does anyone see an issue with the approach, or have any advice?  For instance, I'm not sure how much luck I'll have converting fonts, especially Type 3 fonts, to OpenType/EOT.

Thanks, --josh