[poppler] pdftohtml image quality
Josh Richardson
jric at chegg.com
Tue Nov 22 08:42:32 PST 2011
My bad. I forgot that was something I added that hasn't been merged back in yet. I think your options are:
1. Use my version (email me offline if you want it, and I'll send you an invite to my source; it also has other enhancements to pdftohtml, described in the mailing list archives),
2. Change the source of pdftohtml.cc to make the default sampling 96 instead of 72 dpi, or
3. Wait for my changes to get merged back into the main repo. I'm not sure when that's going to be done.
Best, --josh
From: Craig Whitcombe <craig.whitcombe at gmail.com>
Date: Tue, 22 Nov 2011 06:20:20 -0800
To: Josh Richardson <jric at chegg.com>
Cc: "poppler at lists.freedesktop.org" <poppler at lists.freedesktop.org>
Subject: Re: [poppler] pdftohtml image quality
Sorry Josh, but I cannot see this -dpi setting:
pdftohtml.exe -help
pdftohtml version 0.18.0
Copyright 2005-2011 The Poppler Developers - http://poppler.freedesktop.org
Copyright 1999-2003 Gueorgui Ovtcharov and Rainer Dorsch
Copyright 1996-2004 Glyph & Cog, LLC
Usage: pdftohtml [options] <PDF-file> [<html-file> <xml-file>]
-f <int> : first page to convert
-l <int> : last page to convert
-q : don't print any messages or errors
-h : print usage information
-help : print usage information
-p : exchange .pdf links by .html
-c : generate complex document
-s : generate single document that includes all pages
-i : ignore images
-noframes : generate no frames
-stdout : use standard output
-zoom <fp> : zoom the pdf document (default 1.5)
-xml : output for XML post-processing
-hidden : output hidden text
-nomerge : do not merge paragraphs
-enc <string> : output text encoding name
-dev <string> : output device name for Ghostscript (png16m, jpeg etc)
-fmt <string> : image file format for Splash output (png or jpg)
-v : print copyright and version info
-opw <string> : owner password (for encrypted files)
-upw <string> : user password (for encrypted files)
-nodrm : override document DRM settings
Trying to use -dpi 96 anyway just prints the help message above.
Regards,
Craig
On 22 November 2011 06:45, Josh Richardson <jric at chegg.com> wrote:
By default pdftohtml samples the original image at 72 dpi, whereas your browser is probably displaying it at 96 dpi or more. I recommend you try bumping up the -dpi parameter.
--josh
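To make the 72 vs. 96 dpi mismatch concrete, here is a rough Python sketch of the arithmetic. It is not pdftohtml code: the function names and the 6-inch image width are hypothetical, and it assumes the browser stretches the 72 dpi raster to the size a 96 dpi raster would occupy. The only fixed fact used is that one PDF point is 1/72 inch.

```python
POINTS_PER_INCH = 72  # PDF user space: 1 point = 1/72 inch


def rendered_pixels(size_in_points: float, dpi: float) -> int:
    """Pixels produced when a length given in PDF points is rasterized at `dpi`."""
    return round(size_in_points * dpi / POINTS_PER_INCH)


def upscale_factor(render_dpi: float, display_dpi: float) -> float:
    """How much a raster made at `render_dpi` is stretched when shown at `display_dpi`."""
    return display_dpi / render_dpi


# Hypothetical example: an image that is 6 inches (432 points) wide on the page.
width_pt = 6 * POINTS_PER_INCH

sampled = rendered_pixels(width_pt, 72)  # pixels available after 72 dpi sampling
needed = rendered_pixels(width_pt, 96)   # pixels a 96 dpi display would use
factor = upscale_factor(72, 96)          # stretch applied by the browser

print(f"sampled: {sampled}px, needed: {needed}px, stretch: {factor:.2f}x")
```

Under these assumptions the image is stretched by about a third, which is enough to look soft next to the original embedded in the PDF.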
From: Craig Whitcombe <craig.whitcombe at gmail.com>
Date: Sun, 20 Nov 2011 08:02:39 -0800
To: "poppler at lists.freedesktop.org" <poppler at lists.freedesktop.org>
Subject: [poppler] pdftohtml image quality
Hello,
Using pdftohtml -c to create a complex document from a PDF, I find that the generated PNG images are noticeably worse than the originals inside the source PDF.
Is there something that I can do to improve the output quality?
Using version 0.18 with pdftohtml -c somepdf.pdf
Regards,
Craig