[poppler] pdftohtml image quality

Craig Whitcombe craig.whitcombe at gmail.com
Tue Nov 22 09:53:03 PST 2011


I'll take option 3 for now; this is only a minor annoyance.

Cheers,
Craig

On 22 November 2011 17:42, Josh Richardson <jric at chegg.com> wrote:

> My bad.  I forgot that was something I added that hasn't been merged back
> in yet.  I think your options are:
>
>    1. Use my version (email me offline if you want it, and I'll send you
>    an invite to my source. It has other enhancements to pdftohtml as well;
>    read the mailing list archives for more info),
>    2. Change the source of pdftohtml.cc to make the default sampling 96
>    instead of 72 dpi, or
>    3. Wait for my changes to get merged back into the main repo.  I'm not
>    sure when that's going to be done.
>
> Best, --josh
>
> From: Craig Whitcombe <craig.whitcombe at gmail.com>
> Date: Tue, 22 Nov 2011 06:20:20 -0800
> To: Josh Richardson <jric at chegg.com>
> Cc: "poppler at lists.freedesktop.org" <poppler at lists.freedesktop.org>
> Subject: Re: [poppler] pdftohtml image quality
>
> Sorry Josh, but I cannot see this -dpi setting:
>
> pdftohtml.exe -help
>
> pdftohtml version 0.18.0
> Copyright 2005-2011 The Poppler Developers -
> http://poppler.freedesktop.org
> Copyright 1999-2003 Gueorgui Ovtcharov and Rainer Dorsch
> Copyright 1996-2004 Glyph & Cog, LLC
>
> Usage: pdftohtml [options] <PDF-file> [<html-file> <xml-file>]
>   -f <int>          : first page to convert
>   -l <int>          : last page to convert
>   -q                : don't print any messages or errors
>   -h                : print usage information
>   -help             : print usage information
>   -p                : exchange .pdf links by .html
>   -c                : generate complex document
>   -s                : generate single document that includes all pages
>   -i                : ignore images
>   -noframes         : generate no frames
>   -stdout           : use standard output
>   -zoom <fp>        : zoom the pdf document (default 1.5)
>   -xml              : output for XML post-processing
>   -hidden           : output hidden text
>   -nomerge          : do not merge paragraphs
>   -enc <string>     : output text encoding name
>   -dev <string>     : output device name for Ghostscript (png16m, jpeg etc)
>   -fmt <string>     : image file format for Splash output (png or jpg)
>   -v                : print copyright and version info
>   -opw <string>     : owner password (for encrypted files)
>   -upw <string>     : user password (for encrypted files)
>   -nodrm            : override document DRM settings
>
>
> Trying to use -dpi 96 anyway results in the above help message.
>
> Regards,
> Craig
>
> On 22 November 2011 06:45, Josh Richardson <jric at chegg.com> wrote:
>
>> By default pdftohtml is sampling the original image at 72 dpi, whereas
>> your browser is probably displaying it at 96 dpi or more.  I recommend you
>> try bumping up the -dpi parameter.
>>
>> --josh
>>
>> From: Craig Whitcombe <craig.whitcombe at gmail.com>
>> Date: Sun, 20 Nov 2011 08:02:39 -0800
>> To: "poppler at lists.freedesktop.org" <poppler at lists.freedesktop.org>
>> Subject: [poppler] pdftohtml image quality
>>
>> Hello,
>>
>> Using pdftohtml -c to create a complex document from a pdf, I find that
>> the generated png images are not very good when compared to the original
>> inside the source pdf.
>>
>> Is there something that I can do to improve the output quality?
>>
>> I am using version 0.18 with: pdftohtml -c somepdf.pdf
>>
>> Regards,
>> Craig
>>
>
>
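
For reference, the 72 vs 96 dpi difference discussed in this thread comes
down to simple arithmetic. The sketch below is only an illustration and is
not taken from pdftohtml or poppler: the function names are hypothetical,
and the US Letter page size (612 x 792 points) is an assumed example. It
relies on the fact that one PDF point is 1/72 inch.

```python
# Illustrative arithmetic only; these helpers are hypothetical and are
# not part of poppler or pdftohtml.

POINTS_PER_INCH = 72  # one PDF point is 1/72 inch


def pixels_at_dpi(size_pt, dpi):
    """Pixel length of a dimension given in PDF points, rendered at `dpi`."""
    return round(size_pt * dpi / POINTS_PER_INCH)


def upscale_factor(render_dpi, display_dpi):
    """How much an image rendered at render_dpi is stretched at display_dpi."""
    return display_dpi / render_dpi


# Assumed example page: US Letter, 612 x 792 points.
letter_pt = (612, 792)

at_72 = tuple(pixels_at_dpi(d, 72) for d in letter_pt)
at_96 = tuple(pixels_at_dpi(d, 96) for d in letter_pt)

print("72 dpi:", at_72)
print("96 dpi:", at_96)
print("stretch when shown at 96 dpi:", upscale_factor(72, 96))
```

Under these assumptions, an image sampled at 72 dpi has to be stretched by
a factor of 96/72 (about 1.33) when shown at 96 dpi, which would be
consistent with the softer-looking png output described above.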