[poppler] Help with pdftohtml background image resolution

mpsuzuki at hiroshima-u.ac.jp mpsuzuki at hiroshima-u.ac.jp
Fri Aug 13 02:29:21 PDT 2010


Hi,

I'm sorry for joining this discussion so late.
As ChunWei Ho has already filed this issue in Bugzilla,
should I continue the discussion there? If so, please let me know.

I took a quick look at utils/pdftohtml.cc (sorry, this is my
first time reading it) and found that pdftohtml does not
render the images with poppler. The text-based part is
produced by poppler's HtmlOutputDev, but the image parts
are created by running an external Ghostscript.

The resolution appears to be 72 dpi multiplied by the scaling
parameter (set with -zoom, and clamped to the range 0.5 to 3.0):

	double scale=1.5;
	...
	{"-zoom",   argFP,    &scale,         0,
	 "zoom the pdf document (default 1.5)"},
	...
	if (scale>3.0) scale=3.0;
	if (scale<0.5) scale=0.5;
	...
	/*sprintf(buf, "%s -sDEVICE=png16m -dBATCH -dNOPROMPT -dNOPAUSE -r72 -sOutputFile=%s%%03d.png -g%dx%d -q %s", GHOSTSCRIPT, htmlFileName->getCString(), w, h,
	  psFileName->getCString());*/

	GooString *gsCmd = new GooString(GHOSTSCRIPT);
	GooString *tw, *th, *sc;
	gsCmd->append(" -sDEVICE=");
	gsCmd->append(gsDevice);
	gsCmd->append(" -dBATCH -dNOPROMPT -dNOPAUSE -r");
	sc = GooString::fromInt(static_cast<int>(72*scale));
	gsCmd->append(sc);
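
To make the arithmetic concrete, here is a minimal standalone sketch of
what the excerpt above seems to compute. It is not poppler code: the
function names clampScale, ghostscriptDpi and resolutionArg are made up
for illustration, and std::string stands in for GooString. Only the
constants (72, 0.5, 3.0, default 1.5) come from the excerpt.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: mirrors the two "if (scale>3.0)" / "if (scale<0.5)"
// checks in the excerpt by clamping the zoom factor to [0.5, 3.0].
double clampScale(double scale) {
    return std::min(3.0, std::max(0.5, scale));
}

// Hypothetical helper: the value appended after "-r", i.e. 72 dpi times
// the clamped zoom, truncated to int like static_cast<int>(72*scale).
int ghostscriptDpi(double scale) {
    return static_cast<int>(72 * clampScale(scale));
}

// Hypothetical helper: just the "-r<dpi>" fragment of the command line.
std::string resolutionArg(double scale) {
    return "-r" + std::to_string(ghostscriptDpi(scale));
}
```

Under these assumptions the default zoom of 1.5 gives -r108, and no zoom
value can push the resolution outside 36 to 216 dpi.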

So changing this part should make it possible to obtain a
high-resolution background image. Albert, please let me know whether
this is the right direction. I will work on adding a new option, "-r",
to override the default resolution passed to Ghostscript.
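
As a rough sketch of how such an option might behave (purely
hypothetical, nothing like this exists in pdftohtml.cc yet): a
user-supplied resolution takes precedence, and the current zoom-derived
value remains the fallback. The names chooseDpi and buildResolutionArg,
and the convention that a non-positive value means "-r not given", are
assumptions made for this example only.

```cpp
#include <cassert>
#include <string>

// Hypothetical: userDpi <= 0 means the user did not pass "-r", so fall
// back to the existing behaviour of 72 dpi times the zoom factor.
int chooseDpi(int userDpi, double scale) {
    if (userDpi > 0) {
        return userDpi;
    }
    return static_cast<int>(72 * scale);
}

// Hypothetical: builds the " -r<dpi>" fragment that would be appended
// to the Ghostscript command line.
std::string buildResolutionArg(int userDpi, double scale) {
    int dpi = chooseDpi(userDpi, scale);
    assert(dpi > 0);  // Ghostscript needs a positive resolution
    return " -r" + std::to_string(dpi);
}
```

One open point in this design: the commented-out command also passes a
pixel geometry with -g, so any override of the resolution presumably has
to stay consistent with the page size given to Ghostscript.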

I wish I had enough spare time to replace Ghostscript with poppler
for rendering the background image, but at the moment I don't.

Regards,
mpsuzuki

On Fri, 13 Aug 2010 15:30:57 +0800
ChunWei Ho <fuzzybr80 at gmail.com> wrote:

>I also tried poppler-0.5.91 (the earliest that builds for me), but that
>has the same issue. I tried looking into/diffing the code but am not
>seeing an obvious fix/issue there.
>
>I've logged a bug at https://bugs.freedesktop.org/show_bug.cgi?id=29551
>
>It's probably not affecting too many users, but I would appreciate it if
>it could be investigated soon, as it would be great to be able to deploy
>poppler-utils for our purposes.
>
>Thanks.
>
>>>> I've been using pdftohtml (http://pdftohtml.sourceforge.net/) for PDF
>>>> to HTML conversion for my application, and recently tried to upgrade
>>>> it to use poppler-utils.
>>>> I usually invoke it as "pdftohtml -c -noframes [input pdf] [output html]"
>>>> The command-line interface and all is fine, but the images (I understand
>>>> a background image is generated per page) are now really bad. I did a
>>>> check and under the old pdftohtml project, each background image (PNG)
>>>> for a page is 1785x2526 pixels.
>>>>
>>>> Under poppler-utils, each background image (PNG) is 594x843 pixels.
>>>>
>>>> Can someone point me in the right direction to change/fix this? There
>>>> doesn't appear to be a command line parameter for this.
>>>> The new background images are bad to the point of being unusable, which
>>>> is a shame, because I really want to move to poppler-utils for the
>>>> Unicode and continued support.
>>
>>>Which poppler version are you using?
>>
>>>Albert
>>

