[poppler] Help with pdftohtml background image resolution

ChunWei Ho fuzzybr80 at gmail.com
Fri Aug 13 09:24:22 PDT 2010


Hi,

Thanks mpsuzuki, for looking at this. I appreciate your help, and
hopefully the information here adds to it.

You are right that gs is used for the images. The command is generally:
gs -sDEVICE=png16m -dBATCH -dNOPROMPT -dNOPAUSE -r<R>
-sOutputFile="/root/test/ss3%03d.png" -g<X>x<Y> -q "/root/test/ss3.ps"

where <R> = 72 * scale
and <X> and <Y> are almost always 595x842 regardless of the scale you pick.
It's strange: these are the values of htmlOut->getPageWidth() and
htmlOut->getPageHeight() (I don't know where they are set). They are
divided by scale to form w and h, then multiplied back by scale to
form tw and th (which are used in the command line).
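
To make the round trip concrete, here is a small sketch. The function name
and the rounding (round up on the divide, truncate on the multiply) are my
assumptions, not copied from pdftohtml.cc; the point is only that dividing
by scale and multiplying back returns roughly the original page size, so
the -g value barely depends on scale.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the w/h and tw/th computation described above.
// Assumption: w = ceil(pageDimension / scale), tw = (int)(w * scale).
int roundTripDimension(int pageDimension, double scale) {
    assert(scale > 0.0);
    // "Divided by scale to form w (or h)".
    int w = static_cast<int>(std::ceil(pageDimension / scale));
    // "Multiplied back with scale to form tw (or th)".
    return static_cast<int>(w * scale);
}
```

With these assumptions, 595x842 at scale 1.5 comes back as 595x843, which
is still essentially the 72 dpi page size.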

I've tried adjusting those values there, but that appears to distort the
output. I'm not a graphics person, so I have no idea how the -r value
fits in with the -gXxY value.
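
For what it is worth, my reading of the Ghostscript documentation is that
-g gives the canvas size in device pixels and -r the resolution in dots
per inch, with 72 points to the inch. Under that assumption, the sketch
below (helper names are made up for illustration) shows why raising -r
while keeping -g fixed would crop or distort the page rather than sharpen
it.

```cpp
#include <cassert>
#include <cmath>

// PostScript/PDF page units: 72 points per inch.
const double kPointsPerInch = 72.0;

// Pixels needed to render `points` of page length at `dpi`
// (assumed meaning of the -r value), rounded to the nearest pixel.
int pixelsForPoints(double points, double dpi) {
    assert(dpi > 0.0);
    return static_cast<int>(std::lround(points * dpi / kPointsPerInch));
}

// Inverse: how many points of page length fit into a canvas of
// `pixels` (assumed meaning of the -g value) at `dpi`.
double pointsCovered(int pixels, double dpi) {
    assert(dpi > 0.0);
    return pixels * kPointsPerInch / dpi;
}
```

By this arithmetic, a 595x842 point page at -r108 (scale 1.5) would need
roughly 893x1263 pixels, while a 595x842 pixel canvas at that resolution
would cover only about 397x561 points of the page.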

Do you see a way to change the basic htmlOut page width and height
(which appear to be the arbitrary limit here)?

Thanks!



On Fri, Aug 13, 2010 at 5:29 PM,  <mpsuzuki at hiroshima-u.ac.jp> wrote:
> Hi,
>
> I'm sorry for joining this discussion so late.
> As ChunWei Ho has already filed this issue in bugzilla,
> should I discuss it there? If so, please let me know.
>
> Taking a glance at utils/pdftohtml.cc (sorry, this is my
> first look at it), I found that pdftohtml does not
> make the images with poppler. pdftohtml makes the text-based
> part with poppler's HtmlOutputDev, but the image parts
> are created by running an external Ghostscript.
>
> And, the resolution seems to be 72dpi x scaling parameter
> (given by zoom).
>
>     double scale=1.5;
>     ...
>     {"-zoom",   argFP,    &scale,         0,
>      "zoom the pdf document (default 1.5)"},
>     ...
>     if (scale>3.0) scale=3.0;
>     if (scale<0.5) scale=0.5;
>     ...
>     /*sprintf(buf, "%s -sDEVICE=png16m -dBATCH -dNOPROMPT -dNOPAUSE -r72 -sOutputFile=%s%%03d.png -g%dx%d -q %s", GHOSTSCRIPT, htmlFileName->getCString(), w, h,
>       psFileName->getCString());*/
>
>     GooString *gsCmd = new GooString(GHOSTSCRIPT);
>     GooString *tw, *th, *sc;
>     gsCmd->append(" -sDEVICE=");
>     gsCmd->append(gsDevice);
>     gsCmd->append(" -dBATCH -dNOPROMPT -dNOPAUSE -r");
>     sc = GooString::fromInt(static_cast<int>(72*scale));
>     gsCmd->append(sc);
>
> So changing this part may work to obtain a high-resolution
> background image. Albert, please tell me whether this is the
> right direction. I will work on adding a new option "-r" to override
> the default resolution passed to Ghostscript.
>
> I wish I had enough spare time to replace Ghostscript with poppler
> for the background image part, but right now I don't.
>
> Regards,
> mpsuzuki
>
> On Fri, 13 Aug 2010 15:30:57 +0800
> ChunWei Ho <fuzzybr80 at gmail.com> wrote:
>
>>I also tried poppler-0.5.91 (the earliest that builds for me), but that
>>has the same issue. I tried looking into/diffing the code, but I am not
>>seeing an obvious fix/issue there.
>>
>>I've logged a bug at https://bugs.freedesktop.org/show_bug.cgi?id=29551
>>
>>It's probably not affecting too many users, but I would appreciate it
>>if it could be investigated soon, as it would be great to be able to
>>deploy poppler-utils for our purposes.
>>
>>Thanks.
>>
>>>>> I've been using pdftohtml (http://pdftohtml.sourceforge.net/) for PDF
>>>>> to HTML conversion for my application, and recently tried to upgrade
>>>>> it to use poppler-utils.
>>>>> I usually invoke it as "pdftohtml -c -noframes [input pdf] [output html]"
>>>>> The command-line interface and all is fine, but the images (I understand
>>>>> a background image is generated per page) are now really bad. I did a
>>>>> check and under the old pdftohtml project, each background image (PNG)
>>>>> for a page is 1785x2526 resolution.
>>>>>
>>>>> Under poppler-utils, each background image (PNG) is 594x843 resolution.
>>>>>
>>>>> Can someone point me in the right direction to change/fix this? There
>>>>> doesn't appear to be a command line parameter for this.
>>>>> The new background images are bad to the point of being unusable, which
>>>>> is a shame, because I really want to move to poppler-utils for the
>>>>> Unicode support and continued maintenance.
>>>
>>>>Which poppler version are you using?
>>>
>>>>Albert
>>>
>

