[poppler] Getting a raster image like pdf2ppm

Albert Astals Cid aacid at kde.org
Sat May 23 03:08:50 PDT 2009


On Friday, 22 May 2009, you wrote:
> Albert Astals Cid wrote:
> > On Thursday, 21 May 2009, Angus March wrote:
> >> Albert Astals Cid wrote:
> >>> On Thursday, 21 May 2009, Angus March wrote:
> >>>> Albert Astals Cid wrote:
> >>>>> On Wednesday, 20 May 2009, Angus March wrote:
> >>>>>> Albert Astals Cid wrote:
> >>>>>>> On Tuesday, 19 May 2009, Angus March wrote:
> >>>>>>>> Adrian Johnson wrote:
> >>>>>>>>> Angus March wrote:
> >>>>>>>>>> I tried using Poppler to get a Cairo surface and then saving the
> >>>>>>>>>> surface to a PNG. Unfortunately, the resulting image was of
> >>>>>>>>>> disastrously low quality.
> >>>>>>>>>
> >>>>>>>>> Without seeing your code or the output you are getting I can only
> >>>>>>>>> guess at what the problem might be. Did you alter the cairo scale
> >>>>>>>>> to get the desired image dpi?
> >>>>>>>>
> >>>>>>>>     It was definitely an improvement, but I think the only thing
> >>>>>>>> that did improve was the resolution. The old problems that caused
> >>>>>>>> me to abandon Cairo persisted, which are: gradients have ugly
> >>>>>>>> stripes on them, a background that should be white and opaque is
> >>>>>>>> black and transparent, and some text that has a shadow in the PDF
> >>>>>>>> doesn't in the image. I don't suppose you know of a way to deal
> >>>>>>>> w/those problems.
> >>>>>>>
> >>>>>>> ?
> >>>>>>>
> >>>>>>> I don't see anything obviously wrong.
> >>>>>>>
> >>>>>>> Basically it is:
> >>>>>>>  * Create PDFDoc
> >>>>>>>  * Create SplashOutputDev
> >>>>>>>  * Call SplashOutputDev::startDoc
> >>>>>>>  * Call PDFDoc::displayPageSlice
> >>>>>>
> >>>>>>     Well there definitely is something wrong, because it works with
> >>>>>> pdftoppm. I thought of things like the __attribute__((constructor))
> >>>>>> attribute, or static objects, but I don't see any evidence of the
> >>>>>> attribute and I wouldn't know how to find a static object in all
> >>>>>> that code. Maybe multiple processes causes problems for Splash.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> It's hard to know where to go.
> >>>>>
> >>>>> The crashes you pasted are from poppler compiled with -O2? If so,
> >>>>> remove the -O2 and replace -g with -g3. Optimized poppler
> >>>>> backtraces are really misleading.
> >>>>
> >>>>     I figured out a way to get my app to build from the poppler lib I
> >>>> rolled myself (although I'd still like to know what the proper
> >>>> procedure is to get it to build in debug, and install the Splash
> >>>> stuff) and I got some valgrind reports that might be more helpful, but
> >>>> are fewer than those I got when I was using the SUSE distro's lib:
> >>>>
> >>>> ==8577== Conditional jump or move depends on uninitialised value(s)
> >>>> ==8577==    at 0x53DACE4: FoFiType1C::parse() (FoFiType1C.cc:1848)
> >>>> ==8577==    by 0x53E10AB: FoFiType1C::make(char*, int)
> >>>> (FoFiType1C.cc:35)
> >>>> ==8577==    by 0x5369A58: Gfx8BitFont::Gfx8BitFont(XRef*, char*, Ref,
> >>>> GooString*, GfxFontType, Dict*) (GfxFont.cc:699)
> >>>> ==8577==    by 0x536D72C: GfxFont::makeFont(XRef*, char*, Ref, Dict*)
> >>>> (GfxFont.cc:143)
> >>>> ==8577==    by 0x536D933: GfxFontDict::GfxFontDict(XRef*, Ref*, Dict*)
> >>>> (GfxFont.cc:2051)
> >>>> ==8577==    by 0x535AD21: GfxResources::GfxResources(XRef*, Dict*,
> >>>> GfxResources*) (Gfx.cc:313)
> >>>> ==8577==    by 0x535DD6B: Gfx::Gfx(XRef*, OutputDev*, int, Dict*,
> >>>> Catalog*, double, double, PDFRectangle*, PDFRectangle*, int, int
> >>>> (*)(void*), void*) (Gfx.cc:502)
> >>>> ==8577==    by 0x539AF12: Page::createGfx(OutputDev*, double, double,
> >>>> int, int, int, int, int, int, int, int, Catalog*, int (*)(void*),
> >>>> void*, int (*)(Annot*, void*), void*) (Page.cc:404)
> >>>> ==8577==    by 0x539B173: Page::displaySlice(OutputDev*, double,
> >>>> double, int, int, int, int, int, int, int, int, Catalog*, int
> >>>> (*)(void*), void*, int (*)(Annot*, void*), void*) (Page.cc:433)
> >>>> ==8577==    by 0x40A756: pdf2jpg::GetSplash(int) (pdf2jpg.cpp:176)
> >>>> ==8577==    by 0x40A9B5: pdf2jpg::TopupJpegThreads(int, astring
> >>>> const&) (pdf2jpg.cpp:156)
> >>>> ==8577==    by 0x40B3B1: pdf2jpg::Execute(int, char const*, char
> >>>> const*, int) (pdf2jpg.cpp:99)
> >>>> ==8577==
> >>>
> >>> Are you positively sure this doesn't happen with pdftoppm? Doesn't make
> >>> any sense.
> >>
> >>     It doesn't seem to be. I'll try running valgrind on the debug
> >> version of pdftoppm that I have here, and see what that does...
> >>     Well she hasn't reported any problems so far. I'll see tomorrow
> >> morning, then I guess I'll know for sure.
> >>     Also, I keep forgetting to point out that another problem my app has
> >> is with Splash getting stuck in an infinite loop every so often,
> >> requiring a kill -9.
> >>     How about this: I send you a sample of something that causes the
> >> problems. Compile this and run it through valgrind. It came across a few
> >> problems in a short time. BTW, for the sake of simplicity, it doesn't
> >> actually output any files. It just gets the raw image data from Splash.
> >
> > I see the problem, but I also see that
> >
> > pages=`/home/tsdgeos/cvs/poppler/build-new/utils/pdfinfo "$filename" |
> > grep Pages: | cut -c 1,2,3,4,5,6 --complement`
> > for index in $(seq 1 $pages); do
> >   echo -n "$index "
> >   /home/tsdgeos/cvs/poppler/build/utils/pdftoppm -f $index -l $index
> > "$filename" old/foo$index &
> > done
> > wait
> >
> > Does not have this problem, so there must be something that does not get
> > detached on fork?
>
>     I'm afraid I didn't understand any of that. What do you mean you see
> the problem?

No, I don't see it.

> What is it? And what does that script mean? 

This script runs N simultaneous pdftoppm processes (where N is the number of
pages) on the same file, each rendering one page, and it works.
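
For reference, here is a rough C++ sketch of what the script does, assuming a
POSIX system. The helper names `pageCommand` and `runPerPage` are made up for
illustration; only the `-f N -l N` options and the output prefix come from the
script itself.

```cpp
#include <cassert>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Builds the argument list for rendering a single page, mirroring the
// "-f N -l N <file> <prefix>N" call in the script. (Hypothetical helper.)
std::vector<std::string> pageCommand(const std::string& tool,
                                     const std::string& pdf,
                                     const std::string& prefix, int page) {
    assert(page > 0);
    const std::string n = std::to_string(page);
    return {tool, "-f", n, "-l", n, pdf, prefix + n};
}

// Starts one independent process per page, then waits for all of them.
// Returns how many of them exited with status 0. (Hypothetical helper.)
int runPerPage(const std::string& tool, const std::string& pdf,
               const std::string& prefix, int pages) {
    int started = 0;
    for (int page = 1; page <= pages; ++page) {
        std::vector<std::string> args = pageCommand(tool, pdf, prefix, page);
        std::vector<char*> argv;
        for (std::string& a : args) argv.push_back(&a[0]);
        argv.push_back(nullptr);

        pid_t pid = fork();
        if (pid < 0) break;           // could not start more workers
        if (pid == 0) {
            execvp(argv[0], argv.data());
            _exit(127);               // only reached if exec failed
        }
        ++started;
    }

    int ok = 0;
    for (int i = 0; i < started; ++i) {
        int status = 0;
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ++ok;
    }
    return ok;
}
```

Because each worker is replaced by a fresh program via exec, nothing from the
launching process (open documents, caches, output devices) survives into it,
which is the property the script relies on.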

> What's this
> about detaching? Are you saying that forking is the problem?

Because my script does the same thing you [try to] do with your program, but
at a higher level, and it works.

>     It could well be that forking is the problem. Fortunately, I don't
> need to fork at that point between the PDFDoc and the Splash instances.
> Unfortunately, I do need to fork at some point, since this app is a
> daemon. But, I keep the instantiation of PDFDoc and Splash in the same
> process.

What you should do is fork as early as possible, that is, create PDFDoc and
Splash inside the new process, not inside the daemon (I think you already do
that, but better check it again).
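
A minimal sketch of that ordering, assuming a POSIX system. This is not
poppler code: `Renderer` is a hypothetical stand-in for whatever object wraps
PDFDoc and SplashOutputDev, and `renderInChildren` is an invented name. The
only point is that the object is constructed after fork(), inside the worker.

```cpp
#include <cassert>
#include <cstdio>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for the per-process rendering state. It remembers
// which process constructed it.
struct Renderer {
    pid_t owner;
    Renderer() : owner(getpid()) {}
    // Succeeds only when called from the process that created the object.
    bool renderPage(int page) const {
        return page > 0 && owner == getpid();
    }
};

// Forks one worker per page. Each worker builds its own Renderer *after*
// the fork and reports success through its exit status.
// Returns the number of workers that exited successfully.
int renderInChildren(int pages) {
    assert(pages >= 0);
    int started = 0;
    for (int page = 1; page <= pages; ++page) {
        pid_t pid = fork();
        if (pid < 0) {
            std::perror("fork");
            break;
        }
        if (pid == 0) {
            // Worker: no rendering state was inherited from the parent.
            Renderer r;
            _exit(r.renderPage(page) ? 0 : 1);
        }
        ++started;
    }

    int ok = 0;
    for (int i = 0; i < started; ++i) {
        int status = 0;
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ++ok;
    }
    return ok;
}
```

The daemon (parent) never touches a `Renderer` here; it only forks and waits,
so there is no rendering state that could end up shared across the fork.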

Albert

> I've been running delete_this with the following change to main:
> int main(int argc, char *argv[]) {
>     assert(argc == 2);
>     bool bWorker = fork() == 0;
>     if (!bWorker) bWorker = fork() == 0;
>     if (bWorker) {
>         pdf2jpg thing;
>         thing.Execute(argv[1], "page", 1324);
>     }
>     else {
>         verify(::wait(NULL) > 0);
>         verify(::wait(NULL) > 0);
>     }
>
>     return EXIT_SUCCESS;
> }
>
> This way there'll be 2 processes running, but they will have their own
> PDFDocs. I've been running it now for a while, and valgrind hasn't had a
> single issue w/it yet.
>     Why Splash could have an issue w/this is hard to imagine, although
> at this point, I don't really care. Two different processes, running on
> different computers, in different countries even, can affect each other.
> But what Splash might be using that wouldn't be separated by a fork, I
> can only guess. Maybe a pipe, or socket. All that matters is that I can
> do it. Thanks for your input.



