[poppler] poppler really slow when reading some documents

Albert Astals Cid aacid at kde.org
Wed Jan 4 16:19:32 PST 2006


On Tuesday 03 January 2006 22:55, Albert Astals Cid wrote:
> On Tuesday 03 January 2006 22:21, Christian Krause wrote:
> > On 1/3/06, Albert Astals Cid <aacid at kde.org> wrote:
> > > On Tuesday 03 January 2006 18:52, Christian Krause wrote:
> > > > Hi,
> > > >
> > > > > > Sure. No difference. It is too slow to show the document:
> > > > >
> > > > > Not with evince. Can you try with the test applications inside
> > > > > poppler (especially qt4/tests/test-poppler-qt4 )?
> > > >
> > > > 1st test
> > > > qt/test-poppler-qt /tmp/serialata10a.pdf
> > > >
> > > > the first page is displayed within a second, switching to other
> > > > pages is fast, too
> > > >
> > > > 2nd test
> > > > Sorry, but I can't run the Qt4 tests, because I don't have qt4
> > > > installed yet and it is not available for my distribution.
> > > >
> > > > 3rd test
> > > > glib/test-poppler-glib /tmp/serialata10a.pdf
> > > >
> > > > nothing is displayed, 100% CPU usage
> > > >
> > > > > Can you try with KPDF from KDE 3.5?
> > > >
> > > > Yes, it doesn't work. But this is because kdegraphics 3.5.0 has
> > > > only "theoretical" poppler support. Something is linked against
> > > > poppler, but the xpdf code and the goo lib are built as well (even
> > > > if the project is configured --with-poppler) and used (as seen in
> > > > gdb).
> > >
> > > kpdf does not use poppler because our copy of xpdf is somewhat better,
> > > but the fact that we share 99% of the code is real.
> > >
> > > The backtrace you attached seems to imply the problem is an overly
> > > large document table of contents.
> > >
> > > I'll have a look later on.
> >
> > You can look at the first message of this thread, where I explain
> > the problem in detail. The problematic piece of code is IMHO in
> > kdegraphics-3.5.0/kpdf/xpdf/goo/GString.cc.
>
> Well, that is because you don't have any idea of what the problem is. Your
> fix does not address the real problem, only the effects of the real bug:
> the PDF file is malformed, we fail to detect that, and so we try to load
> the whole file while reading one of the TOC titles. FYI, the problematic
> TOC entry is
>
> 2239 0 obj^M<< ^M/Title (6.6.4.3 Sampling jitter specifications relate to
> the relationship between the sampling clock and the data.  Any phase error
> that results in the sample b\)^M/Dest [ 106
>
> The Title "should" end at \)^M, but in fact \) is the escape sequence to
> say "hey, this ) is not the end of the title but a real )", so we keep
> reading until god knows when, making a GooString of humongous size.
> Obviously your patch to "speed up" GooString allocation helps here, but the
> real fix is detecting that the Title really ended.
>
> Does anyone have any idea how to detect that error?

Well, after some more investigation, it seems Acrobat detects that something
is fishy but is not able to recover from it: open the document and try to
navigate to 6.6.4.3 or 6.6.4.4 and you'll see they are not in the TOC (at
least in Acrobat 7.0.1 for Linux).

As an idea to catch (not recover from) that kind of error, I think we could
try something like this:

if we see a \) followed by a space, a \r, a \n or any combination of them,
and then /Dest [, we assume the \) was really a closing ) and return. We lose
the current item but do not spend forever trying to find the end. That may
introduce errors if someone is perverse enough to write a TOC item containing
exactly these characters.
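
A minimal sketch of that check, written in Python purely for illustration
(poppler's lexer is C++ and looks nothing like this; the names
read_literal_string and SUSPICIOUS_TAIL are hypothetical). It scans a PDF
literal string, honours backslash escapes and balanced parentheses, and
applies the heuristic only to a \) seen at the outermost nesting level:

```python
import re

# Hypothetical pattern for the proposal above: one or more of
# space, \r, \n, followed by the literal text "/Dest [".
SUSPICIOUS_TAIL = re.compile(rb"[ \r\n]+/Dest \[")


def read_literal_string(data: bytes, start: int):
    """Scan a PDF literal string that begins at data[start] == b"(".

    Returns (raw_body, end_index, recovered). recovered is True when the
    heuristic treated an escaped "\\)" as the real closing parenthesis.
    Returns None if the string never terminates.
    """
    assert data[start:start + 1] == b"("
    depth = 1
    i = start + 1
    while i < len(data):
        c = data[i:i + 1]
        if c == b"\\":
            # Heuristic: "\)" followed by whitespace and "/Dest [" is
            # assumed to be a mistaken escape, so the string ends here.
            if (data[i + 1:i + 2] == b")" and depth == 1
                    and SUSPICIOUS_TAIL.match(data, i + 2)):
                return data[start + 1:i], i + 2, True
            i += 2  # skip the backslash and the character it escapes
            continue
        if c == b"(":
            depth += 1
        elif c == b")":
            depth -= 1
            if depth == 0:
                return data[start + 1:i], i + 1, False
        i += 1
    return None  # ran off the end without finding a closing ")"
```

A caller that gets recovered == True would drop the outline item rather than
trust its title, which matches the "catch, not recover" intent. A well-formed
title such as (a \) b) is unaffected, because the escaped parenthesis is not
followed by /Dest [.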

What do you think?

Albert

>
> Albert
>
> > Best regards,
> > Christian
> > _______________________________________________
> > poppler mailing list
> > poppler at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/poppler
