[Poppler-bugs] [Bug 63088] New: poppler: file parsing infinite loop encountered with docs containing image masks (sample attached)

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Wed Apr 3 13:23:13 PDT 2013


https://bugs.freedesktop.org/show_bug.cgi?id=63088

          Priority: medium
            Bug ID: 63088
          Assignee: poppler-bugs at lists.freedesktop.org
           Summary: poppler: file parsing infinite loop encountered with
                    docs containing image masks (sample attached)
          Severity: normal
    Classification: Unclassified
                OS: All
          Reporter: ed at moto-research.com
          Hardware: All
            Status: NEW
           Version: unspecified
         Component: general
           Product: poppler

Created attachment 77390
  --> https://bugs.freedesktop.org/attachment.cgi?id=77390&action=edit
Sample document containing Image Mask causing poppler to get stuck in an
infinite loop

We are working on an internal tool that uses poppler for PDF processing and
have encountered a handful of documents that cause the poppler core to enter an
infinite loop. I've looked at a couple of them and the problem appears to be
related to the parsing of image masks. This happens on both Linux and OS X,
linked against poppler 0.22.2.

I've confirmed the bug is in poppler and not our application, as it is also
seen with pdftohtml. Enabling PrintCommands quickly produces output that shows
the problem:

…
re 661.08 456.362 609.48 -104.88
f
cs /Cs6
scn 1 1 1
gs /GS1
  gfx state dict: << /SA false /SM 0.02 /Type /ExtGState >>
re 0 1 1 -1
f
scn 0.8 0.8 0.8
q
cm 1 0 0 -1 0 1
Do /Im1
Q
cs /Cs6
scn 1 1 1
gs /GS1
  gfx state dict: << /SA false /SM 0.02 /Type /ExtGState >>
re 0 1 1 -1
f
scn 0.8 0.8 0.8
q
cm 1 0 0 -1 0 1
Do /Im1
Q
cs /Cs6
scn 1 1 1
gs /GS1
  gfx state dict: << /SA false /SM 0.02 /Type /ExtGState >>
…

If I had to guess, an offset is not being applied, so the same object keeps
getting returned. I realize there is a repeated graphic on the page, but by the
time I killed pdftohtml (< 30s after starting it), around 140k instances of the
PNG had been written to disk, and I'm pretty sure that can't be right :)
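
One way to confirm that a trace like the one above is cycling, rather than
just drawing a repeated graphic a few times, is to check whether its tail is
the same block of lines over and over. The following is a minimal sketch,
assuming the PrintCommands output has been captured as a list of text lines;
the function name and the min_repeats threshold are illustrative and are not
part of poppler.

```python
from typing import Optional, Sequence


def find_repeating_period(lines: Sequence[str],
                          min_repeats: int = 3) -> Optional[int]:
    """Return the shortest period p such that the last min_repeats * p
    lines are one p-line block repeated back to back, or None."""
    n = len(lines)
    for period in range(1, n // min_repeats + 1):
        # Only the tail of the trace is inspected, so a non-repeating
        # prefix (the normal start of the page) does not matter.
        window = lines[n - period * min_repeats:]
        block = window[:period]
        if all(window[i] == block[i % period] for i in range(len(window))):
            return period
    return None
```

A detected period only says the trace repeats; it cannot by itself tell a
genuine infinite loop from a page that legitimately draws the same image many
times, so the repeat count still needs a judgment call.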

I've extracted a single page from one of the documents that shows the issue
and have attached it. Please note that running pdftohtml on the file will
quickly create thousands of small 8x8 PNGs, each about 100 bytes in size.
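
Because the sample never finishes and keeps writing files, it may be safer to
reproduce it under a time limit. The following is a minimal sketch using only
the Python standard library; run_with_timeout is a hypothetical helper, and
the command passed to it is whatever reproduces the hang locally.

```python
import subprocess
from typing import Sequence


def run_with_timeout(cmd: Sequence[str], timeout_s: float) -> bool:
    """Run cmd and return True if it exited within timeout_s seconds.

    If the time limit is exceeded, subprocess.run kills the child
    process before raising TimeoutExpired, and False is returned.
    """
    try:
        subprocess.run(cmd, timeout=timeout_s, check=False)
    except subprocess.TimeoutExpired:
        return False
    return True
```

For example, run_with_timeout(["pdftohtml", "sample.pdf"], 10.0) would stop the
conversion after ten seconds; sample.pdf here is a placeholder for the
attached document. Running it from a scratch copy of the file makes the
generated PNGs easy to delete afterwards.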

There are two similar issues reported, but they date back to 2010 and are
marked resolved, so I'm not confident this is the same problem:

https://bugs.freedesktop.org/show_bug.cgi?id=28784
https://bugs.freedesktop.org/show_bug.cgi?id=28172

In the meantime, I'm tracing through the code to get an understanding, but I'm
very unfamiliar with the Parser/Lexer portion of the poppler core. I hope you
can help; please let me know if there's any other way I can assist.

Thank you.
