<html>
<head>
<base href="https://bugs.freedesktop.org/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Priority</th>
<td>medium
</td>
</tr>
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - poppler: file parsing infinite loop encountered with docs containing image masks (sample attached)"
href="https://bugs.freedesktop.org/show_bug.cgi?id=63088">63088</a>
</td>
</tr>
<tr>
<th>Assignee</th>
<td>poppler-bugs@lists.freedesktop.org
</td>
</tr>
<tr>
<th>Summary</th>
<td>poppler: file parsing infinite loop encountered with docs containing image masks (sample attached)
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Reporter</th>
<td>ed@moto-research.com
</td>
</tr>
<tr>
<th>Hardware</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Version</th>
<td>unspecified
</td>
</tr>
<tr>
<th>Component</th>
<td>general
</td>
</tr>
<tr>
<th>Product</th>
<td>poppler
</td>
</tr></table>
<p>
<div>
<pre>Created <span class=""><a href="attachment.cgi?id=77390" name="attach_77390" title="Sample document containing Image Mask causing poppler to get stuck in an infinite loop">attachment 77390</a> <a href="attachment.cgi?id=77390&action=edit" title="Sample document containing Image Mask causing poppler to get stuck in an infinite loop">[details]</a></span>
Sample document containing Image Mask causing poppler to get stuck in an
infinite loop
We are working on an internal tool that uses poppler for PDF processing and
have encountered a handful of documents that cause the poppler core to enter an
infinite loop. I've looked at a couple of them, and the problem appears to be
related to the parsing of image masks. It happens on both Linux and OS X,
linked against poppler 0.22.2.

I've confirmed the bug is in poppler and not in our application, as it is also
reproducible with pdftohtml. With PrintCommands enabled, the output shows the
problem almost immediately:
…
re 661.08 456.362 609.48 -104.88
f
cs /Cs6
scn 1 1 1
gs /GS1
gfx state dict: << /SA false /SM 0.02 /Type /ExtGState >>
re 0 1 1 -1
f
scn 0.8 0.8 0.8
q
cm 1 0 0 -1 0 1
Do /Im1
Q
cs /Cs6
scn 1 1 1
gs /GS1
gfx state dict: << /SA false /SM 0.02 /Type /ExtGState >>
re 0 1 1 -1
f
scn 0.8 0.8 0.8
q
cm 1 0 0 -1 0 1
Do /Im1
Q
cs /Cs6
scn 1 1 1
gs /GS1
gfx state dict: << /SA false /SM 0.02 /Type /ExtGState >>
…
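
The repetition in the trace above can also be located mechanically. The
following is only a minimal sketch and is not part of poppler; the function
name find_repeating_block is hypothetical, and it assumes the PrintCommands
output has already been split into a list of lines.

```python
def find_repeating_block(lines, min_repeats=2):
    """Return (start, period) for the first block of lines that repeats
    back to back at least min_repeats times, or None if there is none."""
    n = len(lines)
    for start in range(n):
        # Longest block that still fits min_repeats times after start.
        max_period = (n - start) // min_repeats
        for period in range(1, max_period + 1):
            block = lines[start:start + period]
            # Compare the first block with each following copy.
            if all(
                lines[start + k * period:start + (k + 1) * period] == block
                for k in range(1, min_repeats)
            ):
                return start, period
    return None


# Small illustrative input modelled on the operators in the trace.
sample = ["re", "f", "cs", "scn", "Do", "cs", "scn", "Do", "cs"]
print(find_repeating_block(sample))
```

A result of (start, period) means the block of period lines beginning at index
start is immediately followed by an identical copy, which is the pattern seen
around each "Do /Im1" above.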
My guess is that an offset is not being applied, so the same object keeps
being returned. I realize there is a repeated graphic on the page, but by the
time I killed pdftohtml (less than 30 seconds after starting it), around 140k
instances of the PNG had been written to disk, and I'm pretty sure that can't
be right :)

I've extracted a single page from one of the affected documents and attached
it. Please note that running pdftohtml on the file will quickly create
thousands of small 8x8 PNGs of about 100 bytes each.

Two similar issues have been reported, but they date back to 2010 and are
marked resolved, so I'm not confident they are the same problem:
<a class="bz_bug_link
bz_status_RESOLVED bz_closed"
title="RESOLVED FIXED - poppler: stream object /Length attribute parsing infinite loop and stack memory exhaustion"
href="show_bug.cgi?id=28784">https://bugs.freedesktop.org/show_bug.cgi?id=28784</a>
<a class="bz_bug_link
bz_status_RESOLVED bz_closed"
title="RESOLVED FIXED - poppler: xref / XRefStm infinite loop and stack memory exhaustion"
href="show_bug.cgi?id=28172">https://bugs.freedesktop.org/show_bug.cgi?id=28172</a>
In the meantime, I'm tracing through the code to understand the problem, but
I'm very unfamiliar with the Parser/Lexer portion of the poppler core. I hope
you can help; please let me know if there is any other way I can assist.
Thank you.</pre>
</div>
</p>
</body>
</html>