[poppler] Poppler and MOAB-06-01-2007

Jeff Muizelaar jeff at infidigm.net
Thu Jan 11 06:26:30 PST 2007


On Thu, Jan 11, 2007 at 08:45:48AM -0500, Kristian Høgsberg wrote:
> On 1/10/07, Vincent Torri <vtorri at univ-evry.fr> wrote:
> >On Wed, 10 Jan 2007, Albert Astals Cid wrote:
> >
> >> but i'm not going to commit it to poppler since krh opposes completely to use
> >> stl for poppler.
> >
> >what is the reason ?
> 
> It's not about performance, it's not about how well supported stl is
> or isn't.  The thing is that when you contribute to a project you try
> to follow the conventions already established in that project.  If a
> project uses C++ you don't submit a python patch.  If a project uses
> STL, you don't reimplement a set data structure.  If a project doesn't
> use STL you don't just pull it in for a 50 line patch.  It's not
> rocket science, it's not me being an unreasonable maintainer, it's just
> common sense.  Stop being a drama queen.
> 
> If we wanted to use STL there are so many more interesting cases than
> this fix.  For example, all the crappy reimplementations of various
> standard data structures in goo/.  I don't really want to shake up
> poppler in a big way like that, though.  We don't need big rewrites
> unless we actually add new features in the process.
> 
> And the problem in this case is how to prevent looping when we have
> circular references in the page tree.  To prevent this we can do two
> things: the one idea that Albert doesn't like is to just limit the
> depth of the page tree.  We can either choose a fixed limit or use the
> total number of references as a limit.  If there is a page tree chain
> longer than, say, 1000, the document is most certainly malicious.  On
> the other hand, we can easily handle a recursion level of 1000, so we
> can detect it and bail out safely in that case.
> 
> The other idea is to just put a 'visited' bit in each node of the page
> tree.  If we see a page that already has the 'visited' bit set we
> know something is wrong and we can bail out right away.

I looked at this idea first. However, I didn't see how it could fit
easily with the existing code structure. The solution I came up with is
to have a bitmap of size XRef->size(), then for each object we visit,
mark the bit corresponding to that object. If the bit is already set we
know that we have a loop in the tree.

e.g.:

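class GooBitmap {
public:
  GooBitmap(int size);   // one bit per object, all initially clear
  bool testBit(int n);   // true if object n has already been visited
  void setBit(int n);    // mark object n as visited
};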
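
For illustration only, here is a rough, self-contained sketch of how the
bitmap could be filled in and used to catch a loop. Nothing below is
existing poppler code: ToyNode, walkPages, hasLoop and getSize are
made-up names, the toy array stands in for the real page tree, and the
storage is plain calloc/free to stay away from STL.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical fixed-size bitmap, one bit per object number.
class GooBitmap {
public:
  explicit GooBitmap(int sizeA)
      : size(sizeA),
        bits((unsigned char *)calloc((sizeA + 7) / 8, 1)) {}
  ~GooBitmap() { free(bits); }
  bool testBit(int n) const {
    assert(n >= 0 && n < size);
    return (bits[n >> 3] >> (n & 7)) & 1;
  }
  void setBit(int n) {
    assert(n >= 0 && n < size);
    bits[n >> 3] |= (unsigned char)(1 << (n & 7));
  }
  int getSize() const { return size; }
private:
  GooBitmap(const GooBitmap &);             // not copyable
  GooBitmap &operator=(const GooBitmap &);
  int size;
  unsigned char *bits;
};

// Toy stand-in for a page tree node: the object numbers of its kids.
struct ToyNode {
  int nKids;
  int kids[4];
};

// Returns false as soon as an object is reached a second time, or an
// object number is out of range. Reaching a node twice covers both real
// cycles and nodes shared by two parents.
static bool walkPages(const ToyNode *nodes, int num, GooBitmap *seen) {
  if (num < 0 || num >= seen->getSize() || seen->testBit(num))
    return false;
  seen->setBit(num);
  for (int i = 0; i < nodes[num].nKids; ++i)
    if (!walkPages(nodes, nodes[num].kids[i], seen))
      return false;
  return true;
}

// Convenience wrapper: true if walking from object 0 hits a repeat.
static bool hasLoop(const ToyNode *nodes, int nObjects) {
  GooBitmap seen(nObjects);
  return !walkPages(nodes, 0, &seen);
}
```

With a three-object tree where object 0 has kids 1 and 2, hasLoop
returns false; with 0 -> 1 -> 2 -> 0 it returns true once the walk comes
back to object 0. The bitmap costs one bit per xref entry, so memory
stays small even for large documents.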


I might get time to code this up; however, I am pretty busy so I can't
make any promises on the timeframe.  If anyone wants to beat me to it,
go ahead.

-Jeff

