[poppler] [PATCH] Catalog::getNumPages(): validate page count

Albert Astals Cid aacid at kde.org
Sun Sep 20 11:18:12 PDT 2015


On Thursday, 17 September 2015, at 11:49:27, Jason Crain wrote:
> On 2015-09-17 08:57, Leonard Rosenthol wrote:
> > While it is unclear in ISO 32000-1 whether such a PDF is invalid, we
> > made it clear in 32000-2 that you can only have one copy of each page
> > in the Pages tree.  So personally, I wouldn’t waste much time on this
> > particular file.
> > 
> > Leonard
> 
> OK, if it's not allowed by the spec, I have no real objection to the
> object count check.

Pushed.

Cheers,
  Albert

> 
> > On 9/17/15, 1:04 AM, "poppler on behalf of Jason Crain"
> > <poppler-bounces at lists.freedesktop.org on behalf of
> > jason at aquaticape.us> wrote:
> >> On Wed, Sep 16, 2015 at 09:05:58PM -0400, William Bader wrote:
> >>> > > I don't know of a good way to validate the page count. Even
> >>> > > going through the page tree might be hard to do right without
> >>> > > leading to an infinite loop, in addition to being slow.
> >>> > 
> >>> > Catalog::cachePageTree goes over the tree, but I agree doing that
> >>> > to calculate the number of pages can be meh.
> >>> 
> >>> If the number of pages is huge, the PDF might be intentionally
> >>> corrupted to provoke a bug in a particular PDF viewer, and other
> >>> data structures could be subtly corrupted as well. Any scan would
> >>> have to proceed very cautiously.
> >>> 
> >>> If there is a minimum number of objects required for a page, and if
> >>> the total number of objects is easy to find, could poppler
> >>> immediately reject files with (total num objects) / (min objects per
> >>> page) < page count?
> >> 
> >> The document at
> >> https://drive.google.com/open?id=0ByTyiZeyQ4p9cTVBUllNRmI3bmM is what
> >> I'm thinking of.  It has 5 objects and a single page that is listed in
> >> the /Kids array 10 times.  Duplicating the page just means adding it
> >> to the array again and incrementing /Count.  If we want this document
> >> to work then there's really no minimum number of objects required for
> >> a page.  Otherwise, each page would require at least a /Page object.
> >> 
> >> FWIW Adobe Reader shows an error on the document after the first
> >> duplicated page.  Other viewers show it just fine.
> 
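For illustration, the two ideas discussed in the quoted thread (rejecting a
/Count that exceeds what the object count allows, and walking the page tree
without looping forever) can be sketched as below. This is only a rough
sketch and not the patch that was pushed: Node, pageCountPlausible and
countDistinctPages are hypothetical names, objects are modelled as a plain
vector indexed by object number, and none of this is poppler's API.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified stand-in for a parsed page-tree object.
// This is NOT poppler's data model.
struct Node {
    bool isPage = false;    // true for a /Page leaf, false for a /Pages node
    std::vector<int> kids;  // object numbers listed in /Kids
};

// Heuristic from the thread: if every page needs at least
// minObjectsPerPage objects, a declared /Count larger than
// totalObjects / minObjectsPerPage is not plausible.
bool pageCountPlausible(long declaredCount, long totalObjects,
                        long minObjectsPerPage = 1)
{
    if (declaredCount < 0 || totalObjects < 0 || minObjectsPerPage <= 0)
        return false;
    return declaredCount <= totalObjects / minObjectsPerPage;
}

// Counts the distinct /Page leaves reachable from root. Each object is
// visited at most once, so a cyclic /Kids chain cannot loop forever and a
// page listed several times is counted once.
long countDistinctPages(const std::vector<Node> &objects, int root)
{
    std::unordered_set<int> seen;
    std::vector<int> stack{root};
    long pages = 0;

    while (!stack.empty()) {
        const int id = stack.back();
        stack.pop_back();

        // Skip dangling references.
        if (id < 0 || static_cast<std::size_t>(id) >= objects.size())
            continue;
        // Skip anything already visited (duplicates and cycles).
        if (!seen.insert(id).second)
            continue;

        const Node &node = objects[id];
        if (node.isPage) {
            ++pages;
            continue;
        }
        for (int kid : node.kids)
            stack.push_back(kid);
    }
    return pages;
}
```

With the sample document described above (5 objects, one page listed 10
times, /Count 10), pageCountPlausible(10, 5) returns false, so the file
would be rejected by the object count check, while countDistinctPages
reports a single page. The traversal is linear in the number of objects,
which is the cost the thread was worried about for very large files.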