[poppler] Retrieve all objects from a PDF file

Nedim Srndic nedim.sh at gmail.com
Fri Nov 25 00:18:18 PST 2011


On Thu, 2011-11-24 at 18:48 +0100, Albert Astals Cid wrote:
> On Thursday, 24 November 2011, at 09:46:51, you wrote:
> > On Fri, 2011-11-04 at 13:15 +0100, Albert Astals Cid wrote:
> > > On Friday, 4 November 2011, Nedim Srndic wrote:
> > > > What is the preferred way to retrieve an indirect object from an
> > > > object stream?
> > > 
> > > ObjectStream::getObject ?
> > 
> > But the ObjectStream class is not publicly accessible.
> > 
> > If I know that there is an object with number X in an Object Stream, and
> > the XRef returns null when I query for it, is that a bug? If not, how
> > can I get it?
> 
> What about debugging the code? This is a developers list, so if you face
> something you think is a bug, you debug it, and if you do not have the
> knowledge to debug it, you file a bug and give a PDF to test. But saying "XRef
> returns null when I query for it" is not enough: you don't say which code you
> use, you don't give a PDF, so what do you expect us to do?
> 
> Albert

Yes, this is a developers list, but I didn't find a users mailing list
and did not want to use IRC because somebody may have the same question
later. I did not want to file a bug because most projects encourage
users to first discuss their problem before submitting a bug report. I
did say which code I use and I described the problem as well (and as
briefly) as I could in the very first email, sent almost one month ago.
I will interpret your answer as an invitation to submit a bug report. I
hope Poppler gets useful documentation, typical usage examples and more
manpower in the future. 

Greetings, 
Nedim

> 
> > 
> > Greetings,
> > Nedim
> > 
> > > > Is it possible that I have found a bug? This is really important
> > > > for me.
> > > 
> > > Albert
> > > 
> > > > Nedim
> > > > 
> > > > On Wed, 2011-11-02 at 14:09 +0100, Nedim Srndic wrote:
> > > > > I tried out Poppler 13 from Ubuntu 11.10 and I get the same
> > > > > results. As far as I understand, if I look for an object in
> > > > > XRef using fetch(), and that object is in an object stream, the
> > > > > XRef then uncompresses the object and returns it to me, so that
> > > > > I don't even know that it was compressed in the first place? If
> > > > > things don't work this way, what approach should I take?
> > > > > 
> > > > > > That being said, I tried this approach with both Poppler 7 and 13
> > > > > > and two PDF files with object streams. When I do an XRef->fetch()
> > > > > > with generation number 0 and object number of an object in the
> > > > > > object stream, I get a null object for all objects except the first
> > > > > > one that is packed in the object stream. The first one isn't
> > > > > > extracted fully. Is this a known issue?
> > > > > 
> > > > > Nedim
> > > > > 
> > > > > On Mon, 2011-10-31 at 11:12 -0700, Josh Richardson wrote:
> > > > > > > What kinds of objects are you interested in?  I have a version of
> > > > > > > pdftohtml which I believe is not yet merged into the master repo
> > > > > > > that extracts images and fonts.
> > > > > > 
> > > > > > --josh
> > > > > > 
> > > > > > On 10/31/11 9:16 AM, "Nedim Srndic" <nedim.sh at gmail.com> wrote:
> > > > > > >Dear list,
> > > > > > >
> > > > > > > >I am using the Poppler library (in the src/poppler folder, no
> > > > > > > >bindings, version 7 from the Ubuntu 10.10 repos) and would like
> > > > > > > >to retrieve all objects from a PDF file. Currently, I am running
> > > > > > > >a loop on XRef and getting all the non-null objects from it, but
> > > > > > > >it doesn't seem to retrieve objects from object streams. What
> > > > > > > >solution would you propose for this problem?
> > > > > > >
> > > > > > >Thanks,
> > > > > > >Nedim Srndic
> > > > > > >
> > > > > > >_______________________________________________
> > > > > > >poppler mailing list
> > > > > > >poppler at lists.freedesktop.org
> > > > > > >http://lists.freedesktop.org/mailman/listinfo/poppler
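
For readers following the object-stream question in the quoted thread: per the
PDF specification, the decoded payload of an object stream begins with N pairs
of integers (object number, byte offset), and the objects themselves start at
the /First offset; objects stored this way always have generation number 0.
The following is only a minimal sketch of that layout, assuming the stream has
already been decompressed. splitObjectStream is a hypothetical helper written
for illustration; it is not a Poppler function and does not show how XRef or
ObjectStream work internally.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, NOT part of Poppler: splits the already-decoded payload
// of a PDF object stream into its objects, keyed by object number.
// `n` and `first` are assumed to be the /N and /First entries of the
// stream dictionary.
std::map<int, std::string> splitObjectStream(const std::string &decoded,
                                             int n, std::size_t first)
{
    std::map<int, std::string> objects;
    if (n <= 0 || first > decoded.size())
        return objects;

    // The header holds n pairs "objNum offset", separated by white space.
    std::istringstream header(decoded.substr(0, first));
    std::vector<std::pair<int, std::size_t>> index;
    for (int i = 0; i < n; ++i) {
        int objNum = 0;
        std::size_t offset = 0;
        if (!(header >> objNum >> offset))
            return {}; // malformed header
        index.emplace_back(objNum, offset);
    }

    // Offsets are relative to `first`; each object runs up to the start of
    // the next one (or to the end of the payload for the last object).
    for (std::size_t i = 0; i < index.size(); ++i) {
        const std::size_t begin = first + index[i].second;
        const std::size_t end = (i + 1 < index.size())
                                    ? first + index[i + 1].second
                                    : decoded.size();
        if (begin > end || end > decoded.size())
            return {}; // offsets out of range or not increasing
        std::string body = decoded.substr(begin, end - begin);
        // Drop the white space that separates this object from the next.
        while (!body.empty() &&
               std::isspace(static_cast<unsigned char>(body.back())))
            body.pop_back();
        objects[index[i].first] = body;
    }
    return objects;
}
```

With a payload such as "11 0 12 9 <</A 1>> [1 2 3]", N = 2 and First = 10, the
sketch yields one entry for object 11 and one for object 12. A lookup that only
returns the first object of a stream would be consistent with the index pairs
not being followed for the later objects, but that is a guess and would need
to be checked against the actual Poppler code and a test PDF.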



