[poppler] Reading Meta Information from PDF

Brad Hards bradh at frogmouth.net
Sat Apr 10 18:42:48 PDT 2010


On Thursday 08 April 2010 08:35:15 pm Mathieu Malaterre wrote:
>   This is slightly off topic to poppler. I am looking for a way to read
> the Meta Information of a PDF file (basically the output of pdfinfo).
This isn't a lot of context to work with, so I'm guessing what might work for 
you.
> I find it a little bit cumbersome to integrate poppler (license issue,
> no real need for a full rendering PDF library). Could someone suggest
> another solution for reading this Meta Information from PDF files?
If you don't want to use poppler / pdfinfo, you could buy the Adobe libraries, 
or you could try pdftk. PoDoFo may also be a possibility. 
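
For the pdftk route, here is a rough sketch of reading the document
information through its dump_data operation. It assumes pdftk is installed
and on the PATH, and that the report uses the usual "InfoKey:" /
"InfoValue:" line pairs. The function names are made up for illustration;
only the parsing step is independent of pdftk itself.

```python
import subprocess


def parse_dump_data(text):
    """Collect InfoKey/InfoValue pairs from a pdftk dump_data report."""
    info = {}
    key = None
    for line in text.splitlines():
        # Split on the first colon only, so values may contain colons.
        name, _, value = line.partition(":")
        value = value.strip()
        if name == "InfoKey":
            key = value
        elif name == "InfoValue" and key is not None:
            info[key] = value
            key = None
        # Other lines (InfoBegin, NumberOfPages, ...) are ignored here.
    return info


def read_info_with_pdftk(path):
    """Run `pdftk <path> dump_data` and parse its output (needs pdftk)."""
    result = subprocess.run(
        ["pdftk", path, "dump_data"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_dump_data(result.stdout)
```

This only covers the Info dictionary entries that pdftk reports, not the
XMP metadata stream.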

>   Will a simple regex (such as: "<rdf:RDF.*</rdf:RDF>") work?
I do not think this will work in general. It might work for all the PDF files 
you care about though. Read the PDF specification (Section 10.2.2 or 
thereabouts) for information on the metadata stream(s).
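
To make the regex idea concrete, here is a minimal sketch of scanning the
raw file bytes for RDF blocks. The function names are hypothetical. It uses
a non-greedy match so that a file with several metadata streams yields each
block separately. It finds nothing when the metadata stream is compressed
or the document is encrypted, and it never sees metadata that is stored
only in the document information dictionary.

```python
import re

# DOTALL lets "." span newlines; ".*?" stops at the first closing tag.
RDF_PATTERN = re.compile(rb"<rdf:RDF\b.*?</rdf:RDF>", re.DOTALL)


def find_rdf_blocks(data):
    """Return every <rdf:RDF>...</rdf:RDF> block found in raw bytes."""
    return RDF_PATTERN.findall(data)


def find_rdf_blocks_in_file(path):
    """Read a file in binary mode and scan it for RDF blocks."""
    with open(path, "rb") as f:
        return find_rdf_blocks(f.read())
```

An empty result does not mean the file has no metadata, only that none was
stored as plain, uncompressed text.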

Brad

