[poppler] Extract pdf
amitcs06 at gmail.com
Thu Jan 28 05:11:43 PST 2010
Ah, good. So is there any way we can get this optional information?
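For documents that lack the optional structure elements, the "GUESS" mentioned in the reply quoted below could look roughly like the following. This is only a minimal sketch under stated assumptions, not anything poppler provides: it assumes the page text has already been dumped to plain text (for example with poppler's pdftotext utility), that headings are numbered like "2.1 Title", and that captions start with "Figure", "Fig." or "Table". The function name and both patterns are hypothetical, and unnumbered headings or numbered body lines will be missed or misreported.

```python
import re

# Hypothetical patterns (assumptions, not part of poppler or the PDF spec):
# numbered headings such as "2.1 Title", captions such as "Figure 3: ...".
HEADING_RE = re.compile(r"^(\d+(?:\.\d+)*)\.?\s+([A-Z].*)$")
CAPTION_RE = re.compile(r"^(Figure|Fig\.|Table)\s+(\d+)\s*[:.]\s*(.+)$")


def guess_structure(text):
    """Guess headings and captions from plain text extracted from a PDF.

    Returns (headings, captions), where headings holds
    (depth, number, title) tuples and captions holds
    (kind, number, title) tuples, both in document order.
    """
    headings, captions = [], []
    for raw in text.splitlines():
        line = raw.strip()
        # Check captions first so "Table 2. Results" is not taken as body text.
        m = CAPTION_RE.match(line)
        if m:
            captions.append((m.group(1), int(m.group(2)), m.group(3)))
            continue
        m = HEADING_RE.match(line)
        if m:
            number = m.group(1)
            # "1" is depth 1 (chapter), "1.1" depth 2 (section), and so on.
            depth = number.count(".") + 1
            headings.append((depth, number, m.group(2)))
    return headings, captions


if __name__ == "__main__":
    sample = (
        "1 Introduction\n"
        "Some body text here.\n"
        "1.1 Background\n"
        "Figure 1: System overview\n"
    )
    for item in guess_structure(sample):
        print(item)
```

Because this only looks at the text of each line, it is a guess in exactly the sense described below; font size and position, which the text dump discards, would be needed to do better.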
On Thu, Jan 28, 2010 at 6:19 PM, Leonard Rosenthol <lrosenth at adobe.com> wrote:
> PDF DOES support rich semantic structure including all of the things listed
> below (ISO 32000-1:2008, 14.7, 14.8 and 14.9). HOWEVER, it is optional and
> therefore many PDF documents do not contain the necessary elements. And,
> as pointed out, without the presence of such elements already in the PDF -
> the best you can do is GUESS.
> -----Original Message-----
> From: poppler-bounces at lists.freedesktop.org [mailto:
> poppler-bounces at lists.freedesktop.org] On Behalf Of
> mpsuzuki at hiroshima-u.ac.jp
> Sent: Thursday, January 28, 2010 7:04 AM
> To: amit aggarwal
> Cc: poppler at lists.freedesktop.org
> Subject: Re: [poppler] Extract pdf
> I think PDF is a page description language and defines
> nothing for semantic structure, such as how to store the titles
> of sections, subsections, figures and tables. Therefore, I
> guess, poppler cannot extract them, because PDF does not have them.
> Is there any reliable framework that defines such structure and
> that your target documents follow?
> On Thu, 28 Jan 2010 17:23:17 +0530
> amit aggarwal <amitcs06 at gmail.com> wrote:
> >Hi All,
> >I want to extract the following information from a PDF:
> >1) All chapter, section and subsection titles,
> >2) Names of the figures and tables
> >Can anyone please help me with this?
> >Amit Aggarwal