[poppler] Extract pdf

mpsuzuki at hiroshima-u.ac.jp
Thu Jan 28 04:03:43 PST 2010


Hi,

I think PDF is a page description language and defines
nothing for semantic structure, such as how to store the
titles of sections, subsections, figures and tables.
Therefore, I guess, poppler cannot extract them, because
the PDF does not have them.
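
For illustration only, here is a rough Python sketch of the
kind of guessing a tool would have to do when it only has
layout information. The (font size, text) pairs are made-up
sample data, and guess_structure is a hypothetical helper,
not a poppler API. It assumes that headings use a larger font
than the body text and that captions start with "Figure N" or
"Table N", which many documents will not follow.

```python
import re
from collections import Counter

# Hypothetical input: (font_size, text) pairs, one per text line, as some
# PDF text extractor might report them. The values are invented.
lines = [
    (18.0, "1 Introduction"),
    (10.0, "PDF describes how a page looks."),
    (10.0, "It does not say what a title is."),
    (14.0, "1.1 Background"),
    (10.0, "Some body text follows here."),
    (9.0, "Figure 1: Page layout"),
    (10.0, "More body text."),
    (9.0, "Table 1: Font sizes"),
]


def guess_structure(lines):
    """Guess headings and captions from layout hints only (a heuristic)."""
    # Assumption: the most frequent font size is the body text size.
    body_size = Counter(size for size, _ in lines).most_common(1)[0][0]
    # Assumption: anything set larger than the body text is a heading.
    headings = [text for size, text in lines if size > body_size]
    # Assumption: captions begin with "Figure N" or "Table N".
    captions = [text for _, text in lines
                if re.match(r"(Figure|Table)\s+\d+", text)]
    return headings, captions


headings, captions = guess_structure(lines)
print(headings)
print(captions)
```

Such a guess depends entirely on the conventions of each
document, so I would not call it reliable.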

Is there any reliable framework that defines such structure
and that your target documents follow?

Regards,
mpsuzuki

On Thu, 28 Jan 2010 17:23:17 +0530
amit aggarwal <amitcs06 at gmail.com> wrote:

>Hi All,
>
>I want to extract the following information from a PDF:
>1) All chapter, section and subsection titles,
>2) Names of the figures and tables
>
>Can anyone please help me with this?
>
>-- 
>Thanks
>Amit Aggarwal
>

