<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:p="urn:schemas-microsoft-com:office:powerpoint" xmlns:a="urn:schemas-microsoft-com:office:access" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882" xmlns:rs="urn:schemas-microsoft-com:rowset" xmlns:z="#RowsetSchema" xmlns:b="urn:schemas-microsoft-com:office:publisher" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:c="urn:schemas-microsoft-com:office:component:spreadsheet" xmlns:odc="urn:schemas-microsoft-com:office:odc" xmlns:oa="urn:schemas-microsoft-com:office:activation" xmlns:html="http://www.w3.org/TR/REC-html40" xmlns:q="http://schemas.xmlsoap.org/soap/envelope/" xmlns:rtc="http://microsoft.com/officenet/conferencing" xmlns:D="DAV:" xmlns:Repl="http://schemas.microsoft.com/repl/" xmlns:mt="http://schemas.microsoft.com/sharepoint/soap/meetings/" xmlns:x2="http://schemas.microsoft.com/office/excel/2003/xml" xmlns:ppda="http://www.passport.com/NameSpace.xsd" xmlns:ois="http://schemas.microsoft.com/sharepoint/soap/ois/" xmlns:dir="http://schemas.microsoft.com/sharepoint/soap/directory/" xmlns:ds="http://www.w3.org/2000/09/xmldsig#" xmlns:dsp="http://schemas.microsoft.com/sharepoint/dsp" xmlns:udc="http://schemas.microsoft.com/data/udc" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:sub="http://schemas.microsoft.com/sharepoint/soap/2002/1/alerts/" xmlns:ec="http://www.w3.org/2001/04/xmlenc#" xmlns:sp="http://schemas.microsoft.com/sharepoint/" xmlns:sps="http://schemas.microsoft.com/sharepoint/soap/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:udcs="http://schemas.microsoft.com/data/udc/soap" xmlns:udcxf="http://schemas.microsoft.com/data/udc/xmlfile" xmlns:udcp2p="http://schemas.microsoft.com/data/udc/parttopart" xmlns:wf="http://schemas.microsoft.com/sharepoint/soap/workflow/" 
xmlns:dsss="http://schemas.microsoft.com/office/2006/digsig-setup" xmlns:dssi="http://schemas.microsoft.com/office/2006/digsig" xmlns:mdssi="http://schemas.openxmlformats.org/package/2006/digital-signature" xmlns:mver="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns:mrels="http://schemas.openxmlformats.org/package/2006/relationships" xmlns:spwp="http://microsoft.com/sharepoint/webpartpages" xmlns:ex12t="http://schemas.microsoft.com/exchange/services/2006/types" xmlns:ex12m="http://schemas.microsoft.com/exchange/services/2006/messages" xmlns:pptsl="http://schemas.microsoft.com/sharepoint/soap/SlideLibrary/" xmlns:spsl="http://microsoft.com/webservices/SharePointPortalServer/PublishedLinksService" xmlns:Z="urn:schemas-microsoft-com:" xmlns:st="" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii">
<meta name=Generator content="Microsoft Word 12 (filtered medium)">
<style>
<!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Tahoma;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
span.EmailStyle17
        {mso-style-type:personal-reply;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;}
@page Section1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.Section1
        {page:Section1;}
-->
</style>
</head>
<body lang=EN-US link=blue vlink=purple>
<div class=Section1>
<p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>Poppler doesn’t expose the “low level” APIs that you would need
to get access to these structure elements, if present. You’ll need to get down
to the original Xpdf Object classes, and to use them properly you will also need
an in-depth understanding of the PDF format and the relevant sections of the
documentation.<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>But yes, it is possible.<o:p></o:p></span></p>
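<p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>A minimal sketch of what such a walk involves, in Python. It
assumes the structure tree (ISO 32000-1, 14.7) has already been read out of the
PDF into plain dicts keyed by S (structure type) and K (kids), with text leaves
as plain strings; in a real file the leaves are marked-content references that
still have to be resolved against the page content. The names collect,
element_text and EXAMPLE_TREE are hypothetical, and none of this is Poppler or
Xpdf API.</span></p>
<pre>
```python
# Sketch only: the structure tree is modelled as plain dicts, not read from a PDF.
# "S" is the structure type and "K" the kids, as in ISO 32000-1, 14.7.2.
# Text leaves are plain strings here, which is a simplifying assumption.

HEADING_TYPES = {"H", "H1", "H2", "H3", "H4", "H5", "H6"}  # standard types, 14.8.4


def kids_of(elem):
    """Return the kids of an element as a list (K may hold one kid or many)."""
    kids = elem.get("K", [])
    return kids if isinstance(kids, list) else [kids]


def element_text(elem):
    """Concatenate all text leaves found under a structure element."""
    if isinstance(elem, str):
        return elem
    return "".join(element_text(kid) for kid in kids_of(elem))


def collect(elem, found=None):
    """Walk the tree depth-first, gathering heading and caption text."""
    if found is None:
        found = {"headings": [], "captions": []}
    if isinstance(elem, str):
        return found
    stype = elem.get("S")
    if stype in HEADING_TYPES:
        found["headings"].append((stype, element_text(elem)))
    elif stype == "Caption":
        found["captions"].append(element_text(elem))
    for kid in kids_of(elem):
        collect(kid, found)
    return found


# Hypothetical tagged document, invented for illustration.
EXAMPLE_TREE = {
    "S": "Document",
    "K": [
        {"S": "H1", "K": ["1 Introduction"]},
        {"S": "P", "K": ["Body text."]},
        {"S": "Sect", "K": [
            {"S": "H2", "K": ["1.1 Background"]},
            {"S": "Figure", "K": [{"S": "Caption", "K": ["Figure 1: Overview"]}]},
            {"S": "Table", "K": {"S": "Caption", "K": "Table 1: Results"}},
        ]},
    ],
}

if __name__ == "__main__":
    print(collect(EXAMPLE_TREE))
```
</pre>
<p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>If a document carries no structure tree there is nothing to
walk, and the titles can only be guessed from the layout.</span></p>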
<p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>Leonard<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'><o:p> </o:p></span></p>
<div style='border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in'>
<p class=MsoNormal><b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'>From:</span></b><span
style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'> amit aggarwal
[mailto:amitcs06@gmail.com] <br>
<b>Sent:</b> Thursday, January 28, 2010 8:12 AM<br>
<b>To:</b> Leonard Rosenthol<br>
<b>Cc:</b> mpsuzuki@hiroshima-u.ac.jp; poppler@lists.freedesktop.org<br>
<b>Subject:</b> Re: [poppler] Extract pdf<o:p></o:p></span></p>
</div>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal style='margin-bottom:12.0pt'><br>
Ah, good. So is there any way we can get this optional info? <o:p></o:p></p>
<div>
<p class=MsoNormal>On Thu, Jan 28, 2010 at 6:19 PM, Leonard Rosenthol &lt;<a
href="mailto:lrosenth@adobe.com">lrosenth@adobe.com</a>&gt; wrote:<o:p></o:p></p>
<p class=MsoNormal>PDF DOES support rich semantic structure, including all of
the things listed below (ISO 32000-1:2008, 14.7, 14.8 and 14.9). HOWEVER, it is
optional and therefore many PDF documents do not contain the necessary
elements. And, as pointed out, without such elements already present in the
PDF, the best you can do is GUESS.<o:p></o:p></p>
<div>
<div>
<p class=MsoNormal><br>
-----Original Message-----<br>
From: <a href="mailto:poppler-bounces@lists.freedesktop.org">poppler-bounces@lists.freedesktop.org</a>
[mailto:<a href="mailto:poppler-bounces@lists.freedesktop.org">poppler-bounces@lists.freedesktop.org</a>]
On Behalf Of <a href="mailto:mpsuzuki@hiroshima-u.ac.jp">mpsuzuki@hiroshima-u.ac.jp</a><br>
Sent: Thursday, January 28, 2010 7:04 AM<br>
To: amit aggarwal<br>
Cc: <a href="mailto:poppler@lists.freedesktop.org">poppler@lists.freedesktop.org</a><br>
Subject: Re: [poppler] Extract pdf<br>
<br>
Hi,<br>
<br>
I think PDF is a page description language and defines<br>
nothing for semantic structure, such as how to store the titles<br>
of sections, subsections, figures and tables. Therefore, I<br>
guess, poppler cannot extract them, because PDF does not have them.<br>
<br>
Is there any reliable framework that defines such structure and that<br>
your target documents follow?<br>
<br>
Regards,<br>
mpsuzuki<br>
<br>
On Thu, 28 Jan 2010 17:23:17 +0530<br>
amit aggarwal &lt;<a href="mailto:amitcs06@gmail.com">amitcs06@gmail.com</a>&gt;
wrote:<br>
<br>
>Hi All,<br>
><br>
>I want to extract the following information from a PDF:<br>
>1) All chapter, section and subsection titles,<br>
>2) Names of the figures and tables<br>
><br>
>Can anyone please help me with this?<br>
><br>
>--<br>
>Thanks<br>
>Amit Aggarwal<br>
><o:p></o:p></p>
</div>
</div>
<p class=MsoNormal>_______________________________________________<br>
poppler mailing list<br>
<a href="mailto:poppler@lists.freedesktop.org">poppler@lists.freedesktop.org</a><br>
<a href="http://lists.freedesktop.org/mailman/listinfo/poppler" target="_blank">http://lists.freedesktop.org/mailman/listinfo/poppler</a><o:p></o:p></p>
</div>
<p class=MsoNormal style='margin-bottom:12.0pt'><br>
<br clear=all>
<br>
-- <br>
Thanks<br>
Amit Aggarwal<o:p></o:p></p>
</div>
</body>
</html>