[poppler] Extract title from pdf file.

Leonard Rosenthol lrosenth at adobe.com
Wed Nov 9 06:59:01 PST 2011


On 11/9/11 1:26 AM, "Alec Taylor" <alec.taylor6 at gmail.com> wrote:

>The easiest way I can think of is to grab it from the headers and footers.
>
>I am about to submit a patch (any day now) which separates the headers
>and footers into their own tags, which you can then access from
>pdftohtml -xml.

Are you also submitting patches to read & process any tags & structure in
the PDF?  If the PDF is already tagged, then any headers/footers will
already be identified in its structure.  You should use that tagging
whenever it is present.
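
As a rough illustration of checking for tagging before falling back to
header/footer guessing, here is a minimal Python sketch. It assumes
poppler's pdfinfo is on the PATH and that its output includes "Title:" and
"Tagged:" lines; the helper names parse_pdfinfo and title_and_tagged are
made up for this example and are not part of poppler. A PDF with no title
in its Info dictionary will simply yield None.

```python
import subprocess


def parse_pdfinfo(text):
    """Turn pdfinfo-style 'Key:   value' lines into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def title_and_tagged(pdf_path):
    """Return (title or None, True if pdfinfo reports the file as tagged).

    Hypothetical helper: it shells out to pdfinfo and relies on the
    'Title' and 'Tagged' fields being present in its output.
    """
    result = subprocess.run(
        ["pdfinfo", pdf_path],
        capture_output=True,
        text=True,
        check=True,
    )
    info = parse_pdfinfo(result.stdout)
    return info.get("Title"), info.get("Tagged") == "yes"
```

Calling title_and_tagged("some.pdf") would let a tool decide whether it is
worth walking the structure tree at all; reading the tagged headers and
footers themselves is beyond this sketch.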


>I will then work on incorporating it all back into the PDF, with ToC
>linkage (I will make a new pdftopdf utility).

So are you also writing the structure back into the PDF?

Leonard


