[poppler] Extract title from pdf file.
Leonard Rosenthol
lrosenth at adobe.com
Wed Nov 9 06:59:01 PST 2011
On 11/9/11 1:26 AM, "Alec Taylor" <alec.taylor6 at gmail.com> wrote:
>The easiest way I can think of is to grab it from the headers and footers.
>
>I am about to submit a patch (any day now) which separates the headers
>and footers into separate tags that you can access from
>pdftohtml -xml.
Are you also submitting patches to read & process any tags & structure in
the PDF? If the PDF is already tagged, then any headers/footers will
already be identified as such. You should use that information when it
is present.
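
As a rough sketch of how a script might check for this before falling back to heuristics: pdfinfo (from the poppler utilities) prints a "Tagged:" line, and a "Title:" line when the document information has one. The helper names below are illustrative only, not part of poppler, and the code assumes pdfinfo's output is one "Key: value" pair per line.

```python
import subprocess


def parse_pdfinfo(output: str) -> dict:
    """Turn pdfinfo-style 'Key: value' lines into a dict (assumed format)."""
    info = {}
    for line in output.splitlines():
        # Split on the first colon only, so values such as dates stay intact.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def is_tagged(info: dict) -> bool:
    """True when the parsed output reports the PDF as tagged."""
    return info.get("Tagged", "").lower().startswith("yes")


def pdfinfo(path: str) -> dict:
    """Run pdfinfo on a file; requires the pdfinfo binary on PATH."""
    result = subprocess.run(
        ["pdfinfo", path], capture_output=True, text=True, check=True
    )
    return parse_pdfinfo(result.stdout)
```

Something like is_tagged(pdfinfo("file.pdf")) would then decide whether to read the existing structure or to guess at headers/footers from layout.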
>I will then work on incorporating it all back into the PDF, with ToC
>linkage (I will make a new pdftopdf utility).
So are you also writing the structure back into the PDF?
Leonard