[poppler] About possibility to create a TOC for PDF documents

Adrian Perez de Castro aperez at igalia.com
Mon Sep 23 15:49:15 UTC 2019


On Sun, 15 Sep 2019 20:15:35 +0300, Andy Sardina <andysardina22 at gmail.com> wrote:
> Hi everyone,
> 
>     I sometimes get PDF files that do not contain a TOC. I really like that
> in Foxit Reader you can create the TOC and I would like to have the same
> functionality in Okular. I have been looking at the source code of Poppler
> and I couldn't find a function to set an Outline object. Is anybody working
> on it? I would like to contribute to it.

For tagged PDFs you can probably get a very good TOC by fetching the
document's structure tree and traversing it to extract the interesting parts.
With the GLib API you can use PopplerStructureElement [1] to inspect the
structure of the document and pick the heading elements.
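
As a rough illustration of the traversal described above, here is a minimal
sketch in Python. It assumes the PyGObject bindings for the GLib API are
installed and that the typelib version is "0.18". The helper names
collect_headings and headings_from_file are made up for this example and are
not part of Poppler. Mapping the generic heading kind to level 1 is also an
assumption of the sketch.

```python
from pathlib import Path


def collect_headings(it, levels, text_flags, out=None):
    """Walk a structure-element iterator depth-first and collect headings.

    `it` is expected to behave like Poppler.StructureElementIter (or None
    when there is nothing to walk). `levels` maps heading kinds to an
    outline level. Returns a list of (level, text, page) tuples.
    """
    if out is None:
        out = []
    if it is None:  # untagged document, or an element without children
        return out
    while True:
        elem = it.get_element()
        kind = elem.get_kind()
        if kind in levels:
            # Take the heading text recursively and collapse whitespace;
            # there is no need to descend into the heading's children.
            text = elem.get_text(text_flags) or ""
            out.append((levels[kind], " ".join(text.split()), elem.get_page()))
        else:
            collect_headings(it.get_child(), levels, text_flags, out)
        if not it.next():
            break
    return out


def headings_from_file(path):
    """Open a PDF with Poppler and return its headings (hypothetical helper)."""
    import gi
    gi.require_version("Poppler", "0.18")  # assumed typelib version
    from gi.repository import Poppler

    doc = Poppler.Document.new_from_file(Path(path).resolve().as_uri(), None)
    kind = Poppler.StructureElementKind
    levels = {
        kind.HEADING: 1,  # generic heading: level chosen arbitrarily here
        kind.HEADING_1: 1, kind.HEADING_2: 2, kind.HEADING_3: 3,
        kind.HEADING_4: 4, kind.HEADING_5: 5, kind.HEADING_6: 6,
    }
    root = Poppler.StructureElementIter.new(doc)  # None if not tagged
    return collect_headings(root, levels, Poppler.StructureGetTextFlags.RECURSIVE)
```

Calling headings_from_file("some.pdf") on a tagged document should give a
flat list of (level, text, page) entries that a viewer could turn into an
outline; on an untagged document the list is simply empty.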

Beware that not all PDFs are tagged, so this won't work for every document
out there. For non-tagged documents, I suppose Foxit uses some heuristics
to guess which text elements are section headings.

I hope this helps.

Cheers,
—Adrián

---
[1] https://poppler.freedesktop.org/api/glib/PopplerStructureElement.html