[poppler] Hi All, I have a question about libpoppler and need your help, thanks in advance.
Brad Hards
bradh at frogmouth.net
Tue Feb 21 01:09:01 PST 2012
On Tuesday 21 February 2012 12:20:36 Zhenbang Xi wrote:
> I am developing a program using libpoppler to convert PDF to plain text.
> I want to distinguish the page header and page footer from the rest of a
> page; in other words, I want to output the header, the footer and the main
> content separately.
> How can I do this? Is there any structure or class that holds them in
> memory?
There is no way to identify this reliably - PDF (and hence poppler) doesn't
have any feature to interpret the intent of certain characters. It might be
possible to come up with a good heuristic for some documents, based on page
location.
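
As a rough illustration of such a page-location heuristic, here is a minimal
Python sketch. It assumes you have already extracted each text fragment along
with its vertical extent on the page, measured in points from the top edge.
The TextBox type, the split_page function and the 10% margin are hypothetical
names and values chosen for this example; none of them are part of poppler.

```python
from dataclasses import dataclass


@dataclass
class TextBox:
    """One extracted text fragment (hypothetical type, not a poppler class)."""
    text: str
    top: float     # distance from the top of the page to the top of the box
    bottom: float  # distance from the top of the page to the bottom of the box


def split_page(boxes, page_height, margin=0.1):
    """Guess header, body and footer from vertical position only.

    A box counts as header if it lies entirely in the top `margin` fraction
    of the page, and as footer if it starts in the bottom `margin` fraction.
    The 0.1 default is an assumption and will need tuning per document.
    """
    header_limit = page_height * margin
    footer_limit = page_height * (1.0 - margin)
    parts = {"header": [], "body": [], "footer": []}
    # Process boxes from the top of the page downwards.
    for box in sorted(boxes, key=lambda b: b.top):
        if box.bottom <= header_limit:
            parts["header"].append(box.text)
        elif box.top >= footer_limit:
            parts["footer"].append(box.text)
        else:
            parts["body"].append(box.text)
    return parts


# Example with made-up coordinates on a US Letter page (792 points high).
boxes = [
    TextBox("Annual Report 2011", 30, 42),
    TextBox("Main paragraph text", 100, 700),
    TextBox("Page 3", 760, 772),
]
parts = split_page(boxes, page_height=792)
```

This will misclassify body text that happens to sit near the top or bottom
edge, and it misses headers that extend further into the page. Comparing the
candidate lines across several pages (real headers and footers tend to repeat
at the same position) would probably make it more robust for some documents.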
Brad