[poppler] PDF editing operations

Shawn Rutledge shawn.t.rutledge at gmail.com
Tue May 19 14:53:07 PDT 2009


Is there any plan to support basic editing operations (some of which
pdftk can already do), such as rearranging page order, renumbering
pages, or editing the metadata or OCR text inside the PDF?  I saw in
the Qt4 binding documentation that an open PDF document can be written
out as a new PDF, with a flag that controls whether changes are
preserved, but which changes does that actually cover?

I'm scanning a bunch of old magazines that take up too much space in
boxes (Radio-Electronics, Popular Science, etc.) and was thinking of
writing a program to recognize the name and date of each scan (by
looking for the known magazine titles, month names, etc. in the
margins) and to auto-number the pages (by looking for page numbers in
the likely locations).  I confirmed that GOCR is good enough to
extract page numbers from page images.  I could probably just use
pdftk to do the renumbering, but I would rather build a better
integrated tool than settle for a scripting solution.
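
For illustration, the recognition step might look roughly like the
Python sketch below.  It assumes the OCR output for each page margin
is already available as plain text (for example from GOCR), that the
page number is the only short number left once the date is removed,
and that every scan yields a page number.  The names KNOWN_TITLES,
parse_margin_text and pdftk_reorder_args are hypothetical and not part
of poppler, GOCR or pdftk; the only pdftk feature relied on is its
"cat" operation, which writes the listed pages in the given order.

```python
import re

# Hypothetical list of titles to look for; extend as needed.
KNOWN_TITLES = ["Radio-Electronics", "Popular Science"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# A month name followed by a four-digit year, e.g. "JUNE 1958".
_DATE_RE = re.compile(
    r"\b(" + "|".join(MONTHS) + r")\b\.?\s+((?:19|20)\d{2})\b",
    re.IGNORECASE,
)
# A standalone number of one to three digits, taken to be the page number.
_PAGE_RE = re.compile(r"\b\d{1,3}\b")


def parse_margin_text(text):
    """Guess (title, month, year, page) from OCR'd margin text.

    Any field that cannot be found is returned as None.
    """
    lowered = text.lower()
    title = next((t for t in KNOWN_TITLES if t.lower() in lowered), None)

    month = year = None
    rest = text
    match = _DATE_RE.search(text)
    if match:
        month = MONTHS.index(match.group(1).capitalize()) + 1
        year = int(match.group(2))
        # Cut the date out so it cannot be mistaken for a page number.
        rest = text[:match.start()] + " " + text[match.end():]

    page_match = _PAGE_RE.search(rest)
    page = int(page_match.group()) if page_match else None
    return title, month, year, page


def pdftk_reorder_args(input_pdf, output_pdf, detected_pages):
    """Build a pdftk command that sorts scans by their printed page number.

    detected_pages[i] is the page number read from scan i + 1 (pdftk
    counts pages from 1).  Assumes every entry is an integer.
    """
    order = sorted(range(len(detected_pages)), key=lambda i: detected_pages[i])
    return (["pdftk", input_pdf, "cat"]
            + [str(i + 1) for i in order]
            + ["output", output_pdf])
```

The argument list returned by pdftk_reorder_args could be passed to
subprocess.run; it only reorders pages and does not touch metadata or
the OCR text layer.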

I actually bought Acrobat to do OCR on the scans, but it does not have
such features.

