[poppler] improved ebook pdf handling
Randall Puljek-Shank
puljekshank at gmail.com
Sun Oct 4 22:05:54 PDT 2009
I'd like to improve the pdftohtml handling of ebooks. Here are the goals
that I have:
1. Recognize the table of contents and convert its entries to links
2. Remove running headers and page numbers from the resulting text
3. Recognize multi-column layouts
I'm thinking that each of these could be a separate switch. Anybody
interested in helping is welcome, of course, as are pointers to similar code.
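
To make goal 2 more concrete, here is a rough sketch of one possible heuristic, written in Python for brevity rather than in pdftohtml's own C++. It assumes the text is already available as a list of pages, each a list of lines in reading order, and it treats a line as a running header, footer, or page number if it sits near the top or bottom of a page and recurs (ignoring digits) on most pages. The function names, the `edge` and `min_fraction` parameters, and their default values are all hypothetical; nothing here is existing poppler code.

```python
import re
from collections import Counter


def normalize(line):
    """Collapse whitespace and digit runs so 'Page 12' and 'Page 13' compare equal."""
    return re.sub(r"\d+", "#", " ".join(line.split())).lower()


def strip_running_lines(pages, edge=2, min_fraction=0.6):
    """Drop lines near the top or bottom of a page that recur on most pages.

    pages: list of pages, each a list of text lines in reading order.
    edge: number of lines at each end of a page to inspect (assumed value).
    min_fraction: share of pages a line must appear on to count as
        running text (assumed value).
    """
    counts = Counter()
    for page in pages:
        # A set ensures each candidate is counted at most once per page.
        candidates = {normalize(l) for l in page[:edge] + page[-edge:] if l.strip()}
        counts.update(candidates)

    # Require at least two occurrences so a one-page document is left alone.
    threshold = max(2, min_fraction * len(pages))
    running = {key for key, n in counts.items() if n >= threshold}

    cleaned = []
    for page in pages:
        kept = []
        for i, line in enumerate(page):
            at_edge = i < edge or i >= len(page) - edge
            if at_edge and normalize(line) in running:
                continue  # looks like a running header, footer, or page number
            kept.append(line)
        cleaned.append(kept)
    return cleaned


if __name__ == "__main__":
    sample = [
        ["My Book", "Chapter text one.", "More text.", "1"],
        ["My Book", "Other text 42 here.", "Even more.", "2"],
        ["My Book", "Third page body.", "Closing words.", "3"],
    ]
    for page in strip_running_lines(sample):
        print(page)
```

A real implementation would probably also want to use the position of each line on the page rather than just its order, since headers that alternate between left and right pages, or that change per chapter, would slip past a purely textual comparison like this one.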