[poppler] Combine bounding box data and tiff to create pdf?

Mark Ehle markehle at gmail.com
Wed May 7 17:27:52 PDT 2014


Folks -

I am using pdftotext to extract text from PDF files in a digital newspaper
archive I am creating for a local public library. So far, it's working
great. But I am using up a fair amount of disk space and would like to
figure out a way to create an OCR'd PDF from an image and the bounding box
data. That way I would not have to store the PDF files as well as the
images. Is there a way to do that?
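
For context, here is a rough sketch of the bounding box data in question,
assuming it is produced with something like `pdftotext -bbox in.pdf out.html`,
which writes one <word> element per word with xMin/yMin/xMax/yMax attributes.
The function name and the sample page below are made up for illustration, and
the regular expression assumes that attribute order. This only reads the
boxes; it does not build a PDF.

```python
import html
import re

# Matches one <word> element in the assumed `pdftotext -bbox` output format.
WORD_RE = re.compile(
    r'<word xMin="(-?[\d.]+)" yMin="(-?[\d.]+)" '
    r'xMax="(-?[\d.]+)" yMax="(-?[\d.]+)">(.*?)</word>'
)


def parse_bbox_words(xhtml):
    """Return a list of (text, x_min, y_min, x_max, y_max) tuples.

    Hypothetical helper: coordinates are returned as floats in whatever
    units the bbox file uses (assumed here to be PDF points).
    """
    words = []
    for x_min, y_min, x_max, y_max, text in WORD_RE.findall(xhtml):
        words.append(
            (html.unescape(text),
             float(x_min), float(y_min), float(x_max), float(y_max))
        )
    return words


# Made-up sample data, not taken from a real newspaper page.
sample = """<page width="612.000000" height="792.000000">
  <word xMin="72.000000" yMin="70.500000" xMax="110.250000" yMax="84.000000">Daily</word>
  <word xMin="114.000000" yMin="70.500000" xMax="150.750000" yMax="84.000000">News</word>
</page>"""

for word in parse_bbox_words(sample):
    print(word)
```

Each tuple gives a word and its box on the page, which is the information
that would have to be laid over the image as invisible text.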

Thanks -

Mark