[Poppler-bugs] [Bug 89941] New: pdftotext: Add an option for more detailed bounding box information
bugzilla-daemon at freedesktop.org
Tue Apr 7 10:58:15 PDT 2015
https://bugs.freedesktop.org/show_bug.cgi?id=89941
Bug ID: 89941
Summary: pdftotext: Add an option for more detailed bounding box information
Product: poppler
Version: unspecified
Hardware: All
OS: All
Status: NEW
Severity: enhancement
Priority: medium
Component: utils
Assignee: poppler-bugs at lists.freedesktop.org
Reporter: jechols at uoregon.edu
Created attachment 114932
--> https://bugs.freedesktop.org/attachment.cgi?id=114932&action=edit
Adds -bbox-layout command to pdftotext
We're looking to generate ALTO-compatible XML
(http://en.wikipedia.org/wiki/ALTO_%28XML%29) from PDFs. The current -bbox
flag almost does what we need, but it omits two levels of layout data that
are important to us: blocks and lines.
I have written a patch against 0.22.5 (to ensure compatibility with our
CentOS 7 system). It appears to apply cleanly to the current master and, as
far as I can tell, produces the same output there as my 0.22.5 hack. The
change adds a new flag, -bbox-layout, whose output is still very generic but
is sufficient for us to then transform as needed.
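
To illustrate the kind of downstream transform this is meant to enable, here is
a minimal Python sketch that reads -bbox-layout style output into nested
page/block/line/word records. It is not part of the attached patch. The sample
document, the element names (page, flow, block, line, word) and the attribute
names (xMin, yMin, xMax, yMax, width, height) are assumptions made for
illustration; check them against what the patched pdftotext actually emits.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of -bbox-layout style output. The element and
# attribute names here are assumptions, not taken from the patch.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<doc>
  <page width="612.000000" height="792.000000">
    <flow>
      <block xMin="72.0" yMin="70.0" xMax="300.0" yMax="100.0">
        <line xMin="72.0" yMin="70.0" xMax="300.0" yMax="84.0">
          <word xMin="72.0" yMin="70.0" xMax="110.0" yMax="84.0">Hello</word>
          <word xMin="114.0" yMin="70.0" xMax="160.0" yMax="84.0">world</word>
        </line>
      </block>
    </flow>
  </page>
</doc>
</body>
</html>"""


def local(tag):
    """Return the tag name without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def bbox(element):
    """Read the four bounding-box attributes of an element as floats."""
    return tuple(float(element.get(k)) for k in ("xMin", "yMin", "xMax", "yMax"))


def children(parent, name):
    """Yield all descendants of parent whose local tag name equals name."""
    return (e for e in parent.iter() if local(e.tag) == name)


def extract_layout(xml_text):
    """Build a list of pages, each holding blocks -> lines -> words."""
    root = ET.fromstring(xml_text)
    pages = []
    for page in children(root, "page"):
        blocks = []
        for block in children(page, "block"):
            lines = []
            for line in children(block, "line"):
                words = [((w.text or "").strip(), bbox(w))
                         for w in children(line, "word")]
                lines.append({"bbox": bbox(line), "words": words})
            blocks.append({"bbox": bbox(block), "lines": lines})
        pages.append({
            "width": float(page.get("width")),
            "height": float(page.get("height")),
            "blocks": blocks,
        })
    return pages


if __name__ == "__main__":
    for page in extract_layout(SAMPLE):
        for block in page["blocks"]:
            for line in block["lines"]:
                print(line["bbox"], " ".join(t for t, _ in line["words"]))
```

The block and line records carry the bounding boxes that an ALTO writer would
need for its own block and line elements; the mapping to ALTO itself is left
out here.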
--
You are receiving this mail because:
You are the assignee for the bug.