[Poppler-bugs] [Bug 62266] New: [PATCH] try to detect line breaks in the PDF and insert them in raw mode for pdftotext
bugzilla-daemon at freedesktop.org
Tue Mar 12 15:42:05 PDT 2013
https://bugs.freedesktop.org/show_bug.cgi?id=62266
Priority: medium
Bug ID: 62266
Assignee: poppler-bugs at lists.freedesktop.org
Summary: [PATCH] try to detect line breaks in the PDF and
insert them in raw mode for pdftotext
Severity: enhancement
Classification: Unclassified
OS: All
Reporter: jamslam at gmail.com
Hardware: All
Status: NEW
Version: unspecified
Component: utils
Product: poppler
Created attachment 76449
--> https://bugs.freedesktop.org/attachment.cgi?id=76449&action=edit
Adds parabrk option to pdftotext
Adds the parabrk option to `pdftotext`.
The parabrk option applies only to raw mode. It attempts to insert an
additional newline character wherever a line break can be detected in the PDF.
It is intended to keep paragraphs separate in the output when they are
separated by vertical whitespace in the PDF.
It isn't perfect; for instance, it doesn't handle page boundaries.
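
To illustrate the general idea, here is a minimal, self-contained C++ sketch
of a vertical-gap heuristic. It is not the code from the attached patch and
does not use poppler's API: the RawLine struct, the joinWithParagraphBreaks
function, and the 1.5 gap factor are all hypothetical and chosen only for
illustration. It assumes each line of text is already available in raw
(content stream) order, with a baseline position measured from the top of the
page.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record for one line of text, in raw (content stream) order.
// This is not a poppler type.
struct RawLine {
    std::string text;
    double baselineY; // baseline distance from the top of the page, in points
    double fontSize;  // font size in points
};

// Joins lines with '\n' and emits one extra '\n' wherever the vertical gap
// to the previous line exceeds gapFactor * (previous line's font size).
// The default gapFactor of 1.5 is an assumed value, not taken from the patch.
std::string joinWithParagraphBreaks(const std::vector<RawLine> &lines,
                                    double gapFactor = 1.5)
{
    std::string out;
    for (std::size_t i = 0; i < lines.size(); ++i) {
        if (i > 0) {
            const double gap = lines[i].baselineY - lines[i - 1].baselineY;
            // A negative gap (for example, a jump back to the top of a new
            // page or column) never triggers a break in this sketch.
            if (gap > gapFactor * lines[i - 1].fontSize) {
                out += '\n';
            }
        }
        out += lines[i].text;
        out += '\n';
    }
    return out;
}
```

Because the sketch compares each line only with the one before it and ignores
negative gaps, it shares the limitation noted above: a paragraph break that
coincides with a page boundary goes undetected.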