[Poppler-bugs] [Bug 94518] New: raw_order_layout completely broken
bugzilla-daemon at freedesktop.org
Sat Mar 12 20:47:45 UTC 2016
https://bugs.freedesktop.org/show_bug.cgi?id=94518
Bug ID: 94518
Summary: raw_order_layout completely broken
Product: poppler
Version: unspecified
Hardware: Other
OS: All
Status: NEW
Severity: normal
Priority: medium
Component: cpp frontend
Assignee: poppler-bugs at lists.freedesktop.org
Reporter: jeroen.ooms at stat.ucla.edu
See also: https://lists.freedesktop.org/archives/poppler/2016-March/011727.html
Extracting text with raw_order_layout produces malformed, seemingly random
output (no text at all for most pages):
ustring str = p->text(p->page_rect(), page::raw_order_layout);
An example:
- source: http://arxiv.org/pdf/1403.2805.pdf
- pdftotext default output: http://pastebin.com/raw/A93xPT4j
- cpp with page::physical_layout: http://pastebin.com/raw/MZFpTRbD
- cpp with page::raw_order_layout: http://pastebin.com/raw/n8dcsqkZ
The raw_order_layout output is missing most of the text, has no spaces, etc.
It also differs every time I run it, so it looks like there is a memory bug.
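The run-to-run differences can be checked mechanically instead of by eye. Below is a minimal sketch, not part of poppler: count_distinct_outputs is a hypothetical helper that calls an extraction function several times and counts how many different results come back. It takes any callable returning a std::string, so it makes no assumptions about the poppler API itself.

```cpp
#include <cstddef>
#include <functional>
#include <set>
#include <string>

// Hypothetical helper (an assumption for illustration, not a poppler API).
// Calls `extract` `runs` times and returns the number of distinct outputs.
// For a deterministic extractor the result is 1 whenever runs > 0; anything
// larger means the output changed between calls.
std::size_t count_distinct_outputs(const std::function<std::string()>& extract,
                                   std::size_t runs) {
    std::set<std::string> seen;
    for (std::size_t i = 0; i < runs; ++i) {
        seen.insert(extract());  // duplicates are collapsed by the set
    }
    return seen.size();
}
```

To apply it to this report, the callable would wrap the p->text(p->page_rect(), page::raw_order_layout) call shown above and convert the resulting ustring to bytes before returning it. Note that this only compares calls within one process, while the differences described here were seen across separate runs, so a result of 1 does not rule out the bug.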