[Poppler-bugs] [Bug 94504] New: pdftotext and pdftohtml fail to extract columns
bugzilla-daemon at freedesktop.org
Fri Mar 11 22:13:29 UTC 2016
https://bugs.freedesktop.org/show_bug.cgi?id=94504
Bug ID: 94504
Summary: pdftotext and pdftohtml fail to extract columns
Product: poppler
Version: unspecified
Hardware: x86 (IA32)
OS: Linux (All)
Status: NEW
Severity: normal
Priority: medium
Component: utils
Assignee: poppler-bugs at lists.freedesktop.org
Reporter: john at hovedpuden.dk
Created attachment 122241
--> https://bugs.freedesktop.org/attachment.cgi?id=122241&action=edit
PDF file with columns that is not processed correctly
pdftotext and pdftohtml fail to correctly process certain PDF pages with three
columns.
For the attached PDF file, the error occurs on page 5, where the rendered text
is not in the correct order.
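To reproduce the problem on page 5 only, something like the following sketch could be used. It assumes the attachment has been saved locally under the hypothetical name `attachment-122241.pdf` and that `pdftotext` is on the PATH. The helper name `build_pdftotext_cmd` is invented for illustration. The `-f`/`-l` options select the first and last page, and `-` as the output file sends the text to stdout.

```python
import os
import shutil
import subprocess


def build_pdftotext_cmd(pdf_path, page, layout=False):
    """Build a pdftotext command line that extracts a single page to stdout."""
    cmd = ["pdftotext", "-f", str(page), "-l", str(page)]
    if layout:
        # -layout asks pdftotext to preserve the physical page layout.
        cmd.append("-layout")
    cmd += [pdf_path, "-"]  # "-" writes the extracted text to stdout
    return cmd


if __name__ == "__main__":
    pdf = "attachment-122241.pdf"  # hypothetical local file name
    # Only run when both the tool and the file are actually available.
    if shutil.which("pdftotext") and os.path.exists(pdf):
        result = subprocess.run(
            build_pdftotext_cmd(pdf, 5),
            capture_output=True, text=True, check=True,
        )
        print(result.stdout)
```

Running the same page once with and once without `layout=True` may help narrow down whether the ordering depends on the extraction mode; that is a suggestion for triage, not a confirmed workaround.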
Rendered text (XXXX masks the social security number digits in the file; the
actual rendered text correctly shows the 4 digits):
S08032016-17
Alle og enhver, der har noget til gode
i nedennævnte bo, indkaldes herved
til at anmelde og dokumentere deres
krav inden 8 uger
S08032016-21
Alle og enhver, der har noget til gode
i nedennævnte bo, indkaldes herved
til at anmelde og dokumentere deres
krav inden 8 uger
S08032016-26
Alle og enhver, der har noget til gode
i nedennævnte bo, indkaldes herved
til at anmelde og dokumentere deres
krav inden 8 uger
Afdøde
Cpr.nr. 190521XXXX
Dødsdato 11.02.2016
Frede Jensen
Hyldevej 12
9300 Sæby
Afdøde
Cpr.nr. 150733XXXX
Dødsdato 04.01.2016
Inger Kathrine Simonsen
Gl. Tingvej 40F, 1 th.
9600 Aars
Afdøde
Cpr.nr. 300121XXXX
Dødsdato 26.01.2016
Anna Hartlev
Gulkrog 16, st
7100 Vejle
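The output above looks as if the text blocks were emitted row by row across the three columns instead of column by column. The sketch below illustrates that difference with made-up coordinates. It is not poppler's algorithm. It assumes that the intended reading order is one full column after another, and the `Block` type and its positions are hypothetical.

```python
from typing import NamedTuple


class Block(NamedTuple):
    x: float   # left edge of the block (hypothetical units)
    y: float   # top edge of the block, increasing downward
    text: str


def row_major(blocks):
    """Order blocks top to bottom first, then left to right."""
    return [b.text for b in sorted(blocks, key=lambda b: (b.y, b.x))]


def column_major(blocks):
    """Order blocks left to right first, then top to bottom within a column."""
    return [b.text for b in sorted(blocks, key=lambda b: (b.x, b.y))]


# Three columns, each holding a notice header and the name of the deceased.
blocks = [
    Block(0, 0, "S08032016-17"),
    Block(200, 0, "S08032016-21"),
    Block(400, 0, "S08032016-26"),
    Block(0, 100, "Frede Jensen"),
    Block(200, 100, "Inger Kathrine Simonsen"),
    Block(400, 100, "Anna Hartlev"),
]

print(row_major(blocks))     # headers first, then names (as in the report)
print(column_major(blocks))  # each header followed by its own name
```

Under these assumptions, `row_major` reproduces the interleaving seen in the rendered text, while `column_major` keeps each notice together.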