[Poppler-bugs] [Bug 56293] New: Incorrect positioning of text in PDFTOHTML
bugzilla-daemon at freedesktop.org
Mon Oct 22 13:17:34 PDT 2012
https://bugs.freedesktop.org/show_bug.cgi?id=56293
Priority: medium
Bug ID: 56293
Assignee: poppler-bugs at lists.freedesktop.org
Summary: Incorrect positioning of text in PDFTOHTML
Severity: normal
Classification: Unclassified
OS: All
Reporter: erik.engstroem at gmail.com
Hardware: Other
Status: NEW
Version: unspecified
Component: pdftohtml
Product: poppler
Created attachment 68923
--> https://bugs.freedesktop.org/attachment.cgi?id=68923&action=edit
PDF file exhibiting this behavior
PDFTOHTML converts text positions incorrectly for certain PDF documents.
Attached is a document in which this happens.
The following logic explains this further:
The size of an image of the first page is 1024x1408. The text "Brief article"
which can be seen highlighted should be positioned 19% from the top as seen
here:
http://imageshack.us/a/img526/6343/textshiftedpdf1.png
Poppler outputs this text with the following data when using pdftohtml -xml:
<text top="409" left="447" width="80" height="15" font="0">Brief article</text>
The dimensions of this page according to poppler, taken from the same XML file:
<page number="1" position="absolute" top="0" left="0" height="1488"
width="1063">
According to poppler, the text would therefore be positioned at
409/1488 = 0.27 = 27% from the top, which is clearly wrong.
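The comparison above can be reproduced with a short script. This is only a sketch: the XML fragment is assembled from the two snippets quoted in this report (the closing </page> tag is added so it parses), and the helper name relative_tops is made up for illustration. It is not part of poppler.

```python
import xml.etree.ElementTree as ET

# Fragment assembled from the pdftohtml -xml output quoted in this report.
# The closing </page> tag is added here so the fragment is well-formed.
XML_SNIPPET = """
<page number="1" position="absolute" top="0" left="0" height="1488" width="1063">
<text top="409" left="447" width="80" height="15" font="0">Brief article</text>
</page>
"""


def relative_tops(page_xml):
    """Return (text, top / page height) for each <text> element of one <page>."""
    page = ET.fromstring(page_xml.strip())
    page_height = float(page.get("height"))
    return [
        ("".join(element.itertext()), float(element.get("top")) / page_height)
        for element in page.iter("text")
    ]


# Print each text run with its vertical position as a fraction of the page.
for text, fraction in relative_tops(XML_SNIPPET):
    print(f"{text}: {fraction:.1%} from the top")
```

For the values quoted here, the fraction is 409/1488, or about 27%, against the roughly 19% measured on the rendered page image.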
No warning messages or errors were noted when converting this document.