[Poppler-bugs] [Bug 101807] pdftohtml: fakebold and dropshadow duplicated text

bugzilla-daemon at freedesktop.org
Sun Jul 16 19:39:15 UTC 2017


https://bugs.freedesktop.org/show_bug.cgi?id=101807

--- Comment #1 from Jason Crain <jason at inspiresomeone.us> ---
Created attachment 132719
  --> https://bugs.freedesktop.org/attachment.cgi?id=132719&action=edit
pdf-example.html - from pdftohtml -s -noframes

I've attached the HTML file produced by running "pdftohtml -s -noframes
pdf-example.pdf".  I haven't attached the images, but this should be enough to
show the problem.  In several places the HTML repeats a line of text, with each
repeat one character shorter than the last.  Example:

<p style="..." class="ft10">1 </p>
<p style="..." class="ft11"> </p>
<p style="..." class="ft12">UPUTSTVO ZA PACIJENTA</p>
<p style="..." class="ft12">UPUTSTVO ZA PACIJENT</p>
<p style="..." class="ft12">UPUTSTVO ZA PACIJEN</p>
<p style="..." class="ft12">UPUTSTVO ZA PACIJE</p>
<p style="..." class="ft12"> </p>
<p style="..." class="ft12"> LUNATA</p>
<p style="..." class="ft12">LUNAT</p>
<p style="..." class="ft12">LUNA</p>
<p style="..." class="ft12">LUN</p>

I've elided the style attribute above to keep the lines a reasonable length.
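
For anyone who wants to check other output files for the same pattern, here is
a minimal sketch, not part of poppler.  It assumes the output has one <p>
element per line, as in the excerpt above, and it flags a paragraph whose text
is a strict prefix of the previous paragraph with the same class.  The names
P_LINE and find_truncated_repeats are made up for illustration.

```python
import re

# Hypothetical pattern: one <p ... class="ftNN">text</p> element per line,
# as in the excerpt above. Real pdftohtml output may need a proper HTML parser.
P_LINE = re.compile(r'<p\b[^>]*\bclass="(?P<cls>[^"]*)"[^>]*>(?P<text>.*?)</p>')


def find_truncated_repeats(html_lines):
    """Return (line_index, text) for each paragraph whose text is a strict
    prefix of the previous paragraph with the same class."""
    hits = []
    prev_cls, prev_text = None, None
    for i, line in enumerate(html_lines):
        m = P_LINE.search(line)
        if not m:
            # Anything that is not a <p> line breaks the run.
            prev_cls, prev_text = None, None
            continue
        cls = m.group("cls")
        text = m.group("text").strip()
        if (
            text
            and cls == prev_cls
            and prev_text is not None
            and len(text) < len(prev_text)
            and prev_text.startswith(text)
        ):
            hits.append((i, text))
        prev_cls, prev_text = cls, text
    return hits


if __name__ == "__main__":
    sample = [
        '<p style="..." class="ft12">UPUTSTVO ZA PACIJENTA</p>',
        '<p style="..." class="ft12">UPUTSTVO ZA PACIJENT</p>',
        '<p style="..." class="ft12">UPUTSTVO ZA PACIJEN</p>',
        '<p style="..." class="ft12"> </p>',
        '<p style="..." class="ft12"> LUNATA</p>',
        '<p style="..." class="ft12">LUNAT</p>',
    ]
    for index, text in find_truncated_repeats(sample):
        print(index, text)
```

This only reports the symptom in the generated HTML.  It says nothing about
which step inside pdftohtml produces the repeats.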
