[poppler] PdftoHtml - Overlapping Characters in Html Due to Missing/Incorrect "Letter-Spacing" attribute

Parul Srivastava parul009 at gmail.com
Thu Apr 5 05:40:24 PDT 2012


I am using poppler's pdftohtml converter, version 0.17.2, to convert some
PDF documents to HTML. I realize this is an older version and that the
current stable version is 0.18.4, but for various reasons we are staying
with this version.

The problem is that when it converts a PDF document to HTML, some
characters are missing from the displayed page. When I open the HTML
document in a text editor, no text is missing, but when the page is
displayed, "ration" appears as "raon". After I add a "letter-spacing"
attribute to it, it is displayed correctly as "ration".
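
To see which pieces of text carry a letter-spacing value and which do not,
a scan along these lines may help. It is only a sketch: the sample markup is
invented for illustration and is not actual pdftohtml output, and the
function name letter_spacing_values is hypothetical. It assumes the value
sits in a double-quoted style attribute.

```python
import re

# Matches a style="..." attribute; assumes double-quoted attribute values.
STYLE_RE = re.compile(r'style="([^"]*)"', re.IGNORECASE)
# Matches the numeric part of a declaration such as letter-spacing:-12.5px
SPACING_RE = re.compile(r'letter-spacing\s*:\s*(-?\d+(?:\.\d+)?)', re.IGNORECASE)


def letter_spacing_values(html):
    """Return one entry per style attribute: the number, or None if absent."""
    values = []
    for style in STYLE_RE.findall(html):
        match = SPACING_RE.search(style)
        values.append(float(match.group(1)) if match else None)
    return values


# Invented sample markup, not real converter output.
sample = (
    '<span style="font-size:12px">ration</span>'
    '<span style="font-size:12px;letter-spacing:0.5px">ration</span>'
    '<span style="letter-spacing:-40px">Co</span>'
)
print(letter_spacing_values(sample))  # [None, 0.5, -40.0]
```

Entries that come back as None are the places where the attribute is
missing; large negative numbers point at the second problem described below.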

There is a second problem. In some sub-headings I see a large negative
value for letter-spacing, which causes the string to appear reversed in the
HTML display. For instance, "Co" appears as "oC".
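
My rough understanding of why a large negative value could reverse the
string is sketched below. It assumes a simplified layout model in which each
glyph starts where the previous one started, plus that glyph's advance width,
plus the letter-spacing. The widths are made-up numbers, and the function
name glyph_positions is hypothetical.

```python
def glyph_positions(advances, letter_spacing):
    """Left edge of each glyph when letter_spacing is added after every glyph."""
    positions = []
    x = 0.0
    for advance in advances:
        positions.append(x)
        # The next glyph starts after this glyph's width plus the spacing.
        x += advance + letter_spacing
    return positions


advances = [8.0, 6.0]  # hypothetical advance widths for "C" and "o"

print(glyph_positions(advances, 0.0))    # [0.0, 8.0]   "o" sits right of "C"
print(glyph_positions(advances, -20.0))  # [0.0, -12.0] "o" sits left of "C"
```

Under this model, once the negative spacing is larger than the width of
"C", the "o" is drawn to the left of it, which would read as "oC".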

Does anyone have an idea why this is happening?
