[poppler] pdftohtml doesn't find text in pdf, pdftotext does?

Wolfgang Schwarz wo at umsu.de
Tue May 20 07:17:33 PDT 2014


Hello,

I've noticed that pdftohtml produces no text content for some PDF
files, while pdftotext extracts the text from the same files. Here is
an example: http://www.umsu.de/temp/1966percepts.pdf#. (The latest
version I've tried it with is poppler-0.26.0, compiled from source.)
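
To make the comparison concrete, here is a rough sketch of how the two tools can be checked against each other on one file. It is only an illustration, not a diagnosis: it assumes both binaries are on the PATH, that `pdftotext FILE -` writes to stdout, and that pdftohtml accepts `-stdout`, `-i` and `-noframes`. The helper names (`visible_text`, `run`, `compare`) are made up for this example.

```python
import subprocess
import sys
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects character data, skipping <script>, <style> and <title>."""

    SKIPPED = ("script", "style", "title")

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def visible_text(html):
    """Return the whitespace-normalised text found in an HTML string."""
    parser = _TextCollector()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def run(cmd):
    """Run a command and return its stdout decoded as UTF-8."""
    result = subprocess.run(
        cmd, capture_output=True, encoding="utf-8", errors="replace", check=True
    )
    return result.stdout


def compare(pdf_path):
    """Return (words from pdftotext, words from pdftohtml) for one PDF."""
    plain = run(["pdftotext", pdf_path, "-"])
    # Assumed flags: -stdout (write to stdout), -i (ignore images),
    # -noframes (single HTML document instead of a frameset).
    html = run(["pdftohtml", "-stdout", "-i", "-noframes", pdf_path])
    return len(plain.split()), len(visible_text(html).split())


if __name__ == "__main__" and len(sys.argv) == 2:
    text_words, html_words = compare(sys.argv[1])
    print(f"pdftotext: {text_words} words, pdftohtml: {html_words} words")
```

Saved under a name of your choice and run with a local copy of the PDF as its only argument, a large word count from pdftotext next to a count near zero from pdftohtml would show the behaviour described above.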

Is this a known problem? Are there any workarounds?

Best,
Wolfgang

