Hello,

I've noticed that pdftohtml fails to extract the text content of some PDF files, even though pdftotext can extract it. Here is an example: http://www.umsu.de/temp/1966percepts.pdf#. (The latest version I've tried is poppler-0.26.0, compiled from source.)

Is this a known problem? Are there any workarounds?

Best,
Wolfgang