[poppler] XML syntax error in PdfToText tool

suzuki toshiya mpsuzuki at hiroshima-u.ac.jp
Thu Nov 14 05:31:40 PST 2013


Hi,

If you could post a sample XML file, that is, pdftotext output
that you have modified by hand so that your XML parser accepts it,
it would help anyone willing to develop a patch.
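
For illustration only, here is a minimal Python sketch of one way such a
fixed-up file could be produced. It assumes the only problem is a bare "&"
inside the <title> element. The sample string is a hypothetical, simplified
stand-in for the real pdftotext output, and escape_title is a made-up helper
name, not part of poppler.

```python
import re
import xml.etree.ElementTree as ET

# "&" that does not already start an entity such as &amp; or &#38;
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)")
TITLE = re.compile(r"(<title>)(.*?)(</title>)", re.DOTALL)


def escape_title(text):
    """Escape bare '&' inside the first <title> element only."""
    def fix(match):
        return match.group(1) + BARE_AMP.sub("&amp;", match.group(2)) + match.group(3)
    return TITLE.sub(fix, text, count=1)


# Hypothetical, simplified stand-in for the pdftotext output (assumption).
sample = "<html><head><title>Tom & Jerry</title></head><body></body></html>"

fixed = escape_title(sample)
root = ET.fromstring(fixed)  # parsing the unmodified sample raises ParseError
print(root.find("head/title").text)
```

A before/after pair like `sample` and `fixed` is the kind of example that
would make the expected output clear to whoever writes the patch.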

Regards,
mpsuzuki

On 11/14/2013 10:04 PM, Paweł Leń wrote:
> Hello,
>
> I get an error when running:
> pdftotext -bbox -htmlmeta 'myfile.pdf' 'tempFile.xml'
>
> The output XML has a <title> tag at the beginning of the document (meta section). The error appears when the title contains an "&" character. The title field is not wrapped in CDATA and its content is not escaped, so it causes an error in my xmllib parser. Can I (or you :) ) fix it somehow?
>
> Best regards
>
> --
> Paweł Leń


