[poppler] Reverse-engineering an XML file generated by pdftohtml -xml back into the PDF?

Alec Taylor alec.taylor6 at gmail.com
Mon Nov 14 22:42:28 PST 2011


Good afternoon,

How would I go about reverse-engineering an XML file generated by
pdftohtml -xml back into the [same] PDF?

I have been spending a long time extending the XML output to include
proper page numbers and header/footer detection.

It would be extremely useful if I could push the additional logical
structure information and page numbers back into the PDF the XML was
generated from.
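
For context, here is a minimal sketch of the positional data involved,
assuming the usual pdftohtml -xml layout: <page> elements with a number
attribute, containing <text> elements with top/left/width/height
attributes. The SAMPLE string and the read_text_boxes helper are made up
for illustration and are not part of poppler. As far as I understand,
the coordinates are measured from the top-left corner of the page, so
they would need converting before being matched against the PDF.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the shape pdftohtml -xml is assumed to emit.
SAMPLE = """<pdf2xml>
<page number="1" position="absolute" top="0" left="0" height="1188" width="918">
<text top="100" left="72" width="200" height="16" font="0">Chapter 1</text>
<text top="1120" left="450" width="10" height="12" font="1"><b>iii</b></text>
</page>
</pdf2xml>"""


def read_text_boxes(xml_string):
    """Return one dict per <text> element with its page and bounding box."""
    root = ET.fromstring(xml_string)
    boxes = []
    for page in root.iter("page"):
        page_no = int(page.get("number"))
        for text in page.iter("text"):
            boxes.append({
                "page": page_no,
                "top": int(text.get("top")),
                "left": int(text.get("left")),
                "width": int(text.get("width")),
                "height": int(text.get("height")),
                # itertext() flattens inline markup such as <b> and <i>.
                "content": "".join(text.itertext()),
            })
    return boxes


boxes = read_text_boxes(SAMPLE)
for box in boxes:
    print(box["page"], box["top"], box["left"], box["content"])
```

Reading the boxes is the easy half; writing page labels or structure
back into the PDF is the part I am asking about.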

How would I go about doing this?

Thanks for all suggestions,

Alec Taylor

PS: T-9 days (or less!) until PATCH :)

