[poppler] Reverse-engineering an XML file generated by pdftohtml -xml back into the PDF?
alec.taylor6 at gmail.com
Mon Nov 14 22:42:28 PST 2011
How would I go about reverse-engineering an XML file generated by
pdftohtml -xml back into the [same] PDF?
I have been spending a long time extending the XML output to include
proper page numbers and header/footer detection.
It would be extremely useful if I could push the additional logical
structure information and page numbers back into the PDF the XML was
generated from.
How would I go about doing this?
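For context, here is a minimal sketch of the kind of data involved. It
assumes the usual pdftohtml -xml layout of <page> elements containing
<text> elements with top/left attributes; the sample document and the
text_boxes() helper are hypothetical and only illustrate the
information that would need to be mapped back onto the PDF.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the shape pdftohtml -xml typically emits
# (assumed layout, not taken from a real file).
SAMPLE = """<pdf2xml>
<page number="1" position="absolute" top="0" left="0" height="1188" width="918">
<text top="40" left="100" width="200" height="12" font="0">Running header</text>
<text top="500" left="100" width="300" height="12" font="0">Body text</text>
<text top="1150" left="450" width="10" height="12" font="0">1</text>
</page>
</pdf2xml>"""


def text_boxes(xml_string):
    """Return (page_number, top, left, text) for every <text> element."""
    root = ET.fromstring(xml_string)
    boxes = []
    for page in root.iter("page"):
        page_no = int(page.get("number"))
        for node in page.iter("text"):
            # itertext() also collects text nested in inline tags such as <b> or <i>.
            content = "".join(node.itertext())
            boxes.append((page_no, int(node.get("top")), int(node.get("left")), content))
    return boxes


boxes = text_boxes(SAMPLE)
for box in boxes:
    print(box)
```

Each tuple carries the page number and position of one text run, which
is the information a header/footer or page-number annotation would
have to be keyed on when writing back.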
Thanks for all suggestions,
PS: T-9 days (or less!) until PATCH :)