[poppler] hello and a question about HtmlOutputDev
Albert Astals Cid
aacid at kde.org
Sat Jun 10 09:29:50 PDT 2006
On Friday 09 June 2006 22:36, Jauco Noordzij wrote:
> Hello everybody!
Hi
> I just subscribed to this list so for starters I'd like to give a little
> introduction:
> I'm a 23-year-old student from Holland. I'm working on a PDF input plugin
> for AbiWord and have been looking at your library because it seemed to fit
> the bill perfectly. I have been working with it for a few days and I like
> it :)
>
> Because I need to convert a PDF to another rich format, I need something
> that returns richer information than the TextOutputDev. Your current
> HtmlOutputDev seems to work, but the headers are not public and it seems to
> have been removed completely in CVS HEAD. In my local codebase I made a
> public version and compiled the plugin against it. But before I continue
> I'd like to know your stance on the different outputdevs. Did you remove the
> HtmlOutputDev because it was unmaintained and buggy? Or because you hold
> the opinion that PDF shouldn't be converted? Or for some other reason?
Hmm, HtmlOutputDev is still there AFAIK.
>
> If you would allow me to work on it, this is what I propose:
> * I build an XMLOutputDev based on the HTML one.
> * This is a public outputdev (like the text one).
> * It creates an XML representation of the PDF (sort of like the current
> pdftohtml -xml).
> * If you want, I'll add a stylesheet to convert the XML to HTML.
>
> I think that a conversion to XML can be valuable, because the format is
> much clearer to work with, can be transformed using a variety of languages
> (like XSLT) and, most importantly, because it would allow external people to
> process PDF information without having to write a custom outputdev (which,
> I heard, you don't like). But you will have thought of that yourselves as
> well, so I am probably missing something :)
As Brad suggested, an ODF outputdev would rule :-)
Albert