[poppler] Thoughts on OO-ifying and modularising the pdftohtml utility?

Josh Richardson jric at chegg.com
Sat Oct 29 00:00:48 PDT 2011


I think you may be conflating two things.  Poppler is the library for
querying and writing PDFs.  pdftohtml is a utility built on that
library, with its own specialized writers.

I personally think that the Poppler code could use a lot of cleanup and
documentation; however, I think we're somewhat limited because we still
track the sizeable development efforts of the xpdf community.  I'm not
sure what kind of "modularization" you're looking for.  What are the use
cases that you would like to improve?

--josh

On 10/28/11 11:37 PM, "Alec Taylor" <alec.taylor6 at gmail.com> wrote:

>The current pdftohtml.cc badly needs to be more modular. I'm
>considering restructuring the utility, to the point of making it
>object-oriented.
>
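>int main(int argc, char *argv[]) {
>	    PDFtoHTML *foo = new PDFtoHTML;
>	    // setArgs() parses the command line and reports success
>	    if (!foo->setArgs(argDesc, &argc, argv)) {
>	    	    delete foo;
>	    	    return 1;  // non-zero exit status: arguments rejected
>	    } else {
>	    	    delete foo;
>	    	    return 0;  // zero exit status: success
>	    }
>}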
>
>Public member functions could include:
>GBool PDFtoHTML::toXML(char passes);
>GBool PDFtoHTML::toHTML(GBool images, GBool complex);
>GBool PDFtoHTML::PDFinfo();
>GBool PDFtoHTML::removeRestrictions();
>
>etc.
>
>What do you think? - Worth doing? - Useful? - Too far?!
>


