[poppler] Thoughts on OO-ifying and modularising the pdftohtml utility?

Alec Taylor alec.taylor6 at gmail.com
Sat Oct 29 01:52:55 PDT 2011


The main use case is making it easier for developers to integrate
pdftohtml features into their own code, along with general improvements
to the pdftohtml codebase.

It could also serve as a "killer-app"-type reference for making the xpdf
and poppler libraries more modular.

On Sat, Oct 29, 2011 at 6:00 PM, Josh Richardson <jric at chegg.com> wrote:
> I think you may be mixing metaphors.  Poppler is the library to
> query/write PDFs.  pdftohtml is a utility built on that library, using
> specialized writers.
>
> I personally think that the Poppler code could use a lot of cleanup and
> documentation, however I think we're somewhat limited because we still
> track the sizeable development efforts of the xpdf community.  I'm not
> sure about the "modularization" you're looking for.  What are the use
> cases that you would like to improve?
>
> --josh
>
> On 10/28/11 11:37 PM, "Alec Taylor" <alec.taylor6 at gmail.com> wrote:
>
>>The current pdftohtml.cc is badly in need of modularity. I'm
>>considering making the utility more modular, to the point of making it
>>object-oriented.
>>
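>>int main(int argc, char *argv[]) {
>>           PDFtoHTML *foo = new PDFtoHTML;
>>           // setArgs() is the proposed argument-parsing member
>>           if (!foo->setArgs(argDesc, &argc, argv)) {
>>                   delete foo;
>>                   return 1;  // non-zero exit status signals failure
>>           } else {
>>                   delete foo;
>>                   return 0;  // zero exit status signals success
>>           }
>>}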
>>
>>Public member functions could include:
>>GBool PDFtoHTML::toXML(char passes);
>>GBool PDFtoHTML::toHTML(GBool images, GBool complex);
>>GBool PDFtoHTML::PDFinfo();
>>GBool PDFtoHTML::removeRestrictions();
>>
>>etc.
>>
>>What do you think? - Worth doing? - Useful? - Too far?!
>>
>
>

