[poppler] commit? bounding box html in pdftotext
Albert Astals Cid
aacid at kde.org
Sun Jul 11 10:49:24 PDT 2010
On Tuesday, 6 July 2010, Kenneth Berland wrote:
> Can I use std::string within any GooString methods I write (e.g. replace)
> or am I limited to the C Standard library (i.e. string.h)?
No std:: usage anywhere in poppler (except in the cpp frontend).
Albert
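
For illustration, here is a minimal sketch of a replace helper that uses only
the C standard library (string.h, stdlib.h) and no std::. The name replaceAll
and its signature are hypothetical, not an existing poppler or GooString API.
It works on the char * obtained from a GooString and returns a new buffer.

```cpp
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper (not part of poppler): returns a newly malloc'ed copy
// of src with every occurrence of from replaced by to. The caller releases
// the result with free(). Returns NULL if the allocation fails.
char *replaceAll(const char *src, const char *from, const char *to)
{
    assert(src != NULL && from != NULL && to != NULL);

    size_t fromLen = strlen(from);
    size_t toLen = strlen(to);
    assert(fromLen > 0); // an empty search pattern would never advance

    // First pass: count non-overlapping matches to size the output buffer.
    size_t count = 0;
    for (const char *p = strstr(src, from); p != NULL;
         p = strstr(p + fromLen, from)) {
        ++count;
    }

    size_t outLen = strlen(src) + count * toLen - count * fromLen;
    char *out = (char *)malloc(outLen + 1);
    if (out == NULL) {
        return NULL;
    }

    // Second pass: copy the text between matches, then the replacement.
    char *dst = out;
    const char *cur = src;
    const char *hit;
    while ((hit = strstr(cur, from)) != NULL) {
        size_t n = (size_t)(hit - cur);
        memcpy(dst, cur, n);
        dst += n;
        memcpy(dst, to, toLen);
        dst += toLen;
        cur = hit + fromLen;
    }
    strcpy(dst, cur); // copy the tail, including the terminating NUL
    return out;
}
```

When a helper like this is used to escape HTML reserved characters in each
word, "&" has to be replaced first; otherwise the ampersands introduced by
the other replacements would be escaped a second time.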
>
> -KB
>
> On Tue, 8 Jun 2010, Albert Astals Cid wrote:
> > On Tuesday, 8 June 2010, you wrote:
> >> Does GooString have a replace() method? I could not find one. Does
> >> this mean I should write one?
> >
> > Yes, you'll have to write one, or get the char * from the GooString and
> > use the C string functions.
> >
> > Albert
> >
> >> -KB
> >>
> >> On Sun, 30 May 2010, Albert Astals Cid wrote:
> >> > On Sunday, 30 May 2010, Kenneth Berland wrote:
> >> >> 1) Since I sent my last diff, I've:
> >> >>    a) added some string processing to make sure no HTML reserved
> >> >>       characters are placed into the output. I process each word.
> >> >>    b) altered the HTML a bit so that XML parsers can deal with it.
> >> >>       I've put in a title tag or an empty title tag and added end
> >> >>       tags to the meta tags.
> >> >>
> >> >> 2) Addressing your concerns:
> >> >>    a) I've removed the initialization of stdout.
> >> >>    b) I close f now and reopen it. This also removes the warning.
> >> >>    c) If a user is running with the -bbox option, they want word
> >> >>       bounding boxes. If there are no words, I think a line to
> >> >>       stderr is appropriate.
> >> >
> >> > Cool, though we try not to use std:: (yeah, it sucks, I know). Can
> >> > you use either GooString or char * instead?
> >> >
> >> >
> >> > Thanks,
> >> >
> >> > Albert
> >> >
> >> >> -KB
> >> >>
> >> >> On Wed, 26 May 2010, Albert Astals Cid wrote:
> >> >>> On Wednesday, 26 May 2010, Kenneth Berland wrote:
> >> >>>> I get a compiler warning without it.
> >> >>>>
> >> >>>> pdftotext.cc: In function ‘int main(int, char**)’:
> >> >>>> pdftotext.cc:164: warning: ‘f’ may be used uninitialized in this
> >> >>>> function
> >> >>>
> >> >>> That change will not get accepted, sorry, initializing f to stdout
> >> >>> is not a solution.
> >> >>>
> >> >>> Also, I do not like the fact that you do not close f if you are
> >> >>> writing the bbox. Can't you just open it again like the code
> >> >>> already does?
> >> >>>
> >> >>> Also, I do not understand why the code considers a page having no
> >> >>> text an error.
> >> >>>
> >> >>> Albert
> >> >>>
> >> >>>> -KB
> >> >>>>
> >> >>>> On Wed, 26 May 2010, Albert Astals Cid wrote:
> >> >>>>> On Sunday, 9 May 2010, Kenneth Berland wrote:
> >> >>>>>> List,
> >> >>>>>>
> >> >>>>>> I've attached a small addition to pdftotext that outputs bounding
> >> >>>>>> box information to html like this:
> >> >>>>>>
> >> >>>>>> <doc>
> >> >>>>>>   <page width="612.000000" height="792.000000">
> >> >>>>>>     <word xMin="56.800000" yMin="57.208000" xMax="75.412000"
> >> >>>>>>           yMax="70.492000">The</word>
> >> >>>>>>   </page>
> >> >>>>>> </doc>
> >> >>>>>>
> >> >>>>>> I had a need, maybe others will too.
> >> >>>>>>
> >> >>>>>> -KB
> >> >>>>>
> >> >>>>> Why is this change necessary?
> >> >>>>>
> >> >>>>> - FILE *f;
> >> >>>>> + FILE *f = stdout;
> >> >>>>>
> >> >>>>> Albert
> >> >>>
> >> >>> _______________________________________________
> >> >>> poppler mailing list
> >> >>> poppler at lists.freedesktop.org
> >> >>> http://lists.freedesktop.org/mailman/listinfo/poppler