[poppler] commit? bounding box html in pdftotext

Albert Astals Cid aacid at kde.org
Wed Sep 22 11:21:35 PDT 2010


On Wednesday, 22 September 2010, Kenneth Berland wrote:
> I have rewritten the replace function with standard C.

This is where you'll hate me, but for the past few weeks we have accepted
std:: code when it *obviously* adds value over the existing code.

So could you please resend your patch to the mailing list?

Sorry, I completely forgot to tell you.

Albert
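
For illustration, a minimal sketch of what a std::-based replace could look
like. It is not the patch discussed in this thread and not part of GooString:
replaceAll and escapeHtml are hypothetical names, and the escaping use case is
taken from the quoted discussion below about keeping HTML reserved characters
out of the -bbox output.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of poppler): replace every occurrence of
// `from` in `text` with `to`, scanning left to right.
std::string replaceAll(std::string text, const std::string &from,
                       const std::string &to) {
    if (from.empty()) {
        return text;  // nothing to search for; also avoids an infinite loop
    }
    std::size_t pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        pos += to.size();  // skip the inserted text so it is not rescanned
    }
    return text;
}

// Hypothetical helper: escape the characters reserved in HTML/XML text.
// '&' is handled first so the entities added afterwards are not re-escaped.
std::string escapeHtml(const std::string &word) {
    std::string out = replaceAll(word, "&", "&amp;");
    out = replaceAll(out, "<", "&lt;");
    out = replaceAll(out, ">", "&gt;");
    out = replaceAll(out, "\"", "&quot;");
    return out;
}
```

Under these assumptions, each word would be passed through escapeHtml before
being written between the <word> tags. Advancing pos by to.size() is what keeps
a replacement such as "a" -> "aa" from looping forever.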

> 
> -KB
> 
> On Sun, 11 Jul 2010, Albert Astals Cid wrote:
> > On Tuesday, 6 July 2010, Kenneth Berland wrote:
> >> Can I use std::string within any GooString methods I write (e.g.
> >> replace) or am I limited to the C Standard library (i.e. string.h)?
> > 
> > No std:: usage anywhere in poppler (except in the cpp frontend).
> > 
> > Albert
> > 
> > On Mon, 5 Jul 2010, Kenneth Berland wrote:
> >> Can I use std::string within any GooString methods I write (e.g.
> >> replace) or am I limited to the C Standard library (i.e. string.h)?
> >> 
> >> -KB
> >> 
> >> On Tue, 8 Jun 2010, Albert Astals Cid wrote:
> >>> On Tuesday, 8 June 2010, you wrote:
> >>>> Does GooString have a replace() method?  I could not find one.  Does
> >>>> this mean I should write one?
> >>> 
> >>> Yes, you'll have to write one or get the char * from the GooString and
> >>> use the C-string ones.
> >>> 
> >>> Albert
> >>> 
> >>>> -KB
> >>>> 
> >>>> On Sun, 30 May 2010, Albert Astals Cid wrote:
> >>>> > On Sunday, 30 May 2010, Kenneth Berland wrote:
> >>>> >> 1)  Since I sent my last diff, I've:
> >>>> >>     a) added some string processing to make sure no HTML reserved
> >>>> >>        characters are placed into the output.  I process each word.
> >>>> >>     b) altered the html a bit so that XML parsers can deal with it.
> >>>> >>        I've put in a title tag or an empty title tag and added end
> >>>> >>        tags to the meta tags.
> >>>> >> 
> >>>> >> 2)  Addressing your concerns:
> >>>> >>     a) I've removed the initialization of stdout.
> >>>> >>     b) I close f now and reopen it.  This also removes the warning.
> >>>> >>     c) If a user is running with the -bbox option, they want word
> >>>> >>        bounding boxes.  If there are no words, I think a line to
> >>>> >>        stderr is appropriate.
> >>>> >> 
> >>>> > Cool, though we try not to use the std (yeah, it sucks, I know). Can you
> >>>> > use either GooString or char *?
> >>>> > 
> >>>> > Thanks,
> >>>> > 
> >>>> > Albert
> >>>> > 
> >>>> >> -KB
> >>>> >> 
> >>>> >> On Wed, 26 May 2010, Albert Astals Cid wrote:
> >>>> >>> On Wednesday, 26 May 2010, Kenneth Berland wrote:
> >>>> >>>> I get a compiler warning without it.
> >>>> >>>> 
> >>>> >>>> pdftotext.cc: In function ‘int main(int, char**)’:
> >>>> >>>> pdftotext.cc:164: warning: ‘f’ may be used uninitialized in this
> >>>> >>>> function
> >>>> >>>> 
> >>>> >>> That change will not get accepted, sorry. Initializing f to stdout is
> >>>> >>> not a solution.
> >>>> >>> 
> >>>> >>> Also, I do not like the fact that you do not close f if you are writing
> >>>> >>> the bbox. Can't you just open it again like the code already does?
> >>>> >>> 
> >>>> >>> Also, I do not understand why the code considers a page having no text
> >>>> >>> an error.
> >>>> >>> 
> >>>> >>> Albert
> >>>> >>> 
> >>>> >>>> -KB
> >>>> >>>> 
> >>>> >>>> On Wed, 26 May 2010, Albert Astals Cid wrote:
> >>>> >>>>> On Sunday, 9 May 2010, Kenneth Berland wrote:
> >>>> >>>>>> List,
> >>>> >>>>>> 
> >>>> >>>>>> I've attached a small addition to pdftotext that outputs bounding
> >>>> >>>>>> box information to html like this:
> >>>> >>>>>> 
> >>>> >>>>>> <doc>
> >>>> >>>>>>   <page width="612.000000" height="792.000000">
> >>>> >>>>>>     <word xMin="56.800000" yMin="57.208000" xMax="75.412000"
> >>>> >>>>>>           yMax="70.492000">The</word>
> >>>> >>>>>>   </page>
> >>>> >>>>>> </doc>
> >>>> >>>>>> 
> >>>> >>>>>> I had a need, maybe others will too.
> >>>> >>>>>> -KB
> >>>> >>>>> 
> >>>> >>>>> Why is this change necessary?
> >>>> >>>>> -  FILE *f;
> >>>> >>>>> +  FILE *f = stdout;
> >>>> >>>>> 
> >>>> >>>>> Albert
> >>>> >>> 
> >>>> >>> _______________________________________________
> >>>> >>> poppler mailing list
> >>>> >>> poppler at lists.freedesktop.org
> >>>> >>> http://lists.freedesktop.org/mailman/listinfo/poppler

