[poppler] commit? bounding box html in pdftotext

Albert Astals Cid aacid at kde.org
Sat Sep 25 05:35:32 PDT 2010


On Saturday, 25 September 2010, Kenneth Berland wrote:
> Is this on-track for being committed?

Yes, it is. It's in my queue of poppler-related things to do and will be
committed when I find time for it.

Thanks for the patch and sorry for the delay.

Albert

> 
> (sorry to bug you)
> 
> -KB
> 
> On Wed, 22 Sep 2010, Kenneth Berland wrote:
> > Very funny.
> > 
> > The old diff, using std:: is at:
> > 
> > http://lists.freedesktop.org/archives/poppler/attachments/20100530/89825275/attachment.txt
> > 
> > You can commit either today's diff or the 2010-05-30 (std::) diff.  I
> > think the std:: version is less likely to have pointer-related bugs.
> > 
> > -KB
> > 
> > On Wed, 22 Sep 2010, Albert Astals Cid wrote:
> >> On Wednesday, 22 September 2010, Kenneth Berland wrote:
> >>> I have rewritten the replace function with standard C.
> >> 
> >> This is where you'll hate me, but as of a few weeks ago we accept std::
> >> code when it *obviously* adds value over the existing code.
> >> 
> >> So could you please send your std:: patch to the mailing list again?
> >> 
> >> Sorry, totally forgot to tell you.
> >> 
> >> Albert
> >> 
> >>> -KB
> >>> 
> >>> On Sun, 11 Jul 2010, Albert Astals Cid wrote:
> >>>> On Tuesday, 6 July 2010, Kenneth Berland wrote:
> >>>>> Can I use std::string within any GooString methods I write (e.g.
> >>>>> replace) or am I limited to the C Standard library (i.e. string.h)?
> >>>> 
> >>>> No std:: usage anywhere in poppler (except in the cpp frontend).
> >>>> 
> >>>> Albert
> >>>> 
> >>>> On Mon, 5 Jul 2010, Kenneth Berland wrote:
> >>>>> Can I use std::string within any GooString methods I write (e.g.
> >>>>> replace) or am I limited to the C Standard library (i.e. string.h)?
> >>>>> 
> >>>>> -KB
> >>>>> 
> >>>>> On Tue, 8 Jun 2010, Albert Astals Cid wrote:
> >>>>>> On Tuesday, 8 June 2010, you wrote:
> >>>>>>> Does GooString have a replace() method?  I could not find one. 
> >>>>>>> Does this mean I should write one?
> >>>>>> 
> >>>>>> Yes, you'll have to write one or get the char * from the GooString
> >>>>>> and use the C-string ones.
> >>>>>> 
> >>>>>> Albert
> >>>>>> 
> >>>>>>> -KB
> >>>>>>> 
> >>>>>>> On Sun, 30 May 2010, Albert Astals Cid wrote:
> >>>>>>>> On Sunday, 30 May 2010, Kenneth Berland wrote:
> >>>>>>>>> 1)  Since I sent my last diff, I've:
> >>>>>>>>>     a) added some string processing to make sure no HTML reserved
> >>>>>>>>>        characters are placed into the output.  I process each word.
> >>>>>>>>>     b) altered the html a bit so that XML parsers can deal with
> >>>>>>>>>        it.  I've put in a title tag or an empty title tag and
> >>>>>>>>>        added end tags to the meta tags.
> >>>>>>>>> 
> >>>>>>>>> 2)  Addressing your concerns:
> >>>>>>>>>     a) I've removed the initialization of stdout.
> >>>>>>>>>     b) I close f now and reopen it.  This also removes the
> >>>>>>>>>        warning.
> >>>>>>>>>     c) If a user is running with the -bbox option, they want word
> >>>>>>>>>        bounding boxes.  If there are no words, I think a line to
> >>>>>>>>>        stderr is appropriate.
> >>>>>>>> 
> >>>>>>>> Cool, though we try not to use the std (yeah it sucks i know), can
> >>>>>>>> you either use GooString or char *?
> >>>>>>>> 
> >>>>>>>> Thanks,
> >>>>>>>>  Albert
> >>>>>>>> 
> >>>>>>>>> -KB
> >>>>>>>>> 
> >>>>>>>>> On Wed, 26 May 2010, Albert Astals Cid wrote:
> >>>>>>>>>> On Wednesday, 26 May 2010, Kenneth Berland wrote:
> >>>>>>>>>>> I get a compiler warning without it.
> >>>>>>>>>>> 
> >>>>>>>>>>> pdftotext.cc: In function ‘int main(int, char**)’:
> >>>>>>>>>>> pdftotext.cc:164: warning: ‘f’ may be used uninitialized in
> >>>>>>>>>>> this function
> >>>>>>>>>> 
> >>>>>>>>>> That change will not get accepted, sorry, initializing f to
> >>>>>>>>>> stdout is not a solution.
> >>>>>>>>>> 
> >>>>>>>>>> Also i do not like the fact that you do not close f if you are
> >>>>>>>>>> writing the bbox? Can't you just open it again like the code
> >>>>>>>>>> already does?
> >>>>>>>>>> 
> >>>>>>>>>> Also i do not understand why the code considers a page having no
> >>>>>>>>>> text an error.
> >>>>>>>>>> 
> >>>>>>>>>> Albert
> >>>>>>>>>> 
> >>>>>>>>>>> -KB
> >>>>>>>>>>> 
> >>>>>>>>>>> On Wed, 26 May 2010, Albert Astals Cid wrote:
> >>>>>>>>>>>> On Sunday, 9 May 2010, Kenneth Berland wrote:
> >>>>>>>>>>>>> List,
> >>>>>>>>>>>>> 
> >>>>>>>>>>>>> I've attached a small addition to pdftotext that outputs
> >>>>>>>>>>>>> bounding box information to html like this:
> >>>>>>>>>>>>> 
> >>>>>>>>>>>>> <doc>
> >>>>>>>>>>>>>   <page width="612.000000" height="792.000000"/>
> >>>>>>>>>>>>>     <word xMin="56.800000" yMin="57.208000" xMax="75.412000"
> >>>>>>>>>>>>>       yMax="70.492000">The</word>
> >>>>>>>>>>>>>   </page>
> >>>>>>>>>>>>> </doc>
> >>>>>>>>>>>>> 
> >>>>>>>>>>>>> I had a need, maybe others will too.
> >>>>>>>>>>>>> -KB
> >>>>>>>>>>>> 
> >>>>>>>>>>>> Why is this change necessary?
> >>>>>>>>>>>> -  FILE *f;
> >>>>>>>>>>>> +  FILE *f = stdout;
> >>>>>>>>>>>> 
> >>>>>>>>>>>> Albert
> >>>>>>>>>>>> 
> >>>>>>>>>>>>> _______________________________________________
> >>>>>>>>>>>>> poppler mailing list
> >>>>>>>>>>>>> poppler at lists.freedesktop.org
> >>>>>>>>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
> >>>>>> 
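
For anyone following the thread without the attachment: the std:: approach
to the replace helper and the per-word escaping discussed above could look
roughly like the sketch below. This is not the submitted patch. The names
replaceAll, escapeWord and formatWord are hypothetical, the set of escaped
characters is an assumption, and the attribute layout simply mirrors the
sample <word> element quoted in the thread.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not from the patch): replace every occurrence of
// `from` in `s` with `to` and return the result.
static std::string replaceAll(std::string s, const std::string &from,
                              const std::string &to) {
    if (from.empty()) {
        return s;  // nothing to search for
    }
    std::string::size_type pos = 0;
    while ((pos = s.find(from, pos)) != std::string::npos) {
        s.replace(pos, from.size(), to);
        pos += to.size();  // continue after the inserted text, never inside it
    }
    return s;
}

// Hypothetical helper: escape HTML-reserved characters in a single word.
// Which characters the real patch escapes is an assumption here.
static std::string escapeWord(const std::string &word) {
    std::string out = replaceAll(word, "&", "&amp;");  // '&' must go first
    out = replaceAll(out, "<", "&lt;");
    out = replaceAll(out, ">", "&gt;");
    out = replaceAll(out, "\"", "&quot;");
    return out;
}

// Hypothetical helper: build one <word> element in the shape of the sample
// output quoted in the thread.
static std::string formatWord(double xMin, double yMin, double xMax,
                              double yMax, const std::string &word) {
    char attrs[256];
    std::snprintf(attrs, sizeof(attrs),
                  "<word xMin=\"%f\" yMin=\"%f\" xMax=\"%f\" yMax=\"%f\">",
                  xMin, yMin, xMax, yMax);
    return std::string(attrs) + escapeWord(word) + "</word>";
}
```

Working on a std::string copy avoids manual buffer sizing, which is the
pointer-related risk mentioned earlier in the thread. A caller would pass
each word's bounding box and text to formatWord and write the returned
string to the output file.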

