[poppler] commit? bounding box html in pdftotext

Kenneth Berland ken at hero.com
Wed Sep 22 10:12:08 PDT 2010


I have rewritten the replace function with standard C.
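
For readers without the patch at hand, here is a minimal sketch of what a replace helper in plain C (string.h only, no std::string) could look like. It is not the function from the patch: the name str_replace_all, its signature, and the choice to return a freshly malloc'd buffer are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, NOT the code from the patch.
 * Returns a newly malloc'd copy of s in which every non-overlapping
 * occurrence of `from` is replaced by `to`. The caller must free() it.
 * Returns NULL if `from` is empty or if allocation fails. */
char *str_replace_all(const char *s, const char *from, const char *to)
{
    assert(s != NULL && from != NULL && to != NULL);

    size_t from_len = strlen(from);
    size_t to_len = strlen(to);
    if (from_len == 0)
        return NULL;

    /* First pass: count matches so the result can be sized exactly. */
    size_t count = 0;
    const char *p;
    for (p = strstr(s, from); p != NULL; p = strstr(p + from_len, from))
        count++;

    size_t out_len = strlen(s) - count * from_len + count * to_len;
    char *out = malloc(out_len + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: copy the text between matches, then the replacement. */
    char *dst = out;
    const char *src = s;
    const char *hit;
    while ((hit = strstr(src, from)) != NULL) {
        size_t prefix = (size_t)(hit - src);
        memcpy(dst, src, prefix);
        dst += prefix;
        memcpy(dst, to, to_len);
        dst += to_len;
        src = hit + from_len;
    }
    strcpy(dst, src); /* remaining tail, including the terminating NUL */
    return out;
}
```

When a helper like this is used to escape HTML reserved characters, "&" has to be replaced first; otherwise the ampersands introduced by "&lt;" and "&gt;" would be escaped a second time.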

-KB


On Sun, 11 Jul 2010, Albert Astals Cid wrote:

> On Tuesday, 6 July 2010, Kenneth Berland wrote:
>
>> Can I use std::string within any GooString methods I write (e.g. replace) 
>> or am I limited to the C Standard library (i.e. string.h)?
>
> No std:: usage anywhere in poppler (except in the cpp frontend).
>
> Albert
>
>
> On Mon, 5 Jul 2010, Kenneth Berland wrote:
>
>> Can I use std::string within any GooString methods I write (e.g. replace) 
>> or am I limited to the C Standard library (i.e. string.h)?
>> 
>> -KB
>> 
>> 
>> On Tue, 8 Jun 2010, Albert Astals Cid wrote:
>> 
>>> On Tuesday, 8 June 2010, you wrote:
>>>> Does GooString have a replace() method?  I could not find one.  Does this
>>>> mean I should write one?
>>> 
>>> Yes, you'll have to write one or get the char * from the GooString and
>>> use c-string ones.
>>> 
>>> Albert
>>> 
>>>> 
>>>> -KB
>>>> 
>>>> On Sun, 30 May 2010, Albert Astals Cid wrote:
>>>> > On Sunday, 30 May 2010, Kenneth Berland wrote:
>>>> >> 1)  Since I sent my last diff, I've:
>>>> >>     a) added some string processing to make sure no HTML reserved
>>>> >>        characters are placed into the output.  I process each word.
>>>> >>     b) altered the html a bit so that XML parsers can deal with it.
>>>> >>        I've put in a title tag or an empty title tag and added end
>>>> >>        tags to the meta tags.
>>>> >>
>>>> >> 2)  Addressing your concerns:
>>>> >>     a) I've removed the initialization of stdout.
>>>> >>     b) I close f now and reopen it.  This also removes the warning.
>>>> >>     c) If a user is running with the -bbox option, they want word
>>>> >>        bounding boxes.  If there are no words, I think a line to
>>>> >>        stderr is appropriate.
>>>> >
>>>> > Cool, though we try not to use the std (yeah it sucks i know), can you
>>>> > either use GooString or char *?
>>>> >
>>>> > Thanks,
>>>> >   Albert
>>>> >
>>>> >> -KB
>>>> >>
>>>> >> On Wed, 26 May 2010, Albert Astals Cid wrote:
>>>> >>> On Wednesday, 26 May 2010, Kenneth Berland wrote:
>>>> >>>> I get a compiler warning without it.
>>>> >>>>
>>>> >>>> pdftotext.cc: In function ‘int main(int, char**)’:
>>>> >>>> pdftotext.cc:164: warning: ‘f’ may be used uninitialized in this
>>>> >>>> function
>>>> >>>
>>>> >>> That change will not get accepted, sorry, initializing f to stdout is
>>>> >>> not a solution.
>>>> >>>
>>>> >>> Also i do not like the fact that you do not close f if you are writing
>>>> >>> the bbox. Can't you just open it again like the code already does?
>>>> >>>
>>>> >>> Also i do not understand why the code considers a page having no text
>>>> >>> an error.
>>>> >>>
>>>> >>> Albert
>>>> >>>
>>>> >>>> -KB
>>>> >>>>
>>>> >>>> On Wed, 26 May 2010, Albert Astals Cid wrote:
>>>> >>>>> On Sunday, 9 May 2010, Kenneth Berland wrote:
>>>> >>>>>> List,
>>>> >>>>>>
>>>> >>>>>> I've attached a small addition to pdftotext that outputs bounding
>>>> >>>>>> box information to html like this:
>>>> >>>>>>
>>>> >>>>>> <doc>
>>>> >>>>>>    <page width="612.000000" height="792.000000">
>>>> >>>>>>      <word xMin="56.800000" yMin="57.208000" xMax="75.412000"
>>>> >>>>>>            yMax="70.492000">The</word>
>>>> >>>>>>    </page>
>>>> >>>>>> </doc>
>>>> >>>>>>
>>>> >>>>>> I had a need, maybe others will too.
>>>> >>>>>>
>>>> >>>>>> -KB
>>>> >>>>>
>>>> >>>>> Why is this change necessary?
>>>> >>>>>
>>>> >>>>> -  FILE *f;
>>>> >>>>> +  FILE *f = stdout;
>>>> >>>>>
>>>> >>>>> Albert
>>>> >>>
>>>> >>> _______________________________________________
>>>> >>> poppler mailing list
>>>> >>> poppler at lists.freedesktop.org
>>>> >>> http://lists.freedesktop.org/mailman/listinfo/poppler

