[poppler] commit? bounding box html in pdftotext
Kenneth Berland
ken at hero.com
Wed Sep 22 11:18:57 PDT 2010
Yikes,
I have rewritten the replace function with standard C (and attached it
this time).
-KB
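
For readers without the attachment, here is a minimal sketch of what a replace helper built only on string.h could look like. It is not the code in diff.txt: the name str_replace_all, its signature, and the choice to return a freshly allocated buffer are all assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from diff.txt): returns a newly malloc'ed copy
 * of `src` with every occurrence of `from` replaced by `to`.
 * Returns NULL if `from` is empty or if allocation fails.
 * The caller must free() the result. */
char *str_replace_all(const char *src, const char *from, const char *to)
{
    size_t from_len = strlen(from);
    size_t to_len = strlen(to);
    size_t count = 0;
    const char *p;
    char *out, *w;

    if (from_len == 0)
        return NULL; /* an empty pattern would match at every position */

    /* First pass: count matches so the result can be sized exactly. */
    for (p = src; (p = strstr(p, from)) != NULL; p += from_len)
        count++;

    out = malloc(strlen(src) + count * to_len - count * from_len + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: copy the text before each match, then the replacement. */
    w = out;
    while ((p = strstr(src, from)) != NULL) {
        size_t prefix = (size_t)(p - src);
        memcpy(w, src, prefix);
        w += prefix;
        memcpy(w, to, to_len);
        w += to_len;
        src = p + from_len;
    }
    strcpy(w, src); /* remaining tail, including the terminating NUL */
    return out;
}
```

A GooString method could presumably wrap something like this around the char * it already holds, but how the real patch does it is in the attachment, not here.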
On Wed, 22 Sep 2010, Kenneth Berland wrote:
> I have rewritten the replace function with standard C.
>
> -KB
>
>
> On Sun, 11 Jul 2010, Albert Astals Cid wrote:
>
>> On Tuesday, 6 July 2010, Kenneth Berland wrote:
>>
>>> Can I use std::string within any GooString methods I write (e.g. replace)
>>> or am I limited to the C Standard library (i.e. string.h)?
>>
>> No std:: usage anywhere in poppler (except in the cpp frontend).
>>
>> Albert
>>
>>
>> On Mon, 5 Jul 2010, Kenneth Berland wrote:
>>
>>> Can I use std::string within any GooString methods I write (e.g. replace)
>>> or am I limited to the C Standard library (i.e. string.h)?
>>>
>>> -KB
>>>
>>>
>>> On Tue, 8 Jun 2010, Albert Astals Cid wrote:
>>>
>>>> On Tuesday, 8 June 2010, you wrote:
>>>>> Does GooString have a replace() method? I could not find one. Does
>>>>> this mean I should write one?
>>>>
>>>> Yes, you'll have to write one or get the char * from the GooString and
>>>> use the C-string ones.
>>>>
>>>> Albert
>>>>
>>>>>
>>>>> -KB
>>>>>
>>>>> On Sun, 30 May 2010, Albert Astals Cid wrote:
>>>>> > On Sunday, 30 May 2010, Kenneth Berland wrote:
>>>>> >> 1) Since I sent my last diff, I've:
>>>>> >> a) added some string processing to make sure no HTML reserved
>>>>> >> characters are placed into the output. I process each word.
>>>>> >> b) altered the HTML a bit so that XML parsers can deal with it.
>>>>> >> I've put in a title tag or an empty title tag and added end tags
>>>>> >> to the meta tags.
>>>>> >>
>>>>> >> 2) Addressing your concerns:
>>>>> >> a) I've removed the initialization of f to stdout.
>>>>> >> b) I close f now and reopen it. This also removes the warning.
>>>>> >> c) If a user is running with the -bbox option, they want word
>>>>> >> bounding boxes. If there are no words, I think a line to stderr
>>>>> >> is appropriate.
>>>>> >
>>>>> > Cool, though we try not to use the std (yeah, it sucks, I know). Can
>>>>> > you either use GooString or char *?
>>>>> >
>>>>> > Thanks,
>>>>> > Albert
>>>>> >
>>>>> >> -KB
>>>>> >>
>>>>> >> On Wed, 26 May 2010, Albert Astals Cid wrote:
>>>>> >>> On Wednesday, 26 May 2010, Kenneth Berland wrote:
>>>>> >>>> I get a compiler warning without it.
>>>>> >>>>
>>>>> >>>> pdftotext.cc: In function ‘int main(int, char**)’:
>>>>> >>>> pdftotext.cc:164: warning: ‘f’ may be used uninitialized in this
>>>>> >>>> function
>>>>> >>>
>>>>> >>> That change will not get accepted, sorry. Initializing f to stdout
>>>>> >>> is not a solution.
>>>>> >>>
>>>>> >>> Also, I do not like the fact that you do not close f if you are
>>>>> >>> writing the bbox. Can't you just open it again like the code
>>>>> >>> already does?
>>>>> >>>
>>>>> >>> Also, I do not understand why the code considers a page having no
>>>>> >>> text an error.
>>>>> >>>
>>>>> >>> Albert
>>>>> >>>
>>>>> >>>> -KB
>>>>> >>>>
>>>>> >>>> On Wed, 26 May 2010, Albert Astals Cid wrote:
>>>>> >>>>> On Sunday, 9 May 2010, Kenneth Berland wrote:
>>>>> >>>>>> List,
>>>>> >>>>>>
>>>>> >>>>>> I've attached a small addition to pdftotext that outputs
>>>>> >>>>>> bounding box information to html like this:
>>>>> >>>>>>
>>>>> >>>>>> <doc>
>>>>> >>>>>> <page width="612.000000" height="792.000000"/>
>>>>> >>>>>> <word xMin="56.800000" yMin="57.208000" xMax="75.412000"
>>>>> >>>>>> yMax="70.492000">The</word> </page>
>>>>> >>>>>> </doc>
>>>>> >>>>>>
>>>>> >>>>>> I had a need, maybe others will too.
>>>>> >>>>>>
>>>>> >>>>>> -KB
>>>>> >>>>>
>>>>> >>>>> Why is this change necessary?
>>>>> >>>>>
>>>>> >>>>> - FILE *f;
>>>>> >>>>> + FILE *f = stdout;
>>>>> >>>>>
>>>>> >>>>> Albert
>>>>> >>> _______________________________________________
>>>>> >>> poppler mailing list
>>>>> >>> poppler at lists.freedesktop.org
>>>>> >>> http://lists.freedesktop.org/mailman/listinfo/poppler
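
The thread mentions processing each word so that no HTML reserved characters reach the output. A minimal sketch of that idea in standard C follows. It is an illustration only, not the patch: the names entity_for and html_escape, and the exact set of escaped characters (&, <, > and "), are assumptions.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: the entity for a reserved character, or NULL if
 * the character can be written unchanged. */
static const char *entity_for(char c)
{
    switch (c) {
    case '&': return "&amp;";
    case '<': return "&lt;";
    case '>': return "&gt;";
    case '"': return "&quot;";
    default:  return NULL;
    }
}

/* Hypothetical helper: returns a newly malloc'ed, escaped copy of `word`,
 * or NULL on allocation failure. The caller must free() the result. */
char *html_escape(const char *word)
{
    size_t len = 0;
    const char *p;
    char *out, *w;

    /* First pass: compute the escaped length. */
    for (p = word; *p != '\0'; p++) {
        const char *e = entity_for(*p);
        len += (e != NULL) ? strlen(e) : 1;
    }

    out = malloc(len + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: copy characters, expanding reserved ones. */
    w = out;
    for (p = word; *p != '\0'; p++) {
        const char *e = entity_for(*p);
        if (e != NULL) {
            size_t n = strlen(e);
            memcpy(w, e, n);
            w += n;
        } else {
            *w++ = *p;
        }
    }
    *w = '\0';
    return out;
}
```

Escaping in a single pass over the characters avoids the double-escaping that chained replace calls can cause, for example when "<" is turned into "&lt;" and a later pass then rewrites the "&".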
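
For reference, one way to produce a word element in the shape quoted in the original post is a single snprintf call with "%f", which yields the six decimal places seen in the sample. This is a sketch under assumptions: format_word and its parameters are hypothetical names, the text is assumed to be escaped already, and the real patch may build its output differently.

```c
#include <stdio.h>

/* Hypothetical helper: writes one <word> element into `buf`.
 * `text` is assumed to contain no unescaped HTML reserved characters.
 * Returns the snprintf result: the number of characters that would have
 * been written, so a value >= size means the output was truncated. */
int format_word(char *buf, size_t size,
                double xMin, double yMin, double xMax, double yMax,
                const char *text)
{
    return snprintf(buf, size,
                    "<word xMin=\"%f\" yMin=\"%f\" xMax=\"%f\" yMax=\"%f\">%s</word>",
                    xMin, yMin, xMax, yMax, text);
}
```

Note that "%f" follows the current locale's decimal separator, so this sketch assumes the default "C" locale.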
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: diff.txt
URL: <http://lists.freedesktop.org/archives/poppler/attachments/20100922/e9e41f0c/attachment.txt>