[poppler] How to make text extracted from tables more readable

Nishanth Lawrence r.nishanth.cse at gmail.com
Wed Nov 27 01:44:50 PST 2013


Hi,
I am using pdftotext version 0.24.2. My sample case is a table whose two
columns contain the following lists:


   - First line
   - second line
   - third line
   - fourth line
   - fifth line

   - First line
   - second line , but second line is so big that it actually takes a new line
   - third line
   - fourth line
   - fifth line

When I extract the text with the following command

pdftotext table.pdf table.txt -layout -nopgbrk -q

I get the following output:

    First line                    First line
    second line                   second line , but second line is so big that
    third line                    it actually takes a new line
    fourth line                   third line
    fifth line                    fourth line
                                  fifth line

What I want is this: if a line in one column has no bullet (it is a
continuation of the previous item), the opposite column should have an
empty line at that position. Could you please tell me what to change in
the code so that I get the following?

First line                    First line
second line                   second line , but second line is so big that
                              it actually takes a new line
third line                    third line
fourth line                   fourth line
fifth line                    fifth line
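
To make the rule concrete, here is a minimal sketch in Python of the
alignment described above. It is not poppler code and does not touch
pdftotext internals. The function name align_columns, the width parameter,
and the assumption that each column has already been split into items (one
list of wrapped lines per bullet) are hypothetical and only for
illustration.

```python
from itertools import zip_longest


def align_columns(left_items, right_items, width=30):
    """Pair items row by row; pad the shorter item with empty lines.

    Each item is a list of text lines (an item that wraps has several).
    """
    out = []
    # Walk both columns item by item; a missing item counts as empty.
    for left, right in zip_longest(left_items, right_items, fillvalue=[]):
        # Within one row, pad the shorter item with empty strings.
        for l, r in zip_longest(left, right, fillvalue=""):
            out.append(f"{l:<{width}}{r}".rstrip())
    return out


# Hypothetical input: the sample table, already split into items.
left = [["First line"], ["second line"], ["third line"],
        ["fourth line"], ["fifth line"]]
right = [
    ["First line"],
    ["second line , but second line is so big that",
     "it actually takes a new line"],
    ["third line"],
    ["fourth line"],
    ["fifth line"],
]

print("\n".join(align_columns(left, right)))
```

The sketch only works if item boundaries are known. The plain text output
above no longer contains the bullets, so that information would have to
come from somewhere else.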


Thanks in advance

-- 
With Regards
Nishanth R Lawrence