[poppler] How to make text extracted from tables more readable

Nishanth Lawrence r.nishanth.cse at gmail.com
Mon Dec 2 07:21:04 PST 2013


Hi,
Sorry, my previous mail was not formatted correctly because of the tables, so
I have given links to Google Docs instead.

I am using pdftotext version 0.24.2. This is the input PDF:

https://drive.google.com/file/d/0Bwj-LRZNYWXvTXVZNHNyQnNNd00/edit?usp=sharing

When I extract the text with the following command

pdftotext table.pdf table.txt  -layout -nopgbrk -q

I get the following output:

https://docs.google.com/file/d/0Bwj-LRZNYWXvSGdwa2FXemtydDQ/edit

What I want is this: if there is no bullet on a line, then there should be an
empty line in the opposite column. Could you please tell me what to change in
the code so that I get output similar to this:

https://docs.google.com/file/d/0Bwj-LRZNYWXvck9jMmQtWFU1VkU/edit

Or at least, which part of the code has to be modified to achieve the above?
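
To illustrate the kind of result I mean, here is a minimal Python sketch that
post-processes the -layout output instead of changing poppler. It is only an
approximation under assumptions: it assumes the table has two columns, that
the -layout output keeps them aligned, and that the second column starts at a
fixed character offset. The names split_row and split_table and the boundary
value 20 are hypothetical and not part of poppler.

```python
def split_row(line, boundary):
    """Split one layout-preserved line into (left, right) cells.

    `boundary` is the character offset where the right column is assumed
    to start; it has to be found by inspecting the -layout output.
    """
    left = line[:boundary].strip()
    right = line[boundary:].strip()
    return left, right


def split_table(text, boundary):
    """Apply split_row to every line; a column with no text yields ''."""
    return [split_row(line, boundary) for line in text.splitlines()]


# Hypothetical sample standing in for the text written by pdftotext -layout.
sample = "\n".join([
    "* first item".ljust(20) + "description one",
    " " * 20 + "continued text",
    "* second item",
])

rows = split_table(sample, 20)
for left, right in rows:
    # An empty cell marks a line where the opposite column has no content.
    print(repr(left), "|", repr(right))
```

A line that has text in only one column comes back with an empty string for
the other column, which is the "empty line in the opposite column" I am
looking for.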

Thanks in advance



-- 
With Regards
Nishanth R Lawrence