[poppler] poppler-dump

Marco ctxspi at gmail.com
Thu Mar 13 02:11:36 PDT 2014


On 12 Mar 2014 20:36, "Albert Astals Cid" <aacid at kde.org> wrote:
>
> >
> > On Wednesday, 12 March 2014, at 20:25:45, Marco wrote:
> > > Hi Albert
> > >
> > > The command 'pdftotext -layout filename.pdf -' gives the same output as
> > > physical_layout in my small program. But if I have a PDF file with text
> > > inside tables (sorry for my poor description) and I run
> > > 'pdftotext filename.pdf -', it gives a result that I cannot reproduce
> > > with either 'raw_order_layout' or 'physical_layout' in my program.
> >
> > I'd say it is the other way around: poppler-dump can't give you what
> > -layout does.
> >
> > Compare the code of poppler-page.cpp and pdftotext; it's pretty
> > straightforward.
> >
> > Cheers,
> >   Albert
> >
> > >
> > > 2014-03-12 19:49 GMT+01:00 Marco <ctxspi at gmail.com>:
> > > > Hi to all
> > > >
> > > > I am new user to poppler and I have a short question.
> > > >
> > > > In my small program I use these lines:
> > > >
> > > > for (int i = 0; i < pages; ++i) {
> > > >     cout << "Page " << (i + 1) << "/" << pages << ":" << endl;
> > > >     // create_page() returns a pointer that the caller owns
> > > >     auto_ptr<poppler::page> p(doc->create_page(i));
> > > >     poppler::byte_array text_ba = p->text(p->page_rect(),
> > > >         poppler::page::raw_order_layout).to_utf8();
> > > >     // No NUL terminator is pushed: std::string tracks its own length,
> > > >     // and text.c_str() is already NUL-terminated
> > > >     string text(text_ba.begin(), text_ba.end());
> > > >     cout << text << endl;
> > > > }
> > > >
> > > > to print the text of a PDF file, but with either 'raw_order_layout' or
> > > > 'physical_layout' the output differs from that of the command
> > > > 'pdftotext filename.pdf -'.
> > > >
> > > >
> > > > How can I get the same text as the command 'pdftotext filename.pdf -',
> > > > but stored in a char pointer?
> > > >
> > > > Thanks
> >
>
> Albert, I am sorry for the inconvenient mail.
--
I have tried this several times, but I need the output not as ustring data
but as a string or a pointer to chars.

I need the text in UTF-8 encoding, but not in the ustring format.
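
For illustration, here is a minimal sketch of one way to turn the UTF-8 bytes
into a std::string and a char pointer. The helper name bytes_to_string is
hypothetical, not part of poppler. The sketch relies on poppler::byte_array
being a typedef for std::vector<char>, and uses a plain vector so that it
compiles without the poppler headers.

```cpp
#include <string>
#include <vector>

// Stand-in for poppler::byte_array (a typedef for std::vector<char>),
// declared locally so this sketch builds without poppler installed.
typedef std::vector<char> byte_array;

// Hypothetical helper: copy the raw UTF-8 bytes into a std::string.
// No NUL byte is appended to the vector first; a pushed 0 would become
// an extra character inside the string.
std::string bytes_to_string(const byte_array &bytes) {
    return std::string(bytes.begin(), bytes.end());
}

// Possible use with poppler-cpp (assumes p is a valid poppler::page pointer):
//   std::string text = bytes_to_string(
//       p->text(p->page_rect(), poppler::page::raw_order_layout).to_utf8());
//   const char *c_text = text.c_str();  // NUL-terminated UTF-8
```

The pointer returned by c_str() stays valid only as long as the std::string
is alive and unmodified, so the string has to outlive any use of the pointer.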