[poppler] pdftotext raw
Massimo Redaelli
mredaelli at lari.digital
Thu May 16 15:00:27 UTC 2019
Hey all.
Question regarding pdftotext.
The help says that `-raw` is no longer recommended, but for every PDF
I tried it gives better results than the default mode: paragraphs are
not interrupted by extraneous text such as headers or boxes.
(I do have to handle hyphenated words, but that looks easy.)
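For what it's worth, the hyphenation cleanup could look roughly like the Python sketch below. It assumes that `-raw` output leaves a word split across lines as a trailing hyphen followed by a newline, and it only rejoins when the next line starts with a lowercase ASCII letter. Both are assumptions on my part, and `dehyphenate` and `extract_raw` are hypothetical helper names, not anything poppler provides.

```python
import re
import subprocess

# A hyphen at the end of a line, preceded by a word character and
# followed by a lowercase ASCII letter on the next line (assumed
# signature of a word that was split for line breaking).
_SPLIT_WORD = re.compile(r"(?<=\w)-\n(?=[a-z])")


def dehyphenate(text: str) -> str:
    """Rejoin words that were hyphenated across a line break."""
    return _SPLIT_WORD.sub("", text)


def extract_raw(pdf_path: str) -> str:
    """Run `pdftotext -raw` and return the text written to stdout.

    The trailing "-" asks pdftotext to write to stdout instead of a file.
    """
    result = subprocess.run(
        ["pdftotext", "-raw", pdf_path, "-"],
        check=True,
        capture_output=True,
        text=True,
        encoding="utf-8",
    )
    return result.stdout
```

Usage would be something like `dehyphenate(extract_raw("input.pdf"))`. Note that a genuine compound that happens to break at its hyphen (say "well-" at the end of one line and "known" on the next) would be merged incorrectly, so a dictionary check may be needed for cleaner results.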
Is the option going to be deprecated, or can we count on it being
there for the foreseeable future?
Are there reasons not to use it?
Thanks!
--
M.