[poppler] page.text() does not take page orientation into account?
Jeroen Ooms
jeroen.ooms at stat.ucla.edu
Tue Mar 8 22:34:28 UTC 2016
When extracting text from a landscape pdf file using the cpp
interface, text at the far right of the page does not get extracted. I
think the problem is that page.text() always assumes portrait
orientation and hence underestimates the width of the page:
p->text()
p->text(p->page_rect())
Is this expected? What is the best way to extract all text from the
page, irrespective of size and orientation?
An example landscape pdf is here:
https://github.com/ropensci/pdftools/files/161587/waurika_news_democrat.pdf