[poppler] page.text() does not take page orientation into account?

Albert Astals Cid aacid at kde.org
Wed Apr 20 21:05:59 UTC 2016


On Wednesday, 13 April 2016, at 16:57:14 CEST, Jeroen Ooms wrote:
> On Tue, Mar 8, 2016 at 2:34 PM, Jeroen Ooms <jeroen.ooms at stat.ucla.edu> wrote:
> > When extracting text from a landscape pdf file using the cpp
> > interface, text at the far right of the page does not get extracted. I
> > think the problem is that page.text() always assumes portrait
> > orientation and hence underestimates the width of the page:
> >   p->text()
> >   p->text(p->page_rect())
> > 
> > Is this expected? What is the best way to extract all text from the
> > page, irrespective of size and orientation?
> > 
> > An example landscape pdf is here:
> > https://github.com/ropensci/pdftools/files/161587/waurika_news_democrat.pdf
> 
> I would still be very interested in a fix or workaround for this
> problem. I tried looking through the source but I don't understand it
> well enough to figure out what is going wrong here. All help would be
> really appreciated.

If you haven't already, I'd suggest opening a bug. It won't get fixed
immediately, but it will make sure the issue isn't forgotten, and in case
someone bored comes along, it may even get fixed.

Cheers,
  Albert
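
Until a bug is filed and fixed, a possible stopgap (untested against poppler
itself) is to pass text() a rectangle that is large enough in both directions,
so that it no longer matters which orientation the rectangle is interpreted in.
The sketch below only shows the arithmetic. Rect, Orientation and
covering_rect are hypothetical stand-ins, not poppler names; with the real
library the inputs would presumably come from p->page_rect() and
p->orientation(), and the result would be converted to the rectangle type
that p->text() accepts. It also assumes Jeroen's diagnosis (a too-narrow
default rectangle) is correct, which nobody in this thread has confirmed.

```cpp
#include <algorithm>

// Hypothetical stand-in for the page rectangle, so the sketch
// compiles without poppler installed.
struct Rect {
    double x;
    double y;
    double width;
    double height;
};

// Hypothetical mirror of the four page orientations.
enum class Orientation { Portrait, Landscape, Seascape, UpsideDown };

// For pages rotated by a quarter turn, return a square whose side is the
// longer page dimension, so the region covers the page whether width and
// height are read in rotated or unrotated coordinates.
// Other orientations are returned unchanged.
Rect covering_rect(const Rect& page, Orientation orientation) {
    if (orientation == Orientation::Landscape ||
        orientation == Orientation::Seascape) {
        const double side = std::max(page.width, page.height);
        return Rect{page.x, page.y, side, side};
    }
    return page;
}
```

For a US Letter page reported as 612 x 792 points but displayed in landscape,
covering_rect yields a 792 x 792 region anchored at the same origin. Whether
text() then returns the missing text on the right is exactly what would need
checking against the example file.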

> _______________________________________________
> poppler mailing list
> poppler at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/poppler



