<html> <head> <base href="https://bugs.freedesktop.org/" /> </head> <body><table border="1" cellspacing="0" cellpadding="8"> <tr> <th>Bug ID</th> <td><a class="bz_bug_link bz_status_NEW " title="NEW - page.text() does not take page orientation into account" href="https://bugs.freedesktop.org/show_bug.cgi?id=94517">94517</a> </td> </tr> <tr> <th>Summary</th> <td>page.text() does not take page orientation into account </td> </tr> <tr> <th>Product</th> <td>poppler </td> </tr> <tr> <th>Version</th> <td>unspecified </td> </tr> <tr> <th>Hardware</th> <td>Other </td> </tr> <tr> <th>OS</th> <td>All </td> </tr> <tr> <th>Status</th> <td>NEW </td> </tr> <tr> <th>Severity</th> <td>normal </td> </tr> <tr> <th>Priority</th> <td>medium </td> </tr> <tr> <th>Component</th> <td>cpp frontend </td> </tr> <tr> <th>Assignee</th> <td>poppler-bugs@lists.freedesktop.org </td> </tr> <tr> <th>Reporter</th> <td>jeroen.ooms@stat.ucla.edu </td> </tr></table> <p> <div> <pre>See also: <a href="https://lists.freedesktop.org/archives/poppler/2016-March/011755.html">https://lists.freedesktop.org/archives/poppler/2016-March/011755.html</a>

When extracting text from a landscape PDF file using the cpp interface, text at the far right of the page does not get extracted. I think the problem is that page.text() always assumes portrait orientation and hence underestimates the width of the page:

  p->text()
  p->text(p->page_rect())

Is this expected? What is the best way to extract all text from the page, irrespective of size and orientation?

An example landscape PDF is here: <a href="https://github.com/ropensci/pdftools/files/161587/waurika_news_democrat.pdf">https://github.com/ropensci/pdftools/files/161587/waurika_news_democrat.pdf</a></pre> </div> </p> <hr> <span>You are receiving this mail because:</span> <ul> <li>You are the assignee for the bug.</li> </ul> </body> </html>