<html> <head> <base href="https://bugs.freedesktop.org/"> </head> <body> <div> <a class="bz_bug_link bz_status_NEW " title="NEW - rendering pdf and pdftotext give different results" href="https://bugs.freedesktop.org/show_bug.cgi?id=104085#c3">Comment # 3</a> on <a class="bz_bug_link bz_status_NEW " title="NEW - rendering pdf and pdftotext give different results" href="https://bugs.freedesktop.org/show_bug.cgi?id=104085">bug 104085</a> from <a class="email" href="mailto:jason@inspiresomeone.us" title="Jason Crain <jason@inspiresomeone.us>"> Jason Crain</a> <pre>(In reply to Rafał Mużyło from <a href="show_bug.cgi?id=104085#c2">comment #2</a>)
> Why is it displayed correctly then ?

Because the CMap is only used to look up the Unicode character for text
extraction. Finding the glyph to draw is done using the character code or
name. It might make more sense if you think of PDF as primarily a display
format with text extraction and metadata support added on.

> Yet, is there nothing pdftotext could do in such case ?

I doubt it. It's doing what the PDF tells it to. If you show that Adobe
Reader does it differently then maybe.

> That is, are those two tables only info poppler gets from such pdf file wrt.
> text content ?

No, it's much more complicated. It's detailed in the Text section of the PDF
reference.</pre> </div> <hr> You are receiving this mail because: <ul> <li>You are the assignee for the bug.</li> </ul> </body> </html>