<html> <head> <base href="https://bugs.freedesktop.org/" /> </head> <body> <div> <a class="bz_bug_link bz_status_NEW " title="NEW --- - Handling of small caps typographic variants" href="https://bugs.freedesktop.org/show_bug.cgi?id=38456#c6">Comment # 6</a> on <a class="bz_bug_link bz_status_NEW " title="NEW --- - Handling of small caps typographic variants" href="https://bugs.freedesktop.org/show_bug.cgi?id=38456">bug 38456</a> from <a class="email" href="mailto:jason@aquaticape.us" title="Jason Crain <jason@aquaticape.us>"> Jason Crain</a> <pre>(In reply to <a href="show_bug.cgi?id=38456#c5">comment #5</a>)
> Can you have a look at <a href="https://bugs.kde.org/attachment.cgi?id=23655">https://bugs.kde.org/attachment.cgi?id=23655</a> ?
>
> There's a whole lot of text missing in pdftotext with your path, the part
> that says

This is another one I can't fix. The document is using character names
in the form of /GXX, where each X is a hex digit and the two digits together
specify a Unicode code point. I can't think of a reliable way to guess whether
the document intends the name to be parsed or the character code to be used.
It looks like it will be a choice between supporting the documents in bugs
#38456 and #72753 (use character code) and supporting this document and the
FAO document (parse name).</pre> </div> </body> </html>