<html> <head> <base href="https://bugs.documentfoundation.org/"> </head> <body> <p> <div> <b><a class="bz_bug_link bz_status_NEW " title="NEW - Incorrect glyph to Unicode mappings in PDFs (Graphite)" href="https://bugs.documentfoundation.org/show_bug.cgi?id=62846#c51">Comment # 51</a> on <a class="bz_bug_link bz_status_NEW " title="NEW - Incorrect glyph to Unicode mappings in PDFs (Graphite)" href="https://bugs.documentfoundation.org/show_bug.cgi?id=62846">bug 62846</a> from <span class="vcard"><a class="email" href="mailto:martin_hosken@sil.org" title="martin_hosken@sil.org">martin_hosken@sil.org</a> </span></b> <pre>Sorry to be somewhat brutal, but until we get the PDF writer to produce the PDF needed for data extraction, using tagged PDF, it doesn't matter what magic we do with our fonts; it isn't going to work. You can give example after example, but it won't help fix the problem. One of the difficulties with attaching text to a PDF text run is that the text has to be output before the glyphs that give the presentation. So there are a number of tradeoffs we can employ in resolving this. So I'll ask: which do you prefer, speed or size? Do you want to make small PDFs that only output Unicode strings for runs that really need them, but take a bit longer to produce (since the strings have to be analysed to make the decision), or are you OK with having a complete copy of the text in your PDF? Do we want to make this an option that says "make me an extractable PDF", or do we always want to generate extractable PDF even if the result is bigger or slower to produce?</pre> </div> </p> <hr> <span>You are receiving this mail because:</span> <ul> <li>You are the assignee for the bug.</li> </ul> </body> </html>