<html>
<head>
<base href="https://bugs.freedesktop.org/" />
</head>
<body>
<div>
<a class="bz_bug_link bz_status_REOPENED " title="REOPENED --- - Ligated characters are drawn multiple times when selected" href="https://bugs.freedesktop.org/show_bug.cgi?id=9001#c16">Comment # 16</a>
on <a class="bz_bug_link bz_status_REOPENED " title="REOPENED --- - Ligated characters are drawn multiple times when selected" href="https://bugs.freedesktop.org/show_bug.cgi?id=9001">bug 9001</a>
from <a class="email" href="mailto:ed@catmur.co.uk" title="Ed Catmur <ed@catmur.co.uk>"> Ed Catmur</a>
<pre>(In reply to <a href="show_bug.cgi?id=9001#c15">comment #15</a>)
> Patch works. I wonder if we should return the ligatures as a single
> character instead so that we don't need a special case.

We currently support selecting individual characters within a ligature; it'd
be a shame to lose that.

> ::: poppler/TextOutputDev.cc
> @@ +2392,4 @@
> >      w1 /= uLen;
> >      h1 /= uLen;
> >      for (i = 0; i < uLen; ++i) {
> > +      if (i > 0) c = CHARCODE_LIGATED;
>
> Could you explain why this means it's a ligature?

uLen is greater than 1 when a single CharCode (i.e. a glyph) signifies
multiple Unicode codepoints. In English text, that typically occurs when the
glyph is a ligature. In other scripts that might not be the case
(<a href="http://unicode.org/Public/UNIDATA/NamedSequences.txt">http://unicode.org/Public/UNIDATA/NamedSequences.txt</a>)
but if so we don't handle those correctly anyway; by chopping up the space
occupied by the glyph we're assuming it's a ligature.

Maybe a better name like CHARCODE_GLYPH_CONTINUATION would be clearer?</pre>
</div>
<hr>
You are receiving this mail because:
<ul>
<li>You are the assignee for the bug.</li>
</ul>
</body>
</html>