Or better, look at <a href="http://www.unicode.org/reports/tr24/#Common">http://www.unicode.org/reports/tr24/#Common</a> and <a href="http://www.unicode.org/reports/tr24/#Nonspacing_Marks">http://www.unicode.org/reports/tr24/#Nonspacing_Marks</a><br clear="all"> <div><br>Konstantin</div> <br><br><div class="gmail_quote">2013/4/7 Khaled Hosny <span dir="ltr">&lt;<a href="mailto:khaledhosny@eglug.org" target="_blank">khaledhosny@eglug.org</a>&gt;</span><br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> Please note that characters with common or inherited script property<br> need special treatment, the corresponding Pango code is in<br> pango/pango-script.c<br> <br> Regards,<br> Khaled<br> <div class="im"><br> On Sun, Apr 07, 2013 at 02:07:15PM +0200, Lóránt Pintér wrote:<br> > Thanks. I could manage to do this by using ucdn_get_script() for now. BiDi is going to be the next big challenge.<br> ><br> > --<br> > Lóránt Pintér<br> </div>> Developer at Prezi (<a href="http://prezi.com" target="_blank">http://prezi.com</a>)<br> <div class="HOEnZb"><div class="h5">><br> ><br> ><br> > On Sunday, April 7, 2013 at 1:55 PM, Khaled Hosny wrote:<br> ><br> > > On Sun, Apr 07, 2013 at 02:59:32AM +0200, Lóránt Pintér wrote:<br> > > > Hi,<br> > > ><br> > > > I'm struggling with the problem of shaping mixed text. Say I have Thai<br> > > > and English text that I would like to shape. If I put all of it in a<br> > > > buffer, HarfBuzz chooses a shaper based on the first identifiable<br> > > > character, and then uses that shaper for the whole text. 
So<br> > > > "&lt;thai&gt;&lt;english&gt;" gets shaped fine with the Thai shaper, but<br> > > > "&lt;english&gt;&lt;thai&gt;" gets messed up because it is shaped with the default<br> > > > shaper.<br> > > ><br> > > > I was trying to figure out how Pango does this, but found nothing yet.<br> > > ><br> > > > Is it possible to ask HarfBuzz to identify text runs inside a buffer<br> > > > (or some other way) that can be shaped with different shapers? If<br> > > > there was a call that would identify the script (and maybe writing<br> > > > direction) of each character in the input, then I could split the<br> > > > buffer at positions where a different script is used.<br> > > ><br> > ><br> > ><br> > > You have to split the text runs before passing them to HarfBuzz, each<br> > > run should have the same script/language and text direction.<br> > ><br> > > Ideally text should be first itemized into runs with the same script,<br> > > and then further split into directional runs according to the BiDi algorithm.<br> > ><br> > > There are of course more subtleties involved, like when using multiple<br> > > fonts etc.<br> > ><br> > > Regards,<br> > > Khaled<br> > ><br> > ><br> ><br> ><br> </div></div><div class="HOEnZb"><div class="h5">_______________________________________________<br> HarfBuzz mailing list<br> <a href="mailto:HarfBuzz@lists.freedesktop.org">HarfBuzz@lists.freedesktop.org</a><br> <a href="http://lists.freedesktop.org/mailman/listinfo/harfbuzz" target="_blank">http://lists.freedesktop.org/mailman/listinfo/harfbuzz</a><br> </div></div></blockquote></div><br>
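For illustration, the script itemization discussed in this thread could be sketched roughly as below. This is only a minimal sketch, not Pango's or HarfBuzz's code: the `SCRIPT_RANGES` table, `script_of()` and `itemize()` are hypothetical names, and the table covers just a few Latin, Thai and combining-mark ranges, where real code would look up the full Unicode script data (for example with ucdn_get_script(), as mentioned above). Characters with the Common or Inherited script property are attached to the neighbouring run instead of starting a new one; paired brackets and BiDi are not handled.

```python
# Toy script table (an assumption for this sketch): real code needs the full
# Unicode Scripts.txt data. Anything not listed here is treated as "Common".
SCRIPT_RANGES = [
    (0x0041, 0x005A, "Latin"),      # A-Z
    (0x0061, 0x007A, "Latin"),      # a-z
    (0x00C0, 0x00D6, "Latin"),
    (0x00D8, 0x00F6, "Latin"),
    (0x00F8, 0x024F, "Latin"),
    (0x0300, 0x036F, "Inherited"),  # combining diacritical marks
    (0x0E01, 0x0E3A, "Thai"),
    (0x0E40, 0x0E5B, "Thai"),
]

NEUTRAL = {"Common", "Inherited"}


def script_of(ch):
    """Return the script name for one character, using the toy table."""
    cp = ord(ch)
    for lo, hi, name in SCRIPT_RANGES:
        if lo <= cp <= hi:
            return name
    return "Common"


def itemize(text):
    """Split text into (script, substring) runs.

    Common/Inherited characters join the current run; a run that so far
    holds only such characters adopts the first real script that follows.
    """
    runs = []  # each entry is [script or None, substring]
    for ch in text:
        script = script_of(ch)
        if script in NEUTRAL:
            script = None  # no script of its own
        if runs and (script is None
                     or runs[-1][0] is None
                     or runs[-1][0] == script):
            if runs[-1][0] is None:
                runs[-1][0] = script
            runs[-1][1] += ch
        else:
            runs.append([script, ch])
    return [(s or "Common", t) for s, t in runs]


if __name__ == "__main__":
    for script, run in itemize("abc สวัสดี"):
        print(script, repr(run))
```

Each resulting run could then be placed in its own buffer with the matching script set before shaping, and split further by direction where needed.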