Or, better, see <a href="http://www.unicode.org/reports/tr24/#Common">http://www.unicode.org/reports/tr24/#Common</a> and <a href="http://www.unicode.org/reports/tr24/#Nonspacing_Marks">http://www.unicode.org/reports/tr24/#Nonspacing_Marks</a><br clear="all">
<div><br>Konstantin</div>
<br><br><div class="gmail_quote">2013/4/7 Khaled Hosny <span dir="ltr"><<a href="mailto:khaledhosny@eglug.org" target="_blank">khaledhosny@eglug.org</a>></span><br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Please note that characters with the Common or Inherited script property<br>
need special treatment; the corresponding Pango code is in<br>
pango/pango-script.c<br>
<br>
Regards,<br>
Khaled<br>
<div class="im"><br>
On Sun, Apr 07, 2013 at 02:07:15PM +0200, Lóránt Pintér wrote:<br>
> Thanks. I managed to do this using ucdn_get_script() for now. BiDi is going to be the next big challenge.<br>
><br>
> --<br>
> Lóránt Pintér<br>
</div>> Developer at Prezi (<a href="http://prezi.com" target="_blank">http://prezi.com</a>)<br>
<div class="HOEnZb"><div class="h5">><br>
><br>
><br>
> On Sunday, April 7, 2013 at 1:55 PM, Khaled Hosny wrote:<br>
><br>
> > On Sun, Apr 07, 2013 at 02:59:32AM +0200, Lóránt Pintér wrote:<br>
> > > Hi,<br>
> > ><br>
> > > I'm struggling with the problem of shaping mixed text. Say I have Thai<br>
> > > and English text that I would like to shape. If I put all of it in a<br>
> > > buffer, HarfBuzz chooses a shaper based on the first identifiable<br>
> > > character, and then uses that shaper for the whole text. So<br>
> > > "<thai><english>" gets shaped fine with the Thai shaper, but<br>
> > > "<english><thai>" gets messed up because it is shaped with the default<br>
> > > shaper.<br>
> > ><br>
> > > I was trying to figure out how Pango does this, but found nothing yet.<br>
> > ><br>
> > > Is it possible to ask HarfBuzz to identify text runs inside a buffer<br>
> > > (or some other way) that can be shaped with different shapers? If<br>
> > > there was a call that would identify the script (and maybe writing<br>
> > > direction) of each character in the input, then I could split the<br>
> > > buffer at positions where a different script is used.<br>
> > ><br>
> ><br>
> ><br>
> > You have to split the text into runs before passing them to HarfBuzz; each<br>
> > run should have the same script/language and text direction.<br>
> ><br>
> > Ideally, text should first be itemized into runs with the same script,<br>
> > and then further split into directional runs according to the BiDi algorithm.<br>
> ><br>
> > There are of course more subtleties involved, like when using multiple<br>
> > fonts etc.<br>
> ><br>
> > Regards,<br>
> > Khaled<br>
> ><br>
> ><br>
><br>
><br>
</div></div></blockquote></div><br>
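To make the point about Common and Inherited characters concrete, here is a rough Python sketch of script itemization as discussed in the thread. It is only an illustration under stated assumptions, not Pango's or HarfBuzz's actual code: `toy_script` is a hypothetical stand-in for a real Script-property lookup such as ucdn_get_script(), and it only knows Basic Latin letters, the Thai block, and the Combining Diacritical Marks block. Characters it reports as Common or Inherited never start a new run; they simply join whatever run surrounds them.

```python
# Toy script lookup (assumption): a real implementation would query the full
# Unicode Script property, e.g. through ucdn_get_script().
def toy_script(ch):
    cp = ord(ch)
    # Thai letters, vowel signs, tone marks and digits (U+0E3F, the baht sign, is Common).
    if cp in range(0x0E01, 0x0E3B) or cp in range(0x0E40, 0x0E5C):
        return "Thai"
    if ch.isascii() and ch.isalpha():
        return "Latin"
    # Combining Diacritical Marks block, treated here as Inherited.
    if cp in range(0x0300, 0x0370):
        return "Inherited"
    # Everything else (spaces, digits, punctuation, ...) is Common in this toy table.
    return "Common"


def itemize(text):
    """Split text into (script, substring) runs.

    Common and Inherited characters do not break a run: leading ones are
    absorbed by the first real script, later ones stay with the current run.
    """
    runs = []
    current = None  # script of the run being built, None until a real script is seen
    start = 0
    for i, ch in enumerate(text):
        script = toy_script(ch)
        if script in ("Common", "Inherited"):
            continue
        if current is None:
            current = script
        elif script != current:
            runs.append((current, text[start:i]))
            current, start = script, i
    if text:
        # A string with no real script at all becomes a single Common run.
        runs.append((current or "Common", text[start:]))
    return runs


if __name__ == "__main__":
    for script, chunk in itemize("abc กขค (xyz)"):
        print(script, repr(chunk))
```

Each resulting run could then go into its own HarfBuzz buffer with the matching script set, after further splitting by direction. A real itemizer handles more cases than this sketch does.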