<div dir="ltr"><span style="font-family:arial,sans-serif;font-size:13px">> As it happens, those three scripts are all considered "simple", so the shaping</span><br style="font-family:arial,sans-serif;font-size:13px">
<span style="font-family:arial,sans-serif;font-size:13px">> </span><span style="font-family:arial,sans-serif;font-size:13px">logic in HarfBuzz is the same for all three.</span><div><span style="font-family:arial,sans-serif;font-size:13px"><br>
</span></div><div><span style="font-family:arial,sans-serif;font-size:13px">Good to know. For the record, there's a </span><span style="font-family:arial,sans-serif;font-size:13px">function for checking if a script is complex</span><span style="font-family:arial,sans-serif;font-size:13px"> in the recent HarfBuzz-flavored Android OS: </span><a href="http://goo.gl/KL1KUi" target="_blank" style="font-family:arial,sans-serif">http://goo.gl/KL1KUi</a></div>
<div><br></div><div><span style="font-family:arial,sans-serif;font-size:13px">> Where it does make a difference</span><br style="font-family:arial,sans-serif;font-size:13px">
<span style="font-family:arial,sans-serif;font-size:13px">> </span><span style="font-family:arial,sans-serif;font-size:13px">is if the font has ligatures, kerning, etc for those. OpenType organizes</span><br style="font-family:arial,sans-serif;font-size:13px">
<span style="font-family:arial,sans-serif;font-size:13px">> </span><span style="font-family:arial,sans-serif;font-size:13px">those features by script, and if you request the wrong script you will miss</span><br style="font-family:arial,sans-serif;font-size:13px">
<span style="font-family:arial,sans-serif;font-size:13px">> </span><span style="font-family:arial,sans-serif;font-size:13px">out on the features.</span><br></div><div><span style="font-family:arial,sans-serif;font-size:13px"><br>
</span></div><div><font face="arial, sans-serif">Makes sense to me for Hebrew, Arabic, Thai, etc., </font><font face="arial, sans-serif">but I was a bit surprised to find out that LATN was also a complex script.</font></div>
<div><span style="font-family:arial,sans-serif"><br></span></div><div><span style="font-family:arial,sans-serif">So, for instance, if I shaped some text containing Hebrew and English using only the HEBR script, I would probably lose kerning and ffi-like ligatures for the English part (this is what I'm actually doing </span><span style="font-family:arial,sans-serif">currently </span><span style="font-family:arial,sans-serif">in my "simple" BIDI implementation...)</span></div>
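<div><br></div><div>To make the mixed-script case above concrete, here is a minimal Python sketch (an illustration only: it is not ICU's "scrptrun" and it does not call HarfBuzz) that splits text into script runs, so that each run could then be shaped with its own script. The SCRIPT_RANGES table and the function names are hypothetical, and the ranges are only a rough approximation of the Unicode Script property. Characters outside the table (spaces, punctuation) simply join the neighbouring run.</div>

```python
# Hypothetical, heavily simplified script table: a few hard-coded code point
# ranges. A real itemizer uses the full Unicode Script property instead.
SCRIPT_RANGES = [
    (0x0041, 0x005A, "Latn"),
    (0x0061, 0x007A, "Latn"),
    (0x0590, 0x05FF, "Hebr"),
    (0x3040, 0x309F, "Hira"),
    (0x30A0, 0x30FF, "Kana"),
    (0x4E00, 0x9FFF, "Hani"),
]


def script_of(ch):
    """Return a script tag for ch, or None if it is not in the table."""
    cp = ord(ch)
    for lo, hi, tag in SCRIPT_RANGES:
        if lo <= cp <= hi:
            return tag
    return None  # spaces, digits, punctuation: treated as "common"


def script_runs(text):
    """Split text into (script, substring) runs."""
    runs = []  # list of [script, substring]
    for ch in text:
        tag = script_of(ch)
        if runs and (tag is None or tag == runs[-1][0]):
            # Same script, or a "common" character: extend the current run.
            runs[-1][1] += ch
        elif runs and runs[-1][0] is None:
            # A run made only of "common" characters adopts the first real script.
            runs[-1][0] = tag
            runs[-1][1] += ch
        else:
            runs.append([tag, ch])
    return [(tag, s) for tag, s in runs]
```

<div>With this sketch, script_runs("\u05e9\u05dc\u05d5\u05dd office") yields a Hebr run followed by a Latn run, which could then be shaped with HB_SCRIPT_HEBREW and HB_SCRIPT_LATIN respectively rather than with a single script.</div>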
<div><span style="font-family:arial,sans-serif"><br></span></div><div>> How you do font selection and what script you pass to HarfBuzz are two<br>> completely separate issues. Font fallback stack should be per-language.<span style="font-family:arial,sans-serif"><br>
</span></div><div><br></div><div>I understand that the best scenario will always be to make decisions based on "language" rather than solely on "script", but it creates a problem:</div><div><br></div>
<div>
Say you work on an API for Unicode text rendering: you can't promise your users a solution where they can pass arbitrary text without providing language context per span.</div><div><br></div><div>Or, to come back to the origin of the message: solutions like ICU's "scrptrun", which do script detection, are not appropriate (because they won't help you find the right font, given the lack of language context...)</div>
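<div><br></div><div>As a rough illustration of why the script alone is not enough, here is a small Python sketch of a per-language fallback lookup. The FALLBACK table and the font_stack function are hypothetical; only the two font file names come from this thread. The point is that a Han run with no language hint can only fall back to a generic font.</div>

```python
# Hypothetical fallback table keyed by (script, language). The font file names
# are the ones mentioned in this thread; the table itself is an assumption.
FALLBACK = {
    ("Hani", "ja"): ["MTLmr3m.ttf", "DroidSansFallback.ttf"],
    ("Hani", None): ["DroidSansFallback.ttf"],
}


def font_stack(script, lang=None):
    """Return the font stack for a script run, preferring a language match."""
    # Try the exact (script, language) pair first, then the script-only default.
    return FALLBACK.get((script, lang)) or FALLBACK.get((script, None), [])
```

<div>Here font_stack("Hani", "ja") puts MTLmr3m.ttf first, while font_stack("Hani") without a language can only return the generic DroidSansFallback.ttf entry.</div>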
<div><br></div><div>I guess the problem is even more general, as with UTF-8-encoded HTML pages rendered in modern browsers, as demonstrated by the creator of liblinebreak: <a href="http://wyw.dcweb.cn/lang_utf8.htm">http://wyw.dcweb.cn/lang_utf8.htm</a></div>
<div class="gmail_extra"><br><div class="gmail_quote">On Sun, Dec 22, 2013 at 10:47 PM, Behdad Esfahbod <span dir="ltr"><<a href="mailto:behdad@behdad.org" target="_blank">behdad@behdad.org</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div>On 13-12-22 10:10 AM, Ariel Malka wrote:<br>
> I'm trying to render "regular" (i.e. modern, horizontal) Japanese with Harfbuzz.<br>
><br>
> So far, I have been using HB_SCRIPT_KATAKANA and it looks similar to what is<br>
> rendered via browsers.<br>
><br>
> But after examining other rendering solutions I can see that "automatic script<br>
> detection" can often take place.<br>
><br>
> For instance, the Mapnik project is using ICU's "scrptrun", which, given the<br>
> following sentence:<br>
><br>
> ユニコードは、すべての文字に固有の番号を付与します<br>
><br>
> would detect a mix of Katakana, Hiragana and Han scripts.<br>
><br>
> But for instance, it would not change anything if I'd render the sentence by<br>
> mixing the 3 different scripts (i.e. instead of using only HB_SCRIPT_KATAKANA.)<br>
><br>
> Or are there situations where it would make a difference?<br>
<br>
</div>As it happens, those three scripts are all considered "simple", so the shaping<br>
logic in HarfBuzz is the same for all three. Where it does make a difference<br>
is if the font has ligatures, kerning, etc for those. OpenType organizes<br>
those features by script, and if you request the wrong script you will miss<br>
out on the features.<br>
<div><br>
<br>
> I'm asking that because I suspect a catch-22 situation here. For example, the<br>
> word "diameter" in Japanese is 直径 which, given to "scrptrun" would be<br>
> detected as Han script.<br>
><br>
> As far as I understand, it could be a problem on systems where<br>
> DroidSansFallback.ttf is used, because the word would look like in Simplified<br>
> Chinese.<br>
><br>
> Now, if we were using MTLmr3m.ttf, which is preferred for Japanese, the word<br>
> would have been rendered as intended.<br>
<br>
</div>How you do font selection and what script you pass to HarfBuzz are two<br>
completely separate issues. Font fallback stack should be per-language.<br>
<div><br>
> Reference: <a href="https://code.google.com/p/chromium/issues/detail?id=183830" target="_blank">https://code.google.com/p/chromium/issues/detail?id=183830</a><br>
><br>
> Any feedback would be appreciated. Note that the wisdom accumulated here will<br>
> be translated into tangible info and code samples (see<br>
> <a href="https://github.com/arielm/Unicode" target="_blank">https://github.com/arielm/Unicode</a>)<br>
><br>
> Thanks!<br>
> Ariel<br>
><br>
><br>
</div>> _______________________________________________<br>
> HarfBuzz mailing list<br>
> <a href="mailto:HarfBuzz@lists.freedesktop.org" target="_blank">HarfBuzz@lists.freedesktop.org</a><br>
> <a href="http://lists.freedesktop.org/mailman/listinfo/harfbuzz" target="_blank">http://lists.freedesktop.org/mailman/listinfo/harfbuzz</a><br>
><br>
<span><font color="#888888"><br>
--<br>
behdad<br>
<a href="http://behdad.org/" target="_blank">http://behdad.org/</a><br>
</font></span></blockquote></div><br></div></div>