[HarfBuzz] Question regarding the use of HB_SCRIPT_KATAKANA for "regular" Japanese

Sun Dec 22 07:10:56 PST 2013

I'm trying to render "regular" (i.e. modern, horizontal) Japanese with
HarfBuzz.

So far, I have been using HB_SCRIPT_KATAKANA for the whole text, and the
result looks similar to what browsers render.

But after examining other rendering solutions, I see that they often
perform "automatic script detection".

For instance, the Mapnik project is using ICU's "scrptrun", which, given
the following sentence:

ユニコードは、すべての文字に固有の番号を付与します

would detect a mix of Katakana, Hiragana and Han scripts.
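
To make the idea of script runs concrete, here is a minimal Python sketch
that splits a string into runs. It is not ICU's "scrptrun" algorithm: the
function names are hypothetical, the script lookup is a hard-coded
approximation based on a few code point ranges rather than the real
Unicode Script property, and "Common" characters (punctuation, the
prolonged sound mark) are simply attached to the neighbouring run.

```python
def script_of(ch):
    """Approximate script of one character (assumption: block ranges only)."""
    cp = ord(ch)
    if cp in (0x30A0, 0x30FB, 0x30FC):   # shared kana punctuation / long mark
        return "Common"
    if 0x3041 <= cp <= 0x309F:
        return "Hiragana"
    if 0x30A1 <= cp <= 0x30FF:
        return "Katakana"
    if 0x4E00 <= cp <= 0x9FFF:           # CJK Unified Ideographs block only
        return "Han"
    return "Common"


def script_runs(text):
    """Split text into (script, substring) runs."""
    runs = []                             # each item: [script, characters]
    for ch in text:
        script = script_of(ch)
        if not runs:
            runs.append([script, ch])
        elif script == "Common" or script == runs[-1][0]:
            runs[-1][1] += ch             # extend the current run
        elif runs[-1][0] == "Common":
            runs[-1][0] = script          # leading Common adopts first real script
            runs[-1][1] += ch
        else:
            runs.append([script, ch])     # script changed: start a new run
    return [(script, chunk) for script, chunk in runs]


for script, chunk in script_runs("ユニコードは、すべての文字に固有の番号を付与します"):
    print(script, chunk)
```

With these assumptions the sentence starts with a Katakana run and then
alternates between Hiragana and Han runs.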

As far as I can tell, though, it would not change anything if I rendered
the sentence as separate runs for the 3 different scripts (i.e. instead
of using only HB_SCRIPT_KATAKANA.)

Or are there situations where it would make a difference?

I'm asking because I suspect a catch-22 situation here. For example,
the word "diameter" in Japanese is 直径, which "scrptrun" would detect
as Han script.

As far as I understand, this could be a problem on systems where
DroidSansFallback.ttf is used, because the word would be drawn with
Simplified Chinese glyph forms.

Now, if we were using MTLmr3m.ttf, which is preferred for Japanese, the
word would be rendered as intended.
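
The catch-22 can be sketched as a font lookup: a "Han" run alone does not
say whether the text is Japanese or Chinese, so the language has to come
from somewhere else (HarfBuzz buffers, for instance, carry a language
separately from the script via hb_buffer_set_language). The snippet below
is only an illustration in Python: the function, the table and the
fallback policy are hypothetical, and only the two font file names come
from this message.

```python
# Hypothetical table: which font to prefer for Han text in a given language.
FONTS_FOR_HAN = {
    "ja": "MTLmr3m.ttf",
}
FALLBACK_FONT = "DroidSansFallback.ttf"


def font_for_run(script, language=None):
    """Pick a font file for one script run (illustrative policy only)."""
    if script in ("Hiragana", "Katakana"):
        # Assumption for this sketch: kana runs imply Japanese text.
        return FONTS_FOR_HAN["ja"]
    if script == "Han":
        # Han is shared between languages, so the script is not enough;
        # without a language tag we end up on the generic fallback.
        return FONTS_FOR_HAN.get(language, FALLBACK_FONT)
    return FALLBACK_FONT


print(font_for_run("Han"))          # no language known for 直径
print(font_for_run("Han", "ja"))    # language supplied by the caller
```

In this sketch the Han run for 直径 only gets the Japanese font when the
caller supplies "ja"; script detection by itself cannot provide that.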

Reference: https://code.google.com/p/chromium/issues/detail?id=183830

Any feedback would be appreciated. Note that the wisdom accumulated here
will be translated into tangible info and code samples (see
https://github.com/arielm/Unicode)

Thanks!
Ariel