[HarfBuzz] Itemising Japanese scripts

Muller, Eric emuller at amazon.com
Mon Apr 25 14:02:10 UTC 2016


On 4/24/16 3:18 PM, Simon Cozens wrote:
> On 25/04/2016 08:05, Khaled Hosny wrote:
>> The problem with merging is which script tag to select for the merged run,
>> Kana or Hani or “it depends on the font”.
> Why does it matter what script tag to apply if there are no opentype
> interactions with Japanese?


As for Unicode: the accepted definition of /script/ is a set of
characters that form a complete system for writing sounds. Of course,
any definition of that word needs interpretation, but katakana and
hiragana are both complete, i.e. one could write Japanese entirely in
katakana, or entirely in hiragana. Unicode uses the separate notion of
/writing system/ to speak about the use of characters to write a
language; most of the time, a writing system uses a single script, but
Japanese is the notable exception. The Korean writing system is another
example (hangul + han).

It's also noteworthy that ISO 15924 provides tags for combinations of
scripts:

Hanb     503     Han with Bopomofo (alias for Han + Bopomofo)
Hrkt     412     Japanese syllabaries (alias for Hiragana + Katakana)
Jpan     413     Japanese (alias for Han + Hiragana + Katakana)
Kore     287     Korean (alias for Hangul + Han)

and that the likely subtags 
(http://www.unicode.org/cldr/charts/latest/supplemental/likely_subtags.html) 
reflect that:

ja -> ja_Jpan_JP
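
To make the alias relationship concrete, here is a minimal Python sketch
of the four combined codes listed above as sets of their component
scripts. The function name covering_codes is purely illustrative (it is
not part of ISO 15924, CLDR or HarfBuzz); the component codes Hani, Hira,
Kana, Hang and Bopo are the usual ISO 15924 codes for the individual
scripts.

```python
# ISO 15924 combined codes from the list above, mapped to their components.
COMBINED = {
    "Hanb": frozenset({"Hani", "Bopo"}),
    "Hrkt": frozenset({"Hira", "Kana"}),
    "Jpan": frozenset({"Hani", "Hira", "Kana"}),
    "Kore": frozenset({"Hang", "Hani"}),
}


def covering_codes(scripts):
    """Return, sorted, the combined codes that include every given script.

    Hypothetical helper for illustration only.
    """
    wanted = frozenset(scripts)
    return sorted(code for code, parts in COMBINED.items() if wanted <= parts)
```

With this table, Jpan is the only combined code that covers Han,
Hiragana and Katakana together, which is what the likely subtag for ja
picks.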


Even if there is no typographic interaction between kanji, katakana and
hiragana glyphs, there is still a need to apply OT features. It's hard
to imagine that applying them to runs of ~5 characters can be as
efficient as applying them to whole paragraphs. On that basis, I have
been asking for a long time for an OT equivalent of Jpan (which does not
preclude the presence of hani and kana).
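
A rough Python sketch of the difference, under loud assumptions: script
detection here uses three Unicode block ranges rather than the real
Script property, and the function names (script_of, script_runs,
merge_japanese) are made up for illustration. Nothing here is HarfBuzz
API, and "Jpan" is used as the ISO 15924 code, not as an existing OT
script tag.

```python
from itertools import groupby

# Assumption: simplified block ranges, not the full Unicode Script property.
RANGES = [
    (0x3040, 0x309F, "Hira"),  # Hiragana block
    (0x30A0, 0x30FF, "Kana"),  # Katakana block
    (0x4E00, 0x9FFF, "Hani"),  # CJK Unified Ideographs (basic block only)
]

JAPANESE = {"Hani", "Hira", "Kana"}


def script_of(ch):
    """Return a script code for one character; anything else is 'Zyyy'."""
    cp = ord(ch)
    for lo, hi, tag in RANGES:
        if lo <= cp <= hi:
            return tag
    return "Zyyy"


def script_runs(text):
    """Split text into (script, substring) runs, one run per script change."""
    return [(tag, "".join(chars)) for tag, chars in groupby(text, key=script_of)]


def merge_japanese(runs):
    """Relabel Hani/Hira/Kana runs as Jpan and join adjacent equal runs."""
    merged = []
    for tag, chunk in runs:
        if tag in JAPANESE:
            tag = "Jpan"
        if merged and merged[-1][0] == tag:
            merged[-1] = (tag, merged[-1][1] + chunk)
        else:
            merged.append((tag, chunk))
    return merged
```

For a short sentence such as "私はカタカナを書く", script_runs yields six
runs (Hani, Hira, Kana, Hira, Hani, Hira), while merge_japanese collapses
them into a single Jpan run, so features would be applied once rather
than six times.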

Eric.





>
> On the other hand, I have just remembered one interaction: a pan-CJK
> font such as Source Han Sans / Noto Sans CJK will have variant forms of
> the kanji for Chinese, Japanese and Korean. But even then the selection
> should be done on language, not on script - I haven't checked how it works.
>
> So if pushed I would say Kana, just in case. But it really shouldn't matter.
