[HarfBuzz] an issue regarding discrepancy between Korean and Unicode standards
nomosnomos at gmail.com
Wed Mar 20 20:03:12 PDT 2013
When a sample input string, say "U+1100 U+1161 U+11F0", is processed
by the current version of HarfBuzz with some fonts, e.g. malgun.ttf
bundled with Windows 8, we get something like "U+AC00 U+11F0", which
does not render well.
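The recomposition described above can be reproduced outside HarfBuzz. The following is a minimal sketch that uses Python's standard unicodedata module as a stand-in for Unicode NFC composition; it does not call HarfBuzz itself, and the helper name codepoints is only for illustration.

```python
import unicodedata

def codepoints(s):
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in s)

# Leading consonant, vowel, and an old-Hangul trailing consonant (U+11F0).
jamo = "\u1100\u1161\u11F0"

# NFC composes U+1100 U+1161 into the precomposed syllable U+AC00.
# U+11F0 lies outside the modern trailing range U+11A8..U+11C2,
# so it cannot be folded into the syllable and is left as a separate Jamo.
composed = unicodedata.normalize("NFC", jamo)

print(codepoints(composed))                            # U+AC00 U+11F0
print(unicodedata.normalize("NFD", composed) == jamo)  # True
```

The second line of output shows that the two sequences are canonically equivalent, even though a font that follows only KS X 1026-1 may not render the recomposed form well.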
The reason is that there is a discrepancy between the Korean industrial
standard (KS X 1026-1:2007) and the Unicode normalization rules.
Malgun.ttf observes the Korean standard only and does not care about
the international Unicode standard for normalization. FYI, an English
translation of KS X 1026-1 is available at
Normalization done by the current HarfBuzz is of course compliant with
the Unicode standard. "U+AC00 U+11F0", i.e. a precomposed character in
the Hangul Syllables block followed by a trailing consonant Jamo letter,
is perfectly legal and is canonically equivalent to "U+1100 U+1161
U+11F0". According to KS X 1026-1, however, this should not occur.
Section 5.3 of the Korean standard says: "A Wanseong syllable
block(U+AC00..U+D7A3) cannot be recomposed with Johab Hangul
letters(U+1100..U+11FF U+A960..U+A97C U+D7B0..U+D7FB) to represent
another Hangul syllable block." See also section 6.4 of this
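One way to read the quoted rule is as a post-processing step after NFC. The sketch below is my own illustration, not the sample code from the appendix of KS X 1026-1 and not anything HarfBuzz provides: it composes with NFC, then decomposes any precomposed syllable that ends up adjacent to a conjoining Jamo. The function names and the exact neighbor test are assumptions made for this example.

```python
import unicodedata

def is_wanseong(ch):
    """Precomposed Hangul syllable, U+AC00..U+D7A3."""
    return 0xAC00 <= ord(ch) <= 0xD7A3

def is_leading_jamo(ch):
    """Conjoining leading consonant (U+1100..U+115F, U+A960..U+A97C)."""
    cp = ord(ch)
    return 0x1100 <= cp <= 0x115F or 0xA960 <= cp <= 0xA97C

def is_vowel_or_trailing_jamo(ch):
    """Conjoining vowel or trailing consonant, including Extended-B."""
    cp = ord(ch)
    return (0x1160 <= cp <= 0x11FF
            or 0xD7B0 <= cp <= 0xD7C6
            or 0xD7CB <= cp <= 0xD7FB)

def compose_keeping_jamo(text):
    """NFC, except that a precomposed syllable touching a conjoining
    Jamo is decomposed again (assumed reading of section 5.3)."""
    nfc = unicodedata.normalize("NFC", text)
    out = []
    for i, ch in enumerate(nfc):
        prev = nfc[i - 1] if i > 0 else ""
        nxt = nfc[i + 1] if i + 1 < len(nfc) else ""
        mixed = (prev and is_leading_jamo(prev)) or \
                (nxt and is_vowel_or_trailing_jamo(nxt))
        if is_wanseong(ch) and mixed:
            # NFD of a single syllable yields its L V (T) Jamo sequence.
            out.append(unicodedata.normalize("NFD", ch))
        else:
            out.append(ch)
    return "".join(out)
```

With this sketch, "U+1100 U+1161 U+11F0" stays as three Jamo letters, while a modern sequence such as "U+1100 U+1161 U+11A8" is still composed into the single syllable U+AC01.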
I have hesitated about posting this issue, as HarfBuzz is observing
the Unicode normalization rules. We cannot say it is a bug, and many
other libraries, including glib and ICU, are doing the same as HarfBuzz.
I believe that font developers should care about the Unicode standard
as well, which some fonts (jieupsida and hcr-lvt) already support.
But as there are other fonts (malgun.ttf and unbatang.ttf) which do
not give us good results with the current HarfBuzz, I am now raising
this issue. Above all, malgun.ttf is now the default Hangul font for
the most widely used OS here in Korea. I have little knowledge about
programming languages, but the Korean standard mentioned above has
some sample code in its appendix.
College of Law, Dongguk University
Seoul, Republic of Korea