[HarfBuzz] Hangul Shaper (was Re: an issue regarding discrepancy between Korean and Unicode standards)
Dohyun Kim
nomosnomos at gmail.com
Tue Apr 16 07:10:09 PDT 2013
2013/4/16 Dohyun Kim <nomosnomos at gmail.com>:
> 2013/4/15 Dohyun Kim <nomosnomos at gmail.com>:
>>
>> The behavior of the new Uniscribe is quite confusing and seems to be
>> inconsistent on some points. I cannot describe concisely what it
>> does. But it is evident that it correctly renders only those input
>> sequences that comply with KS X 1026-1.
>>
>
> OK. My guess about the behavior of new Uniscribe:
>
> 1. demarcate syllable blocks according to KS X 1026-1
>
> - between L and L, V and V, T and T, or L and T (these are illegal strings)
> - between V and L, T and L, or M and L (these are legal break points)
> - between Jamo and non-Jamo character including Hangul syllables
> - but not between L and V, V and T, T and M, V and M, LVT and M, LV and M.
Oh, I have left out one stunning thing. I really dislike this sort of behavior:
- The Jamo sequence <L V T> is divided into <L | V | T> if an
equivalent <LVT> syllable exists.
- Likewise, an <L V> sequence is divided into <L | V> if it is not
followed by T and an equivalent <LV> syllable exists.
>
> where LVT and LV are Hangul syllables; L, V, and T are Jamos; M means
> Hangul tone marks (U+302E or U+302F)
>
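
As a rough illustration, the segmentation rules guessed in step 1 could be
sketched in Python as follows. This is only a model of the description
in this message, not Uniscribe's actual code. The function names
(hangul_class, split_blocks, split_composable) are hypothetical. Two
points are assumptions not stated in the message: every pair that is not
in the "no break" list is treated as a break, and when a composable
<L V T> or <L V> is split, a trailing tone mark stays with the last piece.

```python
def hangul_class(ch):
    """Classify one character as L, V, T, LV, LVT, M, or X (anything else)."""
    cp = ord(ch)
    if 0x1100 <= cp <= 0x115F or 0xA960 <= cp <= 0xA97C:
        return "L"    # leading consonants (Hangul Jamo, Extended-A)
    if 0x1160 <= cp <= 0x11A7 or 0xD7B0 <= cp <= 0xD7C6:
        return "V"    # vowels (Hangul Jamo, Extended-B)
    if 0x11A8 <= cp <= 0x11FF or 0xD7CB <= cp <= 0xD7FB:
        return "T"    # trailing consonants (Hangul Jamo, Extended-B)
    if 0xAC00 <= cp <= 0xD7A3:
        # Precomposed syllables: every 28th one has no trailing consonant.
        return "LV" if (cp - 0xAC00) % 28 == 0 else "LVT"
    if cp in (0x302E, 0x302F):
        return "M"    # Hangul tone marks
    return "X"

# Pairs listed in the message as "not a break".
# Assumption: every other pair is a break.
NO_BREAK = {("L", "V"), ("V", "T"), ("T", "M"), ("V", "M"),
            ("LVT", "M"), ("LV", "M")}

def split_blocks(text):
    """Split text into syllable blocks using the pair rules only."""
    blocks = []
    for ch in text:
        if blocks and (hangul_class(blocks[-1][-1]), hangul_class(ch)) in NO_BREAK:
            blocks[-1] += ch
        else:
            blocks.append(ch)
    return blocks

def has_precomposed(l, v, t=None):
    """True if a precomposed syllable equivalent to <L V> or <L V T> exists."""
    ok = 0x1100 <= ord(l) <= 0x1112 and 0x1161 <= ord(v) <= 0x1175
    if t is not None:
        ok = ok and 0x11A8 <= ord(t) <= 0x11C2
    return ok

def split_composable(block):
    """Apply the extra rule: split <L V T> or <L V> when it could be composed."""
    classes = [hangul_class(c) for c in block]
    if classes[:3] == ["L", "V", "T"] and has_precomposed(*block[:3]):
        # Assumption: a trailing tone mark stays attached to T.
        return [block[0], block[1], block[2:]]
    if (classes[:2] == ["L", "V"] and classes[2:3] != ["T"]
            and has_precomposed(block[0], block[1])):
        # Assumption: a trailing tone mark stays attached to V.
        return [block[0], block[1:]]
    return [block]
```

Under these rules an old-Hangul sequence such as U+1140 U+1161 stays one
block, while U+1100 U+1161 U+11A8 first forms one block and is then split
into three by split_composable. In this sketch each non-Hangul character
ends up in a block of its own.
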
> 2. reorder Hangul tone marks
>
> - if the syllable block is well-formed, move M from the end to the
> beginning of the cluster.
> - if the syllable block is not well-formed, Uniscribe does not move M.
> Instead, U+25CC is inserted after M.
>
> where "well-formed" means <LVT>, <LV>, <L V T>, or <L V>.
>
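
The tone-mark handling guessed in step 2 could likewise be sketched in
Python. Again this only models the description in this message. The
names reorder_tone_mark and hangul_class are hypothetical, and the sketch
assumes it receives one syllable block at a time with at most one tone
mark at its end. The dotted circle is placed after M because that is
what the message describes.

```python
DOTTED_CIRCLE = "\u25CC"
TONE_MARKS = ("\u302E", "\u302F")

def hangul_class(ch):
    """Classify one character as L, V, T, LV, LVT, or X (anything else)."""
    cp = ord(ch)
    if 0x1100 <= cp <= 0x115F or 0xA960 <= cp <= 0xA97C:
        return "L"
    if 0x1160 <= cp <= 0x11A7 or 0xD7B0 <= cp <= 0xD7C6:
        return "V"
    if 0x11A8 <= cp <= 0x11FF or 0xD7CB <= cp <= 0xD7FB:
        return "T"
    if 0xAC00 <= cp <= 0xD7A3:
        return "LV" if (cp - 0xAC00) % 28 == 0 else "LVT"
    return "X"

# "Well-formed" as defined in the message: <LVT>, <LV>, <L V T>, or <L V>.
WELL_FORMED = {("LVT",), ("LV",), ("L", "V", "T"), ("L", "V")}

def reorder_tone_mark(block):
    """Return the block with its trailing tone mark handled as in step 2."""
    if not block or block[-1] not in TONE_MARKS:
        return block  # no trailing tone mark, nothing to do
    base, mark = block[:-1], block[-1]
    if tuple(hangul_class(c) for c in base) in WELL_FORMED:
        # Well-formed: move M from the end to the beginning of the cluster.
        return mark + base
    # Not well-formed: leave M in place and add a dotted circle after it.
    return block + DOTTED_CIRCLE
```

For example, <LV M> (U+AC00 U+302E) becomes <M LV>, while a stray <T M>
(U+11A8 U+302E) keeps its order and gains U+25CC at the end.
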
--
Dohyun Kim
College of Law, Dongguk University
Seoul, Republic of Korea