[HarfBuzz] Hangul Shaper (was Re: an issue regarding discrepancy between Korean and Unicode standards)

Dohyun Kim nomosnomos at gmail.com
Wed Apr 17 18:44:06 PDT 2013


2013/4/18 Dohyun Kim <nomosnomos at gmail.com>:
> 2013/4/18 Behdad Esfahbod <behdad at behdad.org>:
>> When are the OpenType features applied, after all those processes are done?
>
> If possible, please apply the "ccmp" feature before all those processes.

On second thought, I now think it is more efficient, and more compliant
with the Unicode standard, to apply the "ccmp" feature after decomposing
Hangul syllables and before setting syllable boundaries.
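
To make the first stage concrete, here is a minimal Python sketch of the
decomposition that would run before "ccmp".  It uses the arithmetic
Hangul syllable decomposition from the Unicode Standard; the function
name decompose_hangul is only illustrative and is not a HarfBuzz API.

```python
# Constants for arithmetic Hangul syllable decomposition (Unicode Standard).
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
V_COUNT, T_COUNT = 21, 28
N_COUNT = V_COUNT * T_COUNT   # 588 syllables per leading consonant
S_COUNT = 19 * N_COUNT        # 11172 precomposed syllables


def decompose_hangul(text):
    """Replace each precomposed syllable with its <L V> or <L V T> Jamo.

    Hypothetical helper: all other characters pass through unchanged.
    """
    out = []
    for ch in text:
        s_index = ord(ch) - S_BASE
        if 0 <= s_index < S_COUNT:
            out.append(chr(L_BASE + s_index // N_COUNT))
            out.append(chr(V_BASE + (s_index % N_COUNT) // T_COUNT))
            t_index = s_index % T_COUNT
            if t_index:  # index 0 means "no trailing consonant"
                out.append(chr(T_BASE + t_index))
        else:
            out.append(ch)
    return "".join(out)
```

With this in place, "ccmp" would see only conjoining Jamo, and syllable
boundaries would be set on the decomposed string afterwards.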

> And "*jmo" features after all those processes.
>
>> Are the '*jmo' features applied to all glyphs?
>
> No. Only to well-formed syllable blocks <M? L V T?>.
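
As a rough illustration of that restriction, the sketch below assigns
"ljmo", "vjmo", and "tjmo" only when a cluster matches <M? L V T?>.
The Jamo ranges follow the Unicode Hangul syllable type data; the
function names and the one-letter class codes are assumptions made for
this example, not part of HarfBuzz.

```python
import re


def jamo_class(ch):
    """Return 'L', 'V', 'T', 'M' (tone mark) or 'X' (anything else)."""
    cp = ord(ch)
    if 0x1100 <= cp <= 0x115F or 0xA960 <= cp <= 0xA97C:
        return "L"
    if 0x1160 <= cp <= 0x11A7 or 0xD7B0 <= cp <= 0xD7C6:
        return "V"
    if 0x11A8 <= cp <= 0x11FF or 0xD7CB <= cp <= 0xD7FB:
        return "T"
    if cp in (0x302E, 0x302F):
        return "M"
    return "X"


# A well-formed block after tone mark reordering: <M? L V T?>.
WELL_FORMED = re.compile(r"M?LVT?")
FEATURE_FOR = {"L": "ljmo", "V": "vjmo", "T": "tjmo"}


def jmo_features(cluster):
    """Return one feature tag (or None) per character of the cluster.

    Hypothetical helper: clusters that are not well-formed get no
    '*jmo' feature at all; the tone mark never gets one.
    """
    kinds = "".join(jamo_class(ch) for ch in cluster)
    if not WELL_FORMED.fullmatch(kinds):
        return [None] * len(cluster)
    return [FEATURE_FOR.get(kind) for kind in kinds]
```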
>
>>
>> On 13-04-16 11:29 PM, Dohyun Kim wrote:
>>> http://ktug.org/~nomos/harfbuzz-hangul/hangulshaper.pdf
>>>
>>> Regards,
>>>
>>> 2013/4/17 Behdad Esfahbod <behdad at behdad.org>:
>>>> Ok, given how confusing this thread has become, please create a Google Doc,
>>>> and write down what you think the HarfBuzz Hangul shaper should do.  Modify it
>>>> as much as you want, but keep it as short as possible.  Please make the doc
>>>> commentable by the public, and send the link here.
>>>>
>>>> Thanks,
>>>> behdad
>>>>
>>>> On 13-04-16 10:10 AM, Dohyun Kim wrote:
>>>>> 2013/4/16 Dohyun Kim <nomosnomos at gmail.com>:
>>>>>> 2013/4/15 Dohyun Kim <nomosnomos at gmail.com>:
>>>>>>>
>>>>>>> The behavior of the new Uniscribe is quite confusing and seems to be
>>>>>>> inconsistent on some points.  I cannot describe concisely what it
>>>>>>> does.  But it is evident that it renders correctly only those input
>>>>>>> sequences that comply with KS X 1026-1.
>>>>>>>
>>>>>>
>>>>>> OK.  My guess about the behavior of new Uniscribe:
>>>>>>
>>>>>> 1.  demarcate syllable blocks according to KS X 1026-1
>>>>>>
>>>>>>    - between L and L, V and V, T and T, or L and T (these are illegal strings)
>>>>>>    - between V and L, T and L, or M and L (these are legal break points)
>>>>>>    - between a Jamo and a non-Jamo character, including Hangul syllables
>>>>>>    - but not between L and V, V and T, T and M, V and M, LVT and M, or LV and M.
>>>>>
>>>>> Oh, I have left out one stunning thing.  I really dislike this sort of behavior:
>>>>>
>>>>>    - The Jamo sequence <L V T> is divided into <L | V | T> if an
>>>>>      equivalent <LVT> syllable exists.
>>>>>    - Likewise, an <L V> sequence is divided into <L | V> if it is not
>>>>>      followed by T and an equivalent <LV> syllable exists.
>>>>>
>>>>>>
>>>>>> where LVT and LV are Hangul syllables; L, V, and T are Jamos; M means
>>>>>> Hangul tone marks (U+302E or U+302F)
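
The break rules guessed at in step 1 can be sketched in Python as
follows.  This is only a model of the list above: it keeps two
characters together for exactly the six "no break" pairs and breaks
everywhere else, which is an assumption for pairs the list does not
mention.  The extra <L | V | T> split for sequences that have a
precomposed equivalent is not modeled.  All names are illustrative.

```python
def hangul_class(ch):
    """Classify a character as L, V, T, M, LV, LVT, or X (other)."""
    cp = ord(ch)
    if 0x1100 <= cp <= 0x115F or 0xA960 <= cp <= 0xA97C:
        return "L"
    if 0x1160 <= cp <= 0x11A7 or 0xD7B0 <= cp <= 0xD7C6:
        return "V"
    if 0x11A8 <= cp <= 0x11FF or 0xD7CB <= cp <= 0xD7FB:
        return "T"
    if cp in (0x302E, 0x302F):
        return "M"
    if 0xAC00 <= cp <= 0xD7A3:
        # Every 28th precomposed syllable has no trailing consonant.
        return "LV" if (cp - 0xAC00) % 28 == 0 else "LVT"
    return "X"


# The only adjacent pairs that stay in the same syllable block.
NO_BREAK = {
    ("L", "V"), ("V", "T"), ("T", "M"),
    ("V", "M"), ("LVT", "M"), ("LV", "M"),
}


def split_blocks(text):
    """Split text into syllable blocks using the pair rules above."""
    blocks = []
    prev_cls = None
    for ch in text:
        cls = hangul_class(ch)
        if blocks and (prev_cls, cls) in NO_BREAK:
            blocks[-1] += ch
        else:
            blocks.append(ch)
        prev_cls = cls
    return blocks
```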
>>>>>>
>>>>>> 2.  reorder Hangul tone marks
>>>>>>
>>>>>>     - if the syllable block is well-formed, move M from the end to
>>>>>>       the start of the cluster.
>>>>>>     - if the syllable block is not well-formed, Uniscribe does not
>>>>>>       move M.  Instead, U+25CC is inserted after M.
>>>>>>
>>>>>> where "well-formed" means <LVT>, <LV>, <L V T>, or <L V>.
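
Step 2 can be sketched the same way.  The Python below takes one
syllable block that ends in a tone mark and applies the two cases as
described: move M to the front when the rest is <LVT>, <LV>, <L V T>,
or <L V>, otherwise leave M in place and append U+25CC after it.  This
mirrors the guess above, not verified Uniscribe behavior, and the
helper names are hypothetical.

```python
import re

TONE_MARKS = ("\u302E", "\u302F")
DOTTED_CIRCLE = "\u25CC"


def class_string(text):
    """Map each character to one letter: L, V, T, A (LV), B (LVT), or X."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0x1100 <= cp <= 0x115F or 0xA960 <= cp <= 0xA97C:
            out.append("L")
        elif 0x1160 <= cp <= 0x11A7 or 0xD7B0 <= cp <= 0xD7C6:
            out.append("V")
        elif 0x11A8 <= cp <= 0x11FF or 0xD7CB <= cp <= 0xD7FB:
            out.append("T")
        elif 0xAC00 <= cp <= 0xD7A3:
            out.append("A" if (cp - 0xAC00) % 28 == 0 else "B")
        else:
            out.append("X")
    return "".join(out)


# <LV> or <LVT> precomposed syllable, or the Jamo sequence <L V T?>.
WELL_FORMED = re.compile(r"A|B|LVT?")


def reorder_tone_mark(block):
    """Reorder (or flag) the trailing tone mark of one syllable block."""
    if not block or block[-1] not in TONE_MARKS:
        return block  # nothing to do without a trailing tone mark
    base, mark = block[:-1], block[-1]
    if WELL_FORMED.fullmatch(class_string(base)):
        return mark + base          # well-formed: M moves to the front
    return block + DOTTED_CIRCLE    # otherwise: U+25CC follows M
```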
>>>>>>
>>>>>
>>>>> --
>>>>> Dohyun Kim
>>>>> College of Law, Dongguk University
>>>>> Seoul, Republic of Korea
>>>>>
>>>>
>>>> --
>>>> behdad
>>>> http://behdad.org/
>>>
>>>
>>>
>>
>> --
>> behdad
>> http://behdad.org/
>
>
>
> --
> Dohyun Kim
> College of Law, Dongguk University
> Seoul, Republic of Korea



--
Dohyun Kim
College of Law, Dongguk University
Seoul, Republic of Korea


