[HarfBuzz] application of hangul *jmo features to broken syllables

Behdad Esfahbod behdad at behdad.org
Mon Jan 6 16:36:38 PST 2014


On 14-01-07 04:30 AM, Jonathan Kew wrote:
> Maybe we should try something along the lines of the attached patch. The idea
> here is to ONLY apply the *jmo features to jamo that are specifically
> identified by the shaper, rather than applying them globally.
> 
> The extra work being done here should be offset by the fact that we won't
> generally be trying as many non-matching lookups on every glyph; we'll only
> apply the lookups for (at most) one of the *jmo features, instead of all of them.
> 
> WDYT?

Looks good.  WDYT about applying all *jmo features to the decomposed syllable?
I'm thinking, if someone's doing contextual matching, it wouldn't match right
now...
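
For reference, the per-jamo idea can be sketched roughly as below. This is
only an illustration in Python, not the attached patch and not HarfBuzz code:
the function names jamo_class and assign_jmo_features are made up, only the
basic Hangul Jamo block (U+1100..U+11FF) is classified, and a syllable is
assumed to be exactly one L, one V and an optional T. Each jamo gets at most
one of ljmo/vjmo/tjmo, and jamo outside a well-formed L V [T] sequence get
none.

```python
def jamo_class(cp):
    """Classify a code point as 'L', 'V' or 'T' jamo, or None otherwise.

    Assumption: only the basic Hangul Jamo block is handled here; the
    Extended-A/B blocks and tone marks are ignored in this sketch.
    """
    if 0x1100 <= cp <= 0x115F:
        return "L"  # leading consonant (choseong)
    if 0x1160 <= cp <= 0x11A7:
        return "V"  # vowel (jungseong)
    if 0x11A8 <= cp <= 0x11FF:
        return "T"  # trailing consonant (jongseong)
    return None


def assign_jmo_features(codepoints):
    """Return one feature tag (or None) per code point.

    Features are assigned only inside an L V [T] sequence; lone jamo and
    broken sequences such as L T or V T are left without any *jmo feature.
    """
    classes = [jamo_class(cp) for cp in codepoints]
    features = [None] * len(codepoints)
    i = 0
    while i < len(classes):
        if classes[i] == "L" and i + 1 < len(classes) and classes[i + 1] == "V":
            features[i] = "ljmo"
            features[i + 1] = "vjmo"
            i += 2
            # An optional trailing consonant completes the syllable.
            if i < len(classes) and classes[i] == "T":
                features[i] = "tjmo"
                i += 1
        else:
            # Not the start of a valid syllable: skip without tagging.
            i += 1
    return features
```

With this, the sequence 1101,1161,11f0 gets ljmo/vjmo/tjmo, while the lone V
(1161) and the VT pair (1161,11f0) from the examples quoted below get no
features at all.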

Since you are hacking on this already and I'm at linux.conf.au, would you mind
implementing the remaining bits (tone-marks and old jamo)?

I was thinking about moving this to a ragel-based machine.  That can wait
until we get the logic right though.

Cheers,
behdad

> On 6/1/14 14:44, Jonathan Kew wrote:
>> The Hangul shaper should NOT apply the *jmo features to glyphs that are
>> not part of a properly-structured Korean syllable.
>>
>> Some examples of current behavior:
>>
>> [GOOD: complete LVT sequence, proper features applied]
>> hb-unicode-encode 1101,1161,11f0 | hb-shape UnBatang_0613.ttf
>> [uni1101.ljmo01=0+1024|uni1161.vjmo01=1+0|uni11F0.tjmo01=2+0]
>>
>> [GOOD: lone L does not have features applied]
>> hb-unicode-encode 1101 | hb-shape UnBatang_0613.ttf
>> [uni1101=0+1024]
>>
>> [GOOD: LT without V is not valid, don't apply features]
>> hb-unicode-encode 1101,11f0 | hb-shape UnBatang_0613.ttf
>> [uni1101=0+1024|uni11F0=1+0]
>>
>> [GOOD: lone T, don't apply features]
>> hb-unicode-encode 11f0 | hb-shape UnBatang_0613.ttf
>> [uni11F0=0+0]
>>
>> [BAD: lone V, shouldn't apply vjmo]
>> hb-unicode-encode 1161 | hb-shape UnBatang_0613.ttf
>> [uni1161.vjmo02=0+0]
>>
>> [BAD: VT pair without L is not valid, shouldn't apply *jmo]
>> hb-unicode-encode 1161,11f0 | hb-shape UnBatang_0613.ttf
>> [uni1161.vjmo01=0+0|uni11F0.tjmo01=1+0]
>>
>> Note the last two examples; these are incorrect, IMO. In both these
>> cases, Uniscribe does not apply the *jmo features.
>>
>> JK
> 

-- 
behdad
http://behdad.org/

