[HarfBuzz] hangul shaper patches

Dohyun Kim nomosnomos at gmail.com
Thu Jan 23 02:24:23 PST 2014


Thank you for the detailed explanation, Jonathan.
I have forwarded your last email to the author of the jieubsida fonts.

Best,

2014/1/23 Jonathan Kew <jfkthame at googlemail.com>:
> On 23/1/14 03:39, Dohyun Kim wrote:
>>
>> I've just found that the jieubsida fonts [1] from the Tsukurimashou Font
>> Project [2] do not work well with the current hangul shaper.
>>
>> ~$ hb-unicode-encode AC00 | hb-shape --script=hang JieubsidaBatang.otf
>> [uni1100=0+0|uni1161=0+833]
>>
>> Expected output is:
>>
>> ~$ hb-unicode-encode AC00 | hb-shape --script=latn --features=ljmo,vjmo JieubsidaBatang.otf
>> [uniAC00=0+833]
>>
>> The reason seems to be that the hangul shaper is currently applying the
>> *jmo features too early. The author of the jieubsida fonts intended the
>> *jmo features to be applied after the ccmp feature, and arranged the
>> order of the GSUB lookup tables accordingly. But the hangul shaper is
>> applying the *jmo features before everything else.
>
>
> I don't think that's quite accurate; rather, the issue occurs because the
> hangul shaper isn't applying *jmo features to glyphs that result from ccmp
> decomposition. And then because the *jmo features haven't been applied to
> choose contextual forms of the jamos, the ligature that was expected to
> re-compose the syllable doesn't match either. See below.
>
>
>>
>> I am curious what the output is on a Windows 8 machine, which is
>> not available to me right now.
>
>
> With --shaper=uniscribe on a Win8 machine, I get the "incorrect" output
> [uni1100=0+0|uni1161=0+833], matching harfbuzz behavior.
>
> So I think this is a font error. The font is using ccmp to decompose the
> syllable AC00 into L and V jamos, but then expecting the shaper to apply
> *jmo features to the resulting glyphs. That doesn't work, because
> decomposing via ccmp has no awareness of the hangul-specific syllable
> structure.
>
> (Then, after choosing contextual forms of the jamos, it expects to use liga
> to reassemble them into the single glyph for the syllable.)
>
> A syllable such as AC00 will be decomposed into jamos *if necessary* by code
> within the shaper itself, in which case it will also apply features
> appropriately. The font should *not* use the generic ccmp feature to
> decompose it, unless it intends to do *everything* using generic global
> features, not the hangul-specific features.
>
> I guess this font used to work because the old "dumb" hangul shaper applied
> the *jmo features globally, but this is not how they're intended to be used,
> and is not how uniscribe works. The shaper is now applying the features
> selectively, as intended.
>
> So the font is using the wrong strategy. It should be simplified to remove
> the syllable decompositions from ccmp; that's handled by the shaper itself.
> (And it doesn't need the liga feature to reassemble the original syllables,
> either, as the shaper won't decompose them unless actually necessary, e.g.
> to support an <LV, T> sequence.)
>
> JK
>
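
For reference, the L and V jamos in the hb-shape output above (uni1100,
uni1161) are what the standard Unicode Hangul syllable arithmetic yields
for AC00. The following is a minimal Python sketch of that arithmetic, not
HarfBuzz's actual code; the function names are made up for illustration,
and the constants are those of the Unicode Standard's Hangul syllable
decomposition algorithm.

```python
# Constants from the Unicode Standard's Hangul syllable algorithm.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT   # 588 syllables per leading consonant
S_COUNT = L_COUNT * N_COUNT   # 11172 precomposed syllables


def decompose_hangul_syllable(cp):
    """Return the jamo code points for a precomposed syllable.

    Hypothetical helper: code points outside the precomposed syllable
    block are returned unchanged as a one-element list.
    """
    s_index = cp - S_BASE
    if not 0 <= s_index < S_COUNT:
        return [cp]
    jamos = [
        L_BASE + s_index // N_COUNT,              # leading consonant (L)
        V_BASE + (s_index % N_COUNT) // T_COUNT,  # vowel (V)
    ]
    t_index = s_index % T_COUNT
    if t_index:                                   # trailing consonant (T), if any
        jamos.append(T_BASE + t_index)
    return jamos


def compose_lv_t(lv, t):
    """Combine an LV syllable and a trailing jamo into an LVT syllable.

    Hypothetical helper: returns None when the pair is not an <LV, T>
    sequence that has a precomposed form.
    """
    s_index = lv - S_BASE
    is_lv = 0 <= s_index < S_COUNT and s_index % T_COUNT == 0
    if is_lv and T_BASE < t < T_BASE + T_COUNT:
        return lv + (t - T_BASE)
    return None


if __name__ == "__main__":
    print(["U+%04X" % cp for cp in decompose_hangul_syllable(0xAC00)])
```

An <LV, T> sequence such as AC00 followed by 11A8 has the precomposed form
AC01 under the same arithmetic, which compose_lv_t(0xAC00, 0x11A8) returns.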



-- 
Dohyun Kim
Seoul, Republic of Korea

