[HarfBuzz] please remove U+115F U+1160 from default_ignorable

Konstantin Ritt ritt.ks at gmail.com
Mon Mar 18 15:27:47 PDT 2013


Hi folks,

With the UnBatang font, I'm getting the following shaping for the
"U+115F U+1161 U+112B U+1160" sample:
[uni0020=1+0|uni1161.vjmo02=2+0|uni112B.ljmo05=3+1000|uni0020=4+0]
which is wrong since it hides the first syllable (also note the cluster
mapping; I'd expect 2 clusters rather than 4).
When U+115F and U+1160 are removed from the default ignorables, the
shaping looks more correct for the sample text:
[uni115F.ljmo04=1+1000|uni1161.vjmo02=2+0|uni112B.ljmo05=3+1000|uni1160.vjmo02=4+0]
(ditto about the cluster mapping).
However, when a filler appears in isolated form, it doesn't get
shaped as non-advancing, which is wrong too (AFAIK).

Also, with the JieubsidaBatang font, I'm getting an extra advance on the right:
[uni115F.lj4=1+0|uni1161.nt=2+833|uni112B.lj5=3+0|uni1160.nt=4+833]
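
To compare these dumps mechanically, here is a minimal Python sketch. It
assumes every item has the "name=cluster+advance" form shown in the dumps
above; parse_dump and total_advance are hypothetical helper names, not part
of HarfBuzz.

```python
import re

# One item of a dump, e.g. "uni112B.ljmo05=3+1000" (assumed format:
# glyph name, "=", cluster index, "+", horizontal advance).
ITEM = re.compile(r"^(?P<glyph>[^=]+)=(?P<cluster>\d+)\+(?P<advance>-?\d+)$")


def parse_dump(dump):
    """Turn "[a=1+0|b=2+1000]" into a list of (glyph, cluster, advance)."""
    items = []
    for part in dump.strip().strip("[]").split("|"):
        match = ITEM.match(part)
        if match is None:
            raise ValueError("unexpected item: %r" % part)
        items.append(
            (match["glyph"], int(match["cluster"]), int(match["advance"]))
        )
    return items


def total_advance(items):
    """Sum of the advances of all glyphs in the run."""
    return sum(advance for _, _, advance in items)
```

On the three dumps quoted in this message, the total advance comes out as
1000 (UnBatang, fillers ignorable), 2000 (UnBatang, fillers not ignorable)
and 1666 (JieubsidaBatang).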

IIUC, a correct solution would be to determine the Hangul clusters and then
assign the "accumulated" cluster advance to the cluster base prior to
hiding default ignorables. Correct me if I'm wrong.
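
A rough Python sketch of that idea follows. It is not HarfBuzz code: it works
on code points with one advance per code point, only knows the U+1100..U+11FF
Jamo block, and uses a simplified "leading+ vowel+ trailing*" cluster rule.
All function names are made up for illustration, and the hiding step itself
is not modeled.

```python
def jamo_class(cp):
    """Classify a code point from the U+1100..U+11FF Hangul Jamo block."""
    if 0x1100 <= cp <= 0x115F:   # leading consonants, incl. U+115F filler
        return "L"
    if 0x1160 <= cp <= 0x11A7:   # vowels, incl. U+1160 filler
        return "V"
    if 0x11A8 <= cp <= 0x11FF:   # trailing consonants
        return "T"
    return None


def split_clusters(cps):
    """Return (start, end) index ranges, one per cluster (simplified rule)."""
    clusters = []
    i, n = 0, len(cps)
    while i < n:
        start = i
        if jamo_class(cps[i]) == "L":
            while i < n and jamo_class(cps[i]) == "L":
                i += 1
            if i < n and jamo_class(cps[i]) == "V":
                while i < n and jamo_class(cps[i]) == "V":
                    i += 1
                while i < n and jamo_class(cps[i]) == "T":
                    i += 1
        else:
            i += 1               # anything else is a cluster of its own
        clusters.append((start, i))
    return clusters


def accumulate_advances(cps, advances):
    """Move each cluster's summed advance onto its first (base) element.

    Returns (new_advances, cluster_ids); cluster_ids[i] is the index of the
    base of the cluster that element i belongs to.
    """
    new_advances = list(advances)
    cluster_ids = [0] * len(cps)
    for start, end in split_clusters(cps):
        new_advances[start] = sum(advances[start:end])
        for j in range(start, end):
            cluster_ids[j] = start
            if j != start:
                new_advances[j] = 0
    return new_advances, cluster_ids
```

With the sample text and the JieubsidaBatang advances (0, 833, 0, 833), this
yields two clusters with advances (833, 0, 833, 0). Whatever hides the
fillers afterwards would then have to keep the advance on the base.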

regards,
Konstantin


2013/3/16 Dohyun Kim <nomosnomos at gmail.com>:
> OK.  Attached is a pdf file showing the difference between the two xetex
> versions.  The result of the old version, which uses the ICU layout engine,
> is the correct one.  I have drawn boundary lines around the sample texts,
> so that we can easily catch the difference.
>
> 2013/3/16 Konstantin Ritt <ritt.ks at gmail.com>:
>> Hi,
>>
>> Can you attach screenshots of these characters rendered with both
>> old and new XeTeX and some of the mentioned fonts?
>> A screenshot of the "U+115F U+1161 U+112B U+1160" sample could be useful too.
>>
>> Konstantin
>>
>>
>> 2013/3/16 Dohyun Kim <nomosnomos at gmail.com>:
>>> Hi,
>>>
>>> While testing the new version of xetex, which uses harfbuzz-ng for
>>> opentype rendering, I have encountered a serious issue with Hangul Jamo
>>> typesetting.  The reason is that U+115F and U+1160 are listed among the
>>> "default_ignorable" code points in hb-unicode-private.hh.
>>>
>>> Certainly, according to the Unicode standard, these two characters have
>>> the Default_Ignorable_Code_Point property.  However, although the exact
>>> meaning of "default ignorable code point" is not always clear to me, I am
>>> 100% sure that these two characters should not be ignored in opentype
>>> rendering.
>>>
>>> Every Hangul font currently available gives wrong output with the current
>>> version of harfbuzz-ng.  Take any font supporting Hangul Jamo, e.g.
>>> malgun.ttf in windows 8, jieubsida otf at
>>> http://sourceforge.jp/projects/tsukurimashou/, unbatang ttf at
>>> http://kldp.net/projects/unfonts/, or the HCR-LVT fonts, which are
>>> currently not accessible but were available at
>>> http://ftp.ktug.or.kr/KTUG/hcr-lvt/.  Then run hb-shape --script=hang
>>> with the input string "U+115F U+1161 U+112B U+1160".  We get three
>>> zero-width glyphs instead of two; this is wrong.
>>>
>>> So please remove U+115F and U+1160 from default_ignorable code points,
>>> whatever the unicode standard says about them.
>>>
>>> Regards,
>>> --
>>> Dohyun Kim
>>> College of Law, Dongguk University
>>> Seoul, Republic of Korea
>>> _______________________________________________
>>> HarfBuzz mailing list
>>> HarfBuzz at lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/harfbuzz
>
>
>
> --
> Dohyun Kim
> College of Law, Dongguk University
> Seoul, Republic of Korea
>


