[HarfBuzz] please remove U+115F U+1160 from default_ignorable

Konstantin Ritt ritt.ks at gmail.com
Mon Mar 18 18:48:53 PDT 2013


> If "hiding default ignorable" means substituting by zero-width space,
> then the pdf file produced by xetex would not contain U+115F and
> U+1160, resulting in invalid text string when texts are extracted from
> the pdf.

Not exactly. As far as I understand, the correct output for the UnBatang
font should be
[uni115F.ljmo04=1+1000|uni1161.vjmo02=1+0|uni112B.ljmo05=2+1000|uni1160.vjmo02=2+0]

And if we omit default ignorables that are part of a valid cluster,
then the output for the JieubsidaBatang font would be
[uni115F.lj4=1+0|uni1161.nt=1+833|uni112B.lj5=2+0|uni1160.nt=2+833]
without having to "accumulate" the overall cluster advance in the cluster base.
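
To make the comparison above concrete, here is a minimal Python sketch
that parses the two glyph strings and sums the advances per cluster. It
assumes the usual name=cluster+advance notation (with an optional
@x,y offset part); the function names are made up for illustration and
are not part of HarfBuzz.

```python
import re
from collections import OrderedDict

# One glyph entry: name=cluster[@x_offset,y_offset]+advance
# (assumed notation; the offset part is optional and ignored here).
GLYPH_RE = re.compile(
    r"^(?P<name>[^=]+)=(?P<cluster>\d+)"
    r"(?:@-?\d+,-?\d+)?"
    r"\+(?P<advance>-?\d+)$"
)

def parse_shape_output(text):
    """Return a list of (glyph_name, cluster, advance) tuples."""
    glyphs = []
    for item in text.strip().strip("[]").split("|"):
        match = GLYPH_RE.match(item)
        if match is None:
            raise ValueError("unrecognized glyph entry: %r" % item)
        glyphs.append((match.group("name"),
                       int(match.group("cluster")),
                       int(match.group("advance"))))
    return glyphs

def cluster_advances(text):
    """Sum the advances of all glyphs that share a cluster value."""
    totals = OrderedDict()
    for _name, cluster, advance in parse_shape_output(text):
        totals[cluster] = totals.get(cluster, 0) + advance
    return dict(totals)

UNBATANG = ("[uni115F.ljmo04=1+1000|uni1161.vjmo02=1+0|"
            "uni112B.ljmo05=2+1000|uni1160.vjmo02=2+0]")
JIEUBSIDA = ("[uni115F.lj4=1+0|uni1161.nt=1+833|"
             "uni112B.lj5=2+0|uni1160.nt=2+833]")

if __name__ == "__main__":
    print(cluster_advances(UNBATANG))
    print(cluster_advances(JIEUBSIDA))
```

In both cases every cluster carries a single non-zero advance; the fonts
only differ in which glyph of the cluster holds it (the leading
consonant for UnBatang, the vowel for JieubsidaBatang).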

> Only isolated filler may be ignored. Anyway this is illegal input string.

Indeed. That's why "placeholder" characters like the Hangul fillers are
default_ignorables.
However, we have an option to preserve default ignorables for the case
when the rendering system is able to show something useful for
broken input text or missing glyphs (e.g. to show the vowel on top of
a dotted square to indicate that the leading consonant is missing).
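
For illustration only, the distinction between a filler inside a
cluster and an isolated one could be sketched in Python as below. This
is a simplified check based on the Unicode conjoining jamo ranges, not
what HarfBuzz actually does: it only looks at the immediate neighbour,
and the function names are hypothetical.

```python
CHOSEONG_FILLER = "\u115F"   # HANGUL CHOSEONG FILLER (leading position)
JUNGSEONG_FILLER = "\u1160"  # HANGUL JUNGSEONG FILLER (vowel position)

def is_leading_jamo(ch):
    """Leading consonants: U+1100..U+115F and U+A960..U+A97C."""
    cp = ord(ch)
    return 0x1100 <= cp <= 0x115F or 0xA960 <= cp <= 0xA97C

def is_vowel_jamo(ch):
    """Vowels: U+1160..U+11A7 and U+D7B0..U+D7C6."""
    cp = ord(ch)
    return 0x1160 <= cp <= 0x11A7 or 0xD7B0 <= cp <= 0xD7C6

def classify_fillers(text):
    """Report each Hangul filler as 'in cluster' or 'isolated'.

    Simplification: a leading filler counts as 'in cluster' when a vowel
    jamo follows it directly, and a vowel filler when a leading jamo
    precedes it directly.
    """
    result = []
    for i, ch in enumerate(text):
        if ch == CHOSEONG_FILLER:
            in_cluster = i + 1 < len(text) and is_vowel_jamo(text[i + 1])
        elif ch == JUNGSEONG_FILLER:
            in_cluster = i > 0 and is_leading_jamo(text[i - 1])
        else:
            continue
        label = "in cluster" if in_cluster else "isolated"
        result.append((i, "U+%04X" % ord(ch), label))
    return result

if __name__ == "__main__":
    # Code points implied by the glyph names in the outputs quoted above.
    print(classify_fillers("\u115F\u1161\u112B\u1160"))
    # A vowel filler with no leading consonant before it.
    print(classify_fillers("a\u1160"))
```

Only the fillers reported as isolated would be candidates for hiding;
the ones inside a cluster have to stay in the output for the text to
remain valid.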

regards,
Konstantin


