[HarfBuzz] Harfbuzz Sinhala (si) script support status update

Behdad Esfahbod behdad at behdad.org
Wed Jul 25 13:49:07 PDT 2012


On 07/25/2012 03:17 PM, Harshula wrote:
> 
> Here are more details about the problem. The new shaper renders කො
> (ko) incorrectly with FreeSerif and LKLUG fonts but renders correctly
> with Bhashitha font (IIRC, originated from Windows). The old shaper
> renders the string correctly using all three fonts.
> 
> String: කො
> Unicode Sequence: <U+0D9A,U+0DDC> (consonant + split dependent vowel)
> 
> <U+0DDC> = <U+0DD9><U+0DCF>

OK, that explains it.  For Sinhala, Uniscribe follows the Khmer spec, not the
Indic spec (which I agree is unfortunate).  So instead of decomposing split
matras according to Unicode, it emits the left part and then keeps the
original code point for the remaining (right) part of the matra.

What that means is that it does:

<U+0DDC> = <U+0DD9><U+0DDC>

Now, if the font positions U+0DCF correctly, decomposing the Unicode way
should be enough.
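
To make the two decompositions concrete, here is a small Python sketch.  It is
not HarfBuzz code.  The canonical mapping is read from Python's unicodedata
module; the Uniscribe-style mapping is built as described above.  The names
SPLIT_MATRAS and uniscribe_style_split are made up for illustration, and only
U+0DDC is listed because it is the only character discussed here.

```python
import unicodedata

# Assumption: only the split matra discussed in this thread is listed.
SPLIT_MATRAS = {0x0DDC}

def canonical_split(cp):
    """Return the Unicode canonical decomposition of cp as code points."""
    fields = unicodedata.decomposition(chr(cp)).split()
    # Compatibility mappings start with a "<tag>"; those are not canonical.
    if not fields or fields[0].startswith("<"):
        return [cp]
    return [int(f, 16) for f in fields]

def uniscribe_style_split(cp):
    """Hypothetical: emit the left part, then keep the original code point."""
    parts = canonical_split(cp)
    if cp in SPLIT_MATRAS and len(parts) == 2:
        return [parts[0], cp]
    return parts

def fmt(cps):
    """Format code points the way they are written in this thread."""
    return "<" + ",".join("U+%04X" % c for c in cps) + ">"

print("Unicode:  ", fmt(canonical_split(0x0DDC)))        # <U+0DD9,U+0DCF>
print("Uniscribe:", fmt(uniscribe_style_split(0x0DDC)))  # <U+0DD9,U+0DDC>
```

A font that only has rules for U+0DD9 followed by U+0DCF will not match the
second sequence, which is consistent with the rendering difference reported
above.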

The new shaper follows Uniscribe here.  You can disable that by removing a
few lines from hb-unicode.cc.  Just search for DDC.  That would make the new
shaper match HarfBuzz-old.  That said, I consider fonts broken if they don't
work with Uniscribe.

So, in this particular case, it may make sense to limit the Uniscribe behavior
to the uniscribe-bug-compatibility mode.  Jonathan, what do you think?
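
Roughly, the proposal would look like the following Python sketch.  This is an
illustration only, not the code in hb-unicode.cc: the function prepare, the
flag name uniscribe_bug_compatible, and the table LEFT_PART are assumptions,
and the reordering a shaper does afterwards is not modeled.

```python
import unicodedata

# Assumption: split matra -> its left part.  Only the character from this
# thread is listed.
LEFT_PART = {0x0DDC: 0x0DD9}

def prepare(text, uniscribe_bug_compatible=False):
    """Return the code points handed to the font, before any reordering."""
    out = []
    for ch in text:
        cp = ord(ch)
        if uniscribe_bug_compatible and cp in LEFT_PART:
            # Uniscribe style: left part, then the original code point.
            out += [LEFT_PART[cp], cp]
        else:
            # Default: plain Unicode canonical decomposition.
            out += [ord(c) for c in unicodedata.normalize("NFD", ch)]
    return out

ko = "\u0D9A\u0DDC"  # the string from the report
print(["%04X" % c for c in prepare(ko)])
print(["%04X" % c for c in prepare(ko, uniscribe_bug_compatible=True)])
```

With the flag off, the font sees U+0D9A, U+0DD9, U+0DCF; with it on, it sees
U+0D9A, U+0DD9, U+0DDC.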

behdad

> NOTE: It appears LKLUG (using 'liga') and FreeSerif (using multiple
> subs) construct U+0DDC from U+0DD9 and U+0DCF. However, Bhashitha
> appears to deconstruct U+0DDC to form U+0DCF. I'm not good with font
> rule construction, so it would be advisable for you to inspect the font
> for accurate details.
> 
> Thanks again for adding the old shaper!!!
> 
> cya,


