[HarfBuzz] Harfbuzz Sinhala (si) script support status update

Harshula harshula at gmail.com
Mon Sep 3 00:11:42 PDT 2012


On Wed, 2012-07-25 at 16:49 -0400, Behdad Esfahbod wrote:
> On 07/25/2012 03:17 PM, Harshula wrote:
> > 
> > Here are more details about the problem. The new shaper renders කො
> > (ko) incorrectly with FreeSerif and LKLUG fonts but renders correctly
> > with Bhashitha font (IIRC, originated from Windows). The old shaper
> > renders the string correctly using all three fonts.
> > 
> > String: කො
> > Unicode Sequence: <U+0D9A,U+0DDC> (consonant + split dependent vowel)
> > 
> > <U+0DDC> = <U+0DD9><U+0DCF>
> 
> Ok, that explains it.  What Uniscribe does for Sinhala follows the Khmer spec,
> not the Indic spec (which I agree is unfortunate), so, instead of decomposing
> split matras according to Unicode, it adds the left part, then uses the
> original code for the rest (right) part of the matra.
> 
> What it means is that it does:
> 
> <U+0DDC> = <U+0DD9><U+0DDC>
> 
> Now, if the font has correct positioning for U+0DCF, doing the Unicode way
> should be enough.
> 
> The new shaper follows Uniscribe here.   You can disable that by removing a
> few lines from hb-unicode.cc.  Just search for DDC.  That would make the new
> shaper match HarfBuzz-old.  Though, I consider the fonts broken if they don't
> work with Uniscribe.
> 
> So, in this particular case, it may make sense to limit the Uniscribe behavior
> to the uniscribe-bug-compatibility mode.  Jonathan, what do you think?
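
For reference, the two decompositions discussed above can be compared
with a minimal Python sketch. It is illustrative only and is not
HarfBuzz code: unicode_split() relies on the standard unicodedata
module, while uniscribe_style_split() is a hypothetical helper that
models only the U+0DDC case as described in the quoted text.

```python
import unicodedata

KA = "\u0D9A"          # SINHALA LETTER ALPAPRAANA KAYANNA (ka)
VOWEL_O = "\u0DDC"     # split dependent vowel sign (o)
KOMBUVA = "\u0DD9"     # left part of the split vowel
AELA_PILLA = "\u0DCF"  # right part, per the Unicode decomposition


def unicode_split(text):
    """Decompose according to Unicode canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", text)


def uniscribe_style_split(text):
    """Hypothetical model of the behaviour described in the thread:
    emit the left part, then keep the original code for the rest.
    Only U+0DDC is handled here; this is an assumption for illustration."""
    return "".join(KOMBUVA + ch if ch == VOWEL_O else ch for ch in text)


def codepoints(text):
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    ko = KA + VOWEL_O
    print("Unicode way:    ", codepoints(unicode_split(ko)))
    print("Uniscribe style:", codepoints(uniscribe_style_split(ko)))
```

A font whose rules expect <U+0DD9,U+0DCF> would see a different glyph
sequence in the second case, which matches the symptom reported for
FreeSerif and LKLUG.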

I have been considering this conundrum and here are my thoughts:

1) Ideally any Unicode Sinhala font that 'works' with Uniscribe should
just 'work' on GNU/Linux.

2) I tested a few fonts that are available free of charge and that used
to be, or are now, reference fonts. Two of these work correctly with
both the *new* and *old* shapers.

3) Taking (2) into account, it may be possible for existing GNU/Linux
fonts to support both the new and old shaper. However, I would have to
defer that to someone who knows their way around fonts better.

4) The way Uniscribe has implemented this is a bit unexpected. What
happens if Uniscribe changes or fixes this? Do we just blindly follow
Uniscribe?

As you can see, I am undecided on what might be the best direction.

cya,
#

> > NOTE: It appears LKLUG (using 'liga') and FreeSerif (using multiple
> > subs) construct U+0DDC from U+0DD9 and U+0DCF. However, Bhashitha
> > appears to deconstruct U+0DDC to form U+0DCF. I'm not good with font
> > rule construction, so it would be advisable for you to inspect the font
> > for accurate details.
> > 
> > Thanks again for adding the old shaper!!!
> > 
> > cya,




