[HarfBuzz] Harfbuzz Sinhala (si) script support status update
Harshula
harshula at gmail.com
Mon Sep 3 00:11:42 PDT 2012
On Wed, 2012-07-25 at 16:49 -0400, Behdad Esfahbod wrote:
> On 07/25/2012 03:17 PM, Harshula wrote:
> >
> > Here are more details about the problem. The new shaper renders කො
> > (ko) incorrectly with FreeSerif and LKLUG fonts but renders correctly
> > with Bhashitha font (IIRC, originated from Windows). The old shaper
> > renders the string correctly using all three fonts.
> >
> > String: කො
> > Unicode Sequence: <U+0D9A,U+0DDC> (consonant + split dependent vowel)
> >
> > <U+0DDC> = <U+0DD9><U+0DCF>
>
> Ok, that explains it. What Uniscribe does for Sinhala follows the Khmer spec,
> not the Indic spec (which I agree is unfortunate), so, instead of decomposing
> split matras according to Unicode, it adds the left part, then uses the
> original code for the rest (right) part of the matra.
>
> What it means is that it does:
>
> <U+0DDC> = <U+0DD9><U+0DDC>
>
> Now, if the font has correct positioning for U+0DCF, doing the Unicode way
> should be enough.
>
> The new shaper follows Uniscribe here. You can disable that by removing a
> few lines from hb-unicode.cc. Just search for DDC. That would make the new
> shaper match HarfBuzz-old. Though, I consider the fonts broken if they don't
> work with Uniscribe.
>
> So, in this particular case, it may make sense to limit the Uniscribe behavior
> to the uniscribe-bug-compatibility mode. Jonathan, what do you think?

I have been considering this conundrum and here are my thoughts:

1) Ideally any Unicode Sinhala font that 'works' with Uniscribe should
just 'work' on GNU/Linux.

2) I tested a few fonts, available free of charge, that are or used to
be reference fonts. Two of these work correctly with both the *new* and
*old* shapers.

3) Taking (2) into account, it may be possible for existing GNU/Linux
fonts to support both the new and old shapers. However, I would have to
defer that to someone who knows their way around fonts better.

4) The way Uniscribe has implemented this is a bit unexpected. What
happens if Uniscribe changes or fixes this? Do we just blindly follow
Uniscribe?

As you can see, I am undecided on what might be the best direction.

cya,
#
> > NOTE: It appears LKLUG (using 'liga') and FreeSerif (using multiple
> > subs) construct U+0DDC from U+0DD9 and U+0DCF. However, Bhashitha
> > appears to deconstruct U+0DDC to form U+0DCF. I'm not good with font
> > rule construction, so it would be advisable for you to inspect the font
> > for accurate details.
> >
> > Thanks again for adding the old shaper!!!
> >
> > cya,
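
For reference, here is a minimal Python sketch contrasting the two
decompositions of U+0DDC discussed in this thread. The Unicode side uses
the standard unicodedata module. The function uniscribe_style_decompose
is a hypothetical helper written only for illustration from the
description above (left part first, then the original code point); it is
not HarfBuzz or Uniscribe code, and it handles only U+0DDC.

```python
import unicodedata

KOMBUVA = "\u0DD9"  # SINHALA VOWEL SIGN KOMBUVA, the left part of the split matra


def unicode_decompose(text):
    """Canonical (NFD) decomposition as defined by Unicode."""
    return unicodedata.normalize("NFD", text)


def uniscribe_style_decompose(text):
    """Hypothetical illustration: emit the left part, then keep the
    original code point for the right part (only U+0DDC is handled)."""
    return "".join(KOMBUVA + ch if ch == "\u0DDC" else ch for ch in text)


def as_codepoints(text):
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


ko = "\u0D9A\u0DDC"  # consonant + split dependent vowel
print(as_codepoints(unicode_decompose(ko)))          # U+0D9A U+0DD9 U+0DCF
print(as_codepoints(uniscribe_style_decompose(ko)))  # U+0D9A U+0DD9 U+0DDC
```

In the first case the font has to position U+0DCF correctly; in the
second it has to handle U+0DDC appearing after U+0DD9, which matches
the different font behaviours described in the NOTE above.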