[HarfBuzz] syllable boundaries and indic features

Jonathan Kew jfkthame at googlemail.com
Mon Sep 30 05:29:56 PDT 2013


Another issue for us to look at in Paris... writing it up now so we 
don't forget.

It looks to me like we may need to constrain Indic GSUB features to 
syllable boundaries more consistently (sadly). Failure to follow 
Uniscribe on this is leading to some poor results with NotoSansTelugu.

Testcase:
U+0C38,U+0C42,U+0C15,U+0C4D,U+0C37,U+0C4D,U+0C2E
using NotoSansTelugu-Regular.ttf.

Expected:
[gid54=0+1437|gid61=0+1368|gid101=2+1065|gid536=2+827]

Current HB output:
[gid54=0+1437|gid61=0+1368|gid101=2+1065|gid494=2+530]
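
To compare the two outputs mechanically, something like the following
could be used. It is only a sketch: it assumes every item has the plain
name=cluster+advance form shown above (no offsets), and parse_glyphs and
first_difference are names made up for this illustration, not part of
HarfBuzz.

```python
import re
from itertools import zip_longest

# One item looks like "gid54=0+1437": glyph name, cluster, advance.
ITEM = re.compile(r"(?P<glyph>[^=|\[\]]+)=(?P<cluster>\d+)\+(?P<advance>-?\d+)")


def parse_glyphs(line):
    """Return (glyph, cluster, advance) tuples from a bracketed glyph run."""
    return [(m["glyph"], int(m["cluster"]), int(m["advance"]))
            for m in ITEM.finditer(line)]


def first_difference(expected, actual):
    """Return (index, expected_item, actual_item) for the first mismatch, or None."""
    pairs = zip_longest(parse_glyphs(expected), parse_glyphs(actual))
    for i, (e, a) in enumerate(pairs):
        if e != a:
            return i, e, a
    return None


EXPECTED = "[gid54=0+1437|gid61=0+1368|gid101=2+1065|gid536=2+827]"
CURRENT = "[gid54=0+1437|gid61=0+1368|gid101=2+1065|gid494=2+530]"

# Only the last glyph differs: gid536 expected, gid494 produced.
DIFF = first_difference(EXPECTED, CURRENT)
```

Only the final glyph differs, which is the below-base form that the
lookups discussed next operate on.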

What seems to be happening:

Initially, the <0C4D, 0C2E> sequence is replaced with glyph 494, by 
'blwf' (lookup 6).

Then glyph 494 is changed to glyph 536, by 'psts' contextual lookup 23, 
which calls substitution lookup 24.

(So far so good.)

But then glyph 536 is changed back(!) to glyph 494, thanks to 'psts' 
contextual lookup 25, which matches a backtrack context that extends 
into the preceding syllable.

Note that if we remove the initial (two-character) syllable group, the 
result is correct, as lookup 25 no longer applies:

U+0C15,U+0C4D,U+0C37,U+0C4D,U+0C2E
[gid101=0+1065|gid536=0+827]

So the problem arises because we don't allow the syllable boundary to 
constrain the application of "presentation form" features such as 'psts'.
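
A toy model of the effect, as a rough sketch rather than HarfBuzz code.
The Glyph class, the apply_backtrack_rule helper and the backtrack
sequence [61, 101] are all invented for illustration; the real context
of lookup 25 is not reproduced here. The only point is that a backtrack
match reaching into the preceding syllable stops matching once lookups
are confined to a single syllable.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    gid: int
    syllable: int  # syllable index assigned before GSUB runs


def apply_backtrack_rule(buf, target, backtrack, replacement, per_syllable):
    """Replace `target` with `replacement` when the preceding glyphs match `backtrack`.

    `backtrack` is given in logical (left-to-right) order for readability.
    With per_syllable=True, a backtrack glyph from a different syllable than
    the target never matches, modelling a syllable-constrained lookup.
    """
    out = [Glyph(g.gid, g.syllable) for g in buf]
    for i, g in enumerate(out):
        if g.gid != target or i < len(backtrack):
            continue
        before = out[i - len(backtrack):i]
        if [b.gid for b in before] != backtrack:
            continue
        if per_syllable and any(b.syllable != g.syllable for b in before):
            continue
        g.gid = replacement
    return out


def gids(buf):
    return [g.gid for g in buf]


# Buffer state after lookups 6, 23 and 24: two syllables, last glyph is 536.
FULL = [Glyph(54, 0), Glyph(61, 0), Glyph(101, 1), Glyph(536, 1)]
# Same text without the initial two-character syllable.
SHORT = [Glyph(101, 0), Glyph(536, 0)]

# Hypothetical stand-in for lookup 25: 536 -> 494 after [61, 101].
UNCONSTRAINED = gids(apply_backtrack_rule(FULL, 536, [61, 101], 494, False))
CONSTRAINED = gids(apply_backtrack_rule(FULL, 536, [61, 101], 494, True))
SHORT_RESULT = gids(apply_backtrack_rule(SHORT, 536, [61, 101], 494, False))
```

In this model the unconstrained run ends in 494 (the current HB
result), while the constrained run and the shortened string both keep
536, matching the expected output.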

A possible patch is attached; it delays clearing the syllable info 
until after all the GSUB Indic features, not only the "basic" ones.

We do still need to clear syllables before GPOS, as it appears 
positioning features like 'kern' do get applied across syllable 
boundaries in Uniscribe.
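
The ordering change can be pictured roughly as below. This is a
schematic only, not the attached patch: the feature lists are cut down
to the tags mentioned in this message, and build_plan,
syllable_constrained and the CLEAR marker are hypothetical names.

```python
# Illustrative subsets only; the real shaper applies many more features.
BASIC_GSUB = ["blwf"]
PRESENTATION_GSUB = ["psts"]
GPOS = ["kern"]
CLEAR = "<clear syllables>"  # hypothetical marker for the clearing step


def build_plan(clear_after_all_gsub):
    """Return the stage order, with the syllable-clearing step placed either
    right after the basic features or after every GSUB feature."""
    plan = list(BASIC_GSUB)
    if not clear_after_all_gsub:
        plan.append(CLEAR)
    plan += PRESENTATION_GSUB
    if clear_after_all_gsub:
        plan.append(CLEAR)
    plan += GPOS
    return plan


def syllable_constrained(plan):
    """Features that run while syllable info is still present."""
    return plan[:plan.index(CLEAR)]


BEFORE = build_plan(clear_after_all_gsub=False)
AFTER = build_plan(clear_after_all_gsub=True)
```

With the clearing step moved, 'psts' joins the syllable-constrained
set, while 'kern' stays outside it in both orderings.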

-------------- next part --------------
A non-text attachment was scrubbed...
Name: clear-indic-syllables.patch
Type: text/x-patch
Size: 1583 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/harfbuzz/attachments/20130930/09af236f/attachment.bin>

