[HarfBuzz] [Indic] Inspecting the font for consonant position with Free Sans

Behdad Esfahbod behdad at behdad.org
Tue Mar 5 16:58:33 PST 2013

With Free Sans, and a sequence like <U+0924,U+094D,U+0930> (TA,Virama,RA),
HarfBuzz is failing to form the correct conjunct while Uniscribe doesn't.

The problem is: Free Sans has both deva and dev2 tables.  HarfBuzz correctly
chooses dev2.  Now, we then proceed to find the base.  For that, we check
whether "Virama,RA" matches the blwf feature.  It doesn't.  As such we take RA
to be the syllable base, and things go downhill from there.

Uniscribe somehow figures it out.  Assuming that Uniscribe is also using the
dev2 table, I guessed that Free Sans has wrong blwf tables (simply copied over
from the deva table) and as such matches 'RA,Virama' instead of 'Virama,RA'.
Indeed, looking for that makes HarfBuzz correctly shape the sequence.

So, I'm going to change the Indic shaper to look for both sequences, Virama
first and Virama last, when categorizing consonants.  It's a bummer that we
should take such a performance hit because of some broken fonts, but that
shouldn't be a big deal.
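
As a rough illustration of the idea (a toy Python model, not HarfBuzz's
actual code), the sketch below represents a font's blwf feature as the set of
input sequences it would match.  All function names, the try_both_orders
flag, and the simplified backwards walk for finding the base are assumptions
made for this example only.

```python
# Toy model: the 'blwf' feature is modelled as a set of code point
# sequences its lookups would match. Every name here is hypothetical.
TA = 0x0924
RA = 0x0930
VIRAMA = 0x094D


def matches_blwf(blwf_inputs, sequence):
    """Return True if the modelled blwf feature matches the sequence."""
    return tuple(sequence) in blwf_inputs


def is_below_form(blwf_inputs, consonant, try_both_orders=True):
    """Decide whether a consonant takes a below-base form in this font."""
    # Order expected under dev2: Virama comes before the consonant.
    if matches_blwf(blwf_inputs, (VIRAMA, consonant)):
        return True
    # Order used under deva, which a broken font may have copied into dev2.
    if try_both_orders and matches_blwf(blwf_inputs, (consonant, VIRAMA)):
        return True
    return False


def find_base(blwf_inputs, consonants, try_both_orders=True):
    """Simplified base search: walk backwards over the syllable's
    consonants and return the first one that is not a below-base form."""
    for consonant in reversed(consonants):
        if not is_below_form(blwf_inputs, consonant, try_both_orders):
            return consonant
    # Fallback assumption: if everything is below-base, use the first one.
    return consonants[0]


# A font whose dev2 blwf lookup only matches the deva-style order.
broken_font_blwf = {(RA, VIRAMA)}
# A font whose dev2 blwf lookup matches the expected order.
correct_font_blwf = {(VIRAMA, RA)}
```

In this model, checking only the Virama-first order against
broken_font_blwf picks RA as the base for the consonants [TA, RA], while
checking both orders picks TA.  For correct_font_blwf the result is TA either
way, so the extra check costs one more lookup test per consonant but does not
change the outcome for well-formed fonts.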

I can't think of any situation in which this would have unwanted consequences.

Will implement, test, and push.


