[HarfBuzz] 'calt' in Indic shaper

Jonathan Kew jfkthame at googlemail.com
Wed Aug 7 03:45:23 PDT 2013


On 7/8/13 00:22, John Hudson wrote:
> Dear Behdad,
>
>>> So apparently Uniscribe isn't applying 'calt' by default in Gujarati.
>>> That kinda seems reasonable, given that the Indic specs refer to it
>>> as "Discretionary presentation forms", although it differs from
>>> Adobe's registration of the feature (targeted at cursive Latin
>>> fonts), which says it should be active by default.
>
> I don't think that is what 'discretionary' usually means in the
> context of Microsoft's Indic font specs. If 'calt' is not being applied
> in Gujarati, I think that might be a bug and something you should query
> with Andrew Glass (currently responsible for Uniscribe, and cc'd).
>

I'll be interested to hear Andrew's thoughts on this.

A somewhat analogous case might be discretionary features in Arabic, 
where (according to http://support.microsoft.com/kb/2786400 - I didn't 
test this explicitly) the behavior in Win7 was recently changed to NOT 
apply "discretionary ligatures and contextual alternates" by default. 
("Discretionary ligatures" there clearly refers to 'dlig'; I wonder 
whether "contextual alternates" means 'calt', or whether perhaps it 
really refers to 'cswh'?)

The Indic 'calt' discrepancy we ran into that triggered this question
actually appears to be the features-across-cluster-boundaries issue. In
Uniscribe, 'calt' does not apply across cluster boundaries; in HarfBuzz,
it does. This leads to many unexpected differences: e.g. in Gujarati,
the "pseudo-ligation" of <U+0A9C,U+0AAF>, and in Devanagari, the
insertion of a top-bar-extender glyph after many clusters with both an
above-vowel and an anusvara, e.g. in <U+092C,U+0948,U+0902,U+0915>. The
cross-cluster result often looks inferior; the font designer must, I
assume, have intended this insertion for specific problem cases -within-
a cluster, to avoid overcrowding the above marks, but not -between-
clusters.

So as things stand, disabling 'calt' in HarfBuzz significantly improves
our results for Devanagari with Nirmala; however, in Gujarati, while it
fixes the discrepancy for <U+0A9C,U+0AAF>, it introduces new problems
such as <U+0A97,U+0ACD,U+0AA8>, where Uniscribe -is- applying 'calt'
within the cluster to choose an alternate form of the NA glyph.
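
To make the cluster structure of the three sequences quoted above
concrete, here is a rough sketch in Python. It is only an
approximation for illustration: it assumes a new cluster starts at
every letter unless that letter directly follows a virama, which is far
simpler than what Uniscribe or HarfBuzz actually do. The function names
are made up for this example.

```python
import unicodedata

VIRAMA_CCC = 9  # canonical combining class shared by Indic virama signs


def rough_clusters(text):
    """Approximate cluster split (an assumption, not a real shaper's rules):
    a letter starts a new cluster unless it directly follows a virama."""
    clusters = []
    for ch in text:
        is_letter = unicodedata.category(ch).startswith("L")
        after_virama = (
            bool(clusters)
            and unicodedata.combining(clusters[-1][-1]) == VIRAMA_CCC
        )
        if not clusters or (is_letter and not after_virama):
            clusters.append(ch)
        else:
            clusters[-1] += ch
    return clusters


def describe(text):
    """Render each cluster as a list of U+XXXX code point labels."""
    return [[f"U+{ord(c):04X}" for c in cl] for cl in rough_clusters(text)]


for sample in ("\u0A9C\u0AAF", "\u092C\u0948\u0902\u0915", "\u0A97\u0ACD\u0AA8"):
    print(describe(sample))
```

Under this approximation, <U+0A9C,U+0AAF> is two clusters, the first
three code points of <U+092C,U+0948,U+0902,U+0915> form one cluster
with U+0915 starting the next, and <U+0A97,U+0ACD,U+0AA8> is a single
cluster, which matches where the differences described above show up.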

>
> With regard to 'consistency with Uniscribe', I am wary of that as a
> criterion for layout behaviour. Sometimes what Uniscribe does is
> difficult to correlate with the font specs for a given script
> (Malayalam, for instance), and as I documented there are serious issues
> for Indic typography that result from failure to apply lookups across
> cluster boundaries:
> http://www.tiro.com/John/Problems_for_Indic_Typography.pdf

Agreed - and we're certainly prepared to deviate from Uniscribe where it
is clear that its behavior is broken, inadequate or otherwise 
problematic. But OTOH where the specs are underdefined, or where it's 
more a question of picking among alternative models - such as whether a 
given "discretionary" feature is opt-in or opt-out - there's a great 
deal to be said for consistency, and in most cases it should probably 
trump our own preferences, so that font developers can target a single 
pattern of behavior.

In the case of Indic 'calt', applying the feature globally by default is 
problematic because the 'calt' lookups in Nirmala appear to have been 
designed on the assumption that they will -not- take effect across 
cluster boundaries. But disabling it globally is also problematic; the 
only way we can get Nirmala to render as (apparently) intended will be 
if we restrict the 'calt' feature to apply only -within- clusters.
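
As a toy illustration of what "only -within- clusters" would mean, the
following Python sketch applies pair-based contextual substitutions
either freely or with a cluster-boundary check. Everything in it is
hypothetical: the glyph names, the rules, and the function are invented
for this example and are neither Nirmala's actual lookups nor
HarfBuzz's implementation.

```python
def apply_contextual(glyphs, clusters, rules, within_cluster_only):
    """Apply simple 'replace B when it follows A' rules.

    glyphs   -- list of glyph names (hypothetical)
    clusters -- parallel list of cluster ids, one per glyph
    rules    -- list of (first, second, replacement) tuples
    within_cluster_only -- if True, skip matches that span two clusters
    """
    out = list(glyphs)
    for i in range(1, len(glyphs)):
        for first, second, replacement in rules:
            if glyphs[i - 1] != first or glyphs[i] != second:
                continue
            if within_cluster_only and clusters[i - 1] != clusters[i]:
                continue  # context crosses a cluster boundary: do not apply
            out[i] = replacement
    return out


# Hypothetical glyph run: "ja" and "ya" are separate clusters,
# "ga.half" and "na" belong to the same cluster.
glyphs = ["ja", "ya", "ga.half", "na"]
clusters = [0, 1, 2, 2]
rules = [("ja", "ya", "ya.alt"), ("ga.half", "na", "na.alt")]

print(apply_contextual(glyphs, clusters, rules, within_cluster_only=False))
print(apply_contextual(glyphs, clusters, rules, within_cluster_only=True))
```

With the boundary check off, both substitutions fire; with it on, only
the one inside a single cluster does, which is the behavior the
Nirmala lookups appear to assume.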

JK



