[HarfBuzz] Regression in ZWJ handling for Indic CV ligatures

Behdad Esfahbod behdad at behdad.org
Thu Apr 4 11:34:04 PDT 2013


On 13-03-23 04:05 PM, Ian-Mathew Hornburg wrote:
> The recent commit a8cf7b4 seems to have fixed the regression with ZWJs
> in Indic scripts that I recently described and Khaled reported for me
> here [http://lists.freedesktop.org/archives/harfbuzz/2013-March/003035.html],
> but seems to’ve introduced another regression in Indic.
> 
> Relevant background: In Bengali and Oriya (and some other Indic
> scripts, I think, but these’re what I’m personally familiar with)
> certain special consonant-vowel combinations trigger a lookup for
> ligatures. Which ones are included differs from font to font, though good
> fonts do contain them. Unicode also describes a method for *not* selecting
> the ligated form, involving ZWJs and ZWNJs. The chapter on
> Bengali in the Unicode standard describes how the behavior is meant to
> work.
> 
> A given font can choose whether or not to use the ligated forms as the
> default for rendering. If the ligated form is the default, a ZWNJ can
> be inserted between the consonant and vowel to request the non-ligated
> form (e.g., C-ZWNJ-V). If the non-ligated form is the default
> (uncommon, but possible), a ZWJ can be inserted in between to
> explicitly request the ligated form. While it’s not mentioned in
> either the Unicode or OpenType standards, the Oriya script also
> contains many of these special consonant-vowel ligatures, just like
> Bengali.
> 
> While most fonts default to the ligated versions, where available,
> it’s my understanding of the two specs that ZWJs should be able to be
> included *anyway* to explicitly request the CV ligatures, even if it’s
> technically redundant. Testing with 0.9.13 showed correct behavior,
> and the inclusion of the superfluous ZWJs worked just fine with the
> Bengali and Oriya fonts I tested.

0.9.12 behaved how Uniscribe behaves (i.e., doing nothing fancy about joiners).
In 0.9.13 we tried to do something fancy, which improved many cases.
However, we also noticed that unfortunately there are fonts out there that do
things in ways that make automagical joiners impossible.  As such, in 0.9.14
we reverted to the old behavior in the Indic shaper.

So, while you can call it a regression, it really is not.  And so far we have
not found a middle ground.  If we do, we will implement it, but for now, we
handle joiners the same way that Uniscribe does, and that's the best we can do
until we find a way to improve on that without regressing any case.
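
For reference, the three inputs under discussion (C-V, C-ZWNJ-V and C-ZWJ-V)
can be written out as code point sequences.  What follows is only a minimal
Python sketch, not HarfBuzz code: the helper names cv_variants and describe
are hypothetical, and BENGALI LETTER GA + VOWEL SIGN U is an assumed example
pair picked for illustration.  Whether any of these sequences actually
ligates depends on the font and on the shaper.

```python
import unicodedata

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200d"   # ZERO WIDTH JOINER


def cv_variants(consonant, vowel_sign):
    """Build the three consonant-vowel inputs (hypothetical helper)."""
    return {
        "default": consonant + vowel_sign,                   # C-V
        "request_unligated": consonant + ZWNJ + vowel_sign,  # C-ZWNJ-V
        "request_ligated": consonant + ZWJ + vowel_sign,     # C-ZWJ-V
    }


def describe(text):
    """List each code point with its Unicode name (hypothetical helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# Assumed example pair: BENGALI LETTER GA + BENGALI VOWEL SIGN U.
variants = cv_variants("\u0997", "\u09c1")
for label, text in variants.items():
    print(label, describe(text))
```

Feeding each of the three strings to a shaper with the same font, and
comparing the resulting glyphs, shows how that build treats the joiners.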


-- 
behdad
http://behdad.org/


