[HarfBuzz] Contextual shaping of Malayalam post(pre)/below base forms

Richard Wordingham richard.wordingham at ntlworld.com
Wed Jun 26 15:41:13 PDT 2013

On Wed, 26 Jun 2013 11:09:46 -0400
Behdad Esfahbod <behdad at behdad.org> wrote:

> On 13-06-18 06:46 PM, Richard Wordingham wrote:
> > lookup pref_st2
> >     | y1 xx r3 | -- No sequence indices for this context!
> >     | xx r3 |
> This wouldn't work.  You should match y1 in backtrack sequence, not
> input sequence.  Otherwise this is what happens, if you have a
> sequence y1 xx r3:
>   - HB tries applying the lookup at position zero, matches the first
> rule, stops,
>   - It then advances position by one, trying to apply the rule at
> position one, now the second rule matches and ligature is formed.
> For a rule to stop a later one from applying they should apply to the
> same input sequence at the same position, ie, both should have "xx
> r3" as their input, and the first one should have more context in
> terms of backtrack / lookahead...  It can have a longer input
> sequence too, but they should start the same.

Applying the advice above, I tried

lookup pref_st2
   y1 | xx r3 |
      | xx r3 |
         0 pref_lkp1
end lookup

It worked!  That is, <YA, VIRAMA, RA> was not affected by shaping, but
was rendered as three glyphs, as intended.
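
To illustrate why the corrected lookup works, here is a toy model in Python of first-match-wins rule selection at a single position. It is only a sketch, not HarfBuzz code: the function apply_lookup, the ligature name xx_r3 and the consonant k1 are all made up for the example, and a rule with no action is assumed simply to consume its input.

```python
# Toy model of one contextual-substitution lookup; NOT HarfBuzz code.
# Each rule is (backtrack, input, ligature). ligature=None means the rule
# matches but performs no substitution. All names here are hypothetical.
RULES = [
    (["y1"], ["xx", "r3"], None),     # y1 | xx r3 |  blocks the rule below
    ([],     ["xx", "r3"], "xx_r3"),  #    | xx r3 |  forms the ligature
]

def apply_lookup(glyphs, rules):
    out = list(glyphs)
    pos = 0
    while pos < len(out):
        for backtrack, inp, lig in rules:
            start = pos - len(backtrack)
            # Backtrack glyphs must sit immediately before the position.
            if start < 0 or out[start:pos] != backtrack:
                continue
            # Input glyphs must start exactly at the position.
            if out[pos:pos + len(inp)] != inp:
                continue
            # First matching rule wins; later rules are not tried here.
            if lig is not None:
                out[pos:pos + len(inp)] = [lig]
                pos += 1
            else:
                pos += len(inp)
            break
        else:
            pos += 1  # no rule matched at this position
    return out

print(apply_lookup(["y1", "xx", "r3"], RULES))
print(apply_lookup(["k1", "xx", "r3"], RULES))
```

Both rules have the same input, xx r3, so they compete at the same position (the virama), and the rule with the y1 backtrack is tried first.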

I have two follow-up remarks, because the actual failure mechanism is
not the one Behdad describes.  Even so, it does seem prudent to start
the input context at the virama.

Firstly, I don't understand the advancing rule.  Given an input
sequence y1 xx r3 u2 and the contexts | y1 xx r3 | and | xx r3 |, what
I had expected is:

A1) The shaper tries applying the lookup at y1, matches the first rule,
and performs no substitution.

A2) In accordance with the paragraph, "Once the substitutions are
complete, the client should move to the glyph position immediately
following the matched input sequence and resume the lookup process from
there", the lookup then advances to u2, finds no match, and does not
form the ligature.

Am I misunderstanding the specification?  Does the specification comply
with Uniscribe?  (I've not been having much success with my experiments
with Uniscribe, so I can't yet answer such questions myself.)
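
The two advancing behaviours in question can be compared with a small Python sketch. It is a toy model under my own assumptions, not HarfBuzz or Uniscribe code: apply_lookup, its advance_past_match flag and the ligature name xx_r3 are hypothetical, and both rules put all their glyphs in the input sequence, as in my original lookup.

```python
# Toy comparison of two ways of advancing after a match; NOT HarfBuzz code.
# Each rule is (input, ligature); ligature=None means "match, do nothing".
RULES = [
    (["y1", "xx", "r3"], None),     # | y1 xx r3 |  no substitution
    (["xx", "r3"], "xx_r3"),        # | xx r3 |     form the ligature
]

def apply_lookup(glyphs, rules, advance_past_match):
    out = list(glyphs)
    pos = 0
    while pos < len(out):
        step = 1  # default: move on by one glyph
        for inp, lig in rules:
            if out[pos:pos + len(inp)] == inp:
                if lig is not None:
                    out[pos:pos + len(inp)] = [lig]
                    matched_len = 1  # the ligature is now a single glyph
                else:
                    matched_len = len(inp)
                if advance_past_match:
                    # Resume immediately after the matched input sequence.
                    step = matched_len
                break  # first matching rule wins at this position
        pos += step
    return out

seq = ["y1", "xx", "r3", "u2"]
print(apply_lookup(seq, RULES, advance_past_match=True))
print(apply_lookup(seq, RULES, advance_past_match=False))
```

With advance_past_match=True the model follows steps A1 and A2 and leaves the sequence alone; with False it advances by one as Behdad describes, and the second rule forms the ligature at xx.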

Secondly, what actually happens with my unsuccessful idea in 0.9.18 is
as follows.

B1) The shaper tries applying the lookup at y1.  It checks that:

(i)  y1 is in the coverage of the lookup.  This quick test, which is
necessary but possibly not sufficient, is passed.

(ii) the <pref> feature applies a substitution to the sequence <y1,
xx>.  This test fails.

There is therefore no match.

B2) The position advances by one to xx as no match was found at y1.  A
ligature is then formed, and my lookup design fails.

A bleeding context | y1 xx l3 | (for YA, VIRAMA, LA) in the <blwf>
feature would similarly not be matched by a sequence <y1, xx, l3>, for
HarfBuzz will check that y1 is subsequent to the base character.  I
believe, but have not checked, that using the context y1 | xx l3 | will
bleed the formation of the subscript from <xx, l3>.

I haven't yet worked out how HarfBuzz distinguishes a pre-base <VIRAMA,
RA> ligature formed by the <pref> feature from a post-base <VIRAMA, RA>
ligature formed by the <pstf> feature, or indeed whether it does if a
font has both. (I don't know whether there is a deterministic style that
has both.)

