[HarfBuzz] shaping of U+06C2 [uniscribe bug]

Jonathan Kew jfkthame at googlemail.com
Mon Aug 19 03:00:23 PDT 2013


A further harfbuzz-vs-uniscribe discrepancy that I'm seeing is the 
shaping of the Arabic-script character U+06C2. Although (AFAIK) this 
character is normally used only in final position, it is classified by 
the Unicode standard as dual-joining (see ArabicShaping.txt), and 
therefore causes any following letter to take a right-linking (final or 
medial) form.

In older versions of Unicode, U+06C2 was classified as right-joining, 
and so did not affect the form of a following letter. It looks like this 
change (for consistency with the behavior of its canonical decomposition 
<06C1 0654>) was made in Unicode 4.1.
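
The decomposition mentioned above can be checked with a few lines of Python. This is only a quick sanity check using the standard `unicodedata` module; the variable names are mine, not anything from HarfBuzz or Uniscribe.

```python
import unicodedata

# U+06C2 ARABIC LETTER HEH GOAL WITH HAMZA ABOVE
HEH_GOAL_WITH_HAMZA = "\u06C2"

# NFD applies the canonical decomposition from the Unicode Character Database.
decomposed = unicodedata.normalize("NFD", HEH_GOAL_WITH_HAMZA)
codepoints = [f"U+{ord(ch):04X}" for ch in decomposed]

print(codepoints)  # ['U+06C1', 'U+0654']
```

Since U+06C1 is dual-joining and U+0654 is a combining mark, the decomposed sequence joins on both sides, which is the behavior the precomposed character was changed to match.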

Unfortunately, uniscribe still (eight years later, even on Win8) seems 
to be treating U+06C2 as right-joining (and the fonts shipped with 
Windows support it only as such, lacking initial and medial forms).

This means that users are liable to omit the space or zero-width 
non-joiner (U+200C) that should be included in a phrase such as 
<U+062F,U+0631,U+062C,U+06C2,U+062D,U+0631,U+0627,U+0631,U+062A>, as 
uniscribe breaks the joining after U+06C2 even when a letter follows. 
When such text is rendered by a system that follows Unicode 4.1 or 
later, unwanted cursive joining will occur.
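
To make the difference concrete, here is a rough sketch of the joining logic applied to the phrase above, once with U+06C2 as dual-joining and once as right-joining. It is a simplified model, not HarfBuzz's or Uniscribe's actual code: the joining types are copied by hand from ArabicShaping.txt for just these letters, the function and table names are hypothetical, and transparent marks, ZWJ/ZWNJ and non-Arabic characters are ignored.

```python
# Joining types for the letters in the example phrase, per ArabicShaping.txt:
# "D" = dual-joining, "R" = right-joining (joins only to the preceding letter).
JOINING_TYPE = {
    0x062F: "R",  # DAL
    0x0631: "R",  # REH
    0x062C: "D",  # JEEM
    0x06C2: "D",  # HEH GOAL WITH HAMZA ABOVE (Unicode 4.1 and later)
    0x062D: "D",  # HAH
    0x0627: "R",  # ALEF
    0x062A: "D",  # TEH
}

# The same table with the pre-4.1 classification of U+06C2.
LEGACY_JOINING_TYPE = {**JOINING_TYPE, 0x06C2: "R"}

PHRASE = [0x062F, 0x0631, 0x062C, 0x06C2, 0x062D, 0x0631, 0x0627, 0x0631, 0x062A]


def positional_forms(codepoints, joining_type):
    """Return the positional form (isol/init/medi/fina) of each letter."""
    forms = []
    for i, cp in enumerate(codepoints):
        this = joining_type[cp]
        prev = joining_type[codepoints[i - 1]] if i > 0 else None
        nxt = joining_type[codepoints[i + 1]] if i + 1 < len(codepoints) else None
        # A letter joins to the preceding one only if that letter is dual-joining.
        joins_prev = prev == "D" and this in ("D", "R")
        # Only a dual-joining letter can join to the following one.
        joins_next = this == "D" and nxt in ("D", "R")
        if joins_prev and joins_next:
            forms.append("medi")
        elif joins_prev:
            forms.append("fina")
        elif joins_next:
            forms.append("init")
        else:
            forms.append("isol")
    return forms


current = positional_forms(PHRASE, JOINING_TYPE)
legacy = positional_forms(PHRASE, LEGACY_JOINING_TYPE)

for cp, new, old in zip(PHRASE, current, legacy):
    print(f"U+{cp:04X}  4.1+: {new}  legacy: {old}")
```

The two runs differ only at U+06C2 and the following U+062D: with the current classification they come out as medial and medial, while with the old one they come out as final and initial, which is the break that uniscribe produces.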


