[HarfBuzz] New Indic standard?

Harshula harshula at gmail.com
Fri Aug 21 21:54:00 PDT 2009


Hi,

On Tue, 2009-08-18 at 15:57 +0530, Shriramana Sharma wrote:
> Hello. This is w.r.t the mail at 
> http://lists.freedesktop.org/archives/harfbuzz/2009-August/000355.html :
> 
> Behdad Esfahbod wrote:
> 
> <quote>Well, parts of it, specially the Indic parts, are not documented. 
>   Worse, with Windows Vista, the Indic parts of the standard changed 
> completely.  We don't have any free software implementation of the new 
> Indic OpenType standard.</quote>
> 
> Where can I get a copy of this standard (new *and* old) and what is the 
> status of Indic scripts' support in Harfbuzz with this new standard? 
> What is the status w.r.t the old standard?

You will get a good idea of the major shift between the "old" and "new"
'standard' that affects all Indic scripts by reading PR-37:
http://www.unicode.org/review/pr-37.pdf

Once you understand the change, work out whether your script requires
ZWJ processing during the GSUB lookup stage. If so, then:

For Pango, check whether your script has SF_PROCESS_ZWJ flag set:

pango/modules/indic/indic-ot-class-tables.c
-------------------------------------------------
#define DEVA_SCRIPT_FLAGS (SF_EYELASH_RA | SF_NO_POST_BASE_LIMIT)
#define BENG_SCRIPT_FLAGS (SF_REPH_AFTER_BELOW | SF_NO_POST_BASE_LIMIT)
#define GURU_SCRIPT_FLAGS (SF_NO_POST_BASE_LIMIT)
#define GUJR_SCRIPT_FLAGS (SF_NO_POST_BASE_LIMIT)
#define ORYA_SCRIPT_FLAGS (SF_REPH_AFTER_BELOW | SF_NO_POST_BASE_LIMIT)
#define TAML_SCRIPT_FLAGS (SF_MPRE_FIXUP | SF_NO_POST_BASE_LIMIT)
#define TELU_SCRIPT_FLAGS (SF_MATRAS_AFTER_BASE | 3)
#define KNDA_SCRIPT_FLAGS (SF_MATRAS_AFTER_BASE | 3)
#define MLYM_SCRIPT_FLAGS (SF_MPRE_FIXUP | SF_NO_POST_BASE_LIMIT | SF_PROCESS_ZWJ)
#define SINH_SCRIPT_FLAGS (SF_NO_POST_BASE_LIMIT | SF_PROCESS_ZWJ)
-------------------------------------------------

And for ICU, check whether your script does *not* have the
SF_FILTER_ZERO_WIDTH flag set:

icu/source/layout/IndicClassTables.cpp
-------------------------------------------------
#define DEVA_SCRIPT_FLAGS (SF_EYELASH_RA | SF_NO_POST_BASE_LIMIT | SF_FILTER_ZERO_WIDTH)
#define BENG_SCRIPT_FLAGS (SF_REPH_AFTER_BELOW | SF_NO_POST_BASE_LIMIT | SF_FILTER_ZERO_WIDTH)
#define PUNJ_SCRIPT_FLAGS (SF_NO_POST_BASE_LIMIT | SF_FILTER_ZERO_WIDTH)
#define GUJR_SCRIPT_FLAGS (SF_NO_POST_BASE_LIMIT | SF_FILTER_ZERO_WIDTH)
#define ORYA_SCRIPT_FLAGS (SF_REPH_AFTER_BELOW | SF_NO_POST_BASE_LIMIT | SF_FILTER_ZERO_WIDTH)
#define TAML_SCRIPT_FLAGS (SF_MPRE_FIXUP | SF_NO_POST_BASE_LIMIT | SF_FILTER_ZERO_WIDTH)
#define TELU_SCRIPT_FLAGS (SF_MATRAS_AFTER_BASE | SF_FILTER_ZERO_WIDTH | 3)
#define KNDA_SCRIPT_FLAGS (SF_MATRAS_AFTER_BASE | SF_FILTER_ZERO_WIDTH | 3)
#define MLYM_SCRIPT_FLAGS (SF_MPRE_FIXUP | SF_NO_POST_BASE_LIMIT /*| SF_FILTER_ZERO_WIDTH*/)
#define SINH_SCRIPT_FLAGS (SF_NO_POST_BASE_LIMIT)
-------------------------------------------------

Note that SF_PROCESS_ZWJ (Pango) == !SF_FILTER_ZERO_WIDTH (ICU).
Furthermore, the fonts also have to be updated to contain the relevant
GSUB lookups. So this is not a simple overnight change.
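
To make the opposite polarity of the two flags concrete, here is a rough
sketch in C. It is not code from Pango or ICU: the SKETCH_* bit values are
placeholders I made up, and the three function names are hypothetical. It
only shows how a "process ZWJ" bit and a "filter zero width" bit lead to the
same decision, and what filtering ZWJ (U+200D) out of the text before the
GSUB stage would look like.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit values for illustration only. The real constants live in
 * the Pango and ICU sources and may well differ. */
#define SKETCH_SF_PROCESS_ZWJ        0x08000000u /* Pango style: set = keep ZWJ */
#define SKETCH_SF_FILTER_ZERO_WIDTH  0x08000000u /* ICU style:   set = drop ZWJ */

#define ZWJ 0x200Du /* ZERO WIDTH JOINER */

/* Pango-style polarity: ZWJ reaches GSUB only when the flag is set. */
bool pango_style_keeps_zwj(uint32_t script_flags)
{
    return (script_flags & SKETCH_SF_PROCESS_ZWJ) != 0;
}

/* ICU-style polarity: ZWJ reaches GSUB only when the flag is NOT set. */
bool icu_style_keeps_zwj(uint32_t script_flags)
{
    return (script_flags & SKETCH_SF_FILTER_ZERO_WIDTH) == 0;
}

/* Copy in[0..len) into out, dropping ZWJ unless keep_zwj is true.
 * out must have room for len code points. Returns the number written. */
size_t prepare_for_gsub(const uint32_t *in, size_t len,
                        uint32_t *out, bool keep_zwj)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (!keep_zwj && in[i] == ZWJ)
            continue; /* the font never sees the joiner */
        out[n++] = in[i];
    }
    return n;
}
```

For example, the Sinhala sequence U+0D9A U+0DCA U+200D U+0DBB keeps all four
code points when keep_zwj is true, and loses the U+200D when it is false, in
which case a font with a ZWJ-based GSUB lookup can no longer match it.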

I was finally able to convince both Pango and ICU maintainers back in
2004/2005 not to filter ZWJ from the GSUB lookup stage for Sinhala
(SINH). But I was unable to convince Indian script users to look at
their script back then. I think it was in 2006 that Malayalam users got
Pango and ICU to stop filtering ZWJ. There was some confusion because
their fonts were not ready, but I suspect that is all sorted out now.

In 2007, on the indic at unicode.org list (see "Which Indic scripts use ZWJ
in the GSUB table?"), I was able to find out that the MS Kannada font,
along with Sinhala & Malayalam, also used ZWJ in GSUB lookups. Private
email to the relevant MS folks was the only way to get confirmation. The
other way, if you have access to MS Vista, is to check whether the font
it ships for your script contains GSUB lookups that involve ZWJ.

Good Luck! And feel free to contact me if you have any questions.

I haven't looked at Harfbuzz yet, but time permitting, I can help the
person working on Indic support.

cya,
#



