[HarfBuzz] New Indic standard?

Ed Trager ed.trager at gmail.com
Sat Aug 22 09:02:45 PDT 2009


Hi, Harshula, Shriramana, and everyone,

Nagarjuna Venna put out a proposal in 2006 regarding handling of
deprecated / historical REPH in Telugu and Gurmukhi.  Was his solution
#2 accepted by the Unicode Consortium?

- Ed

 http://www.mit.edu/~nagarjun/Unicode/reph.pdf

On Sat, Aug 22, 2009 at 12:54 AM, Harshula <harshula at gmail.com> wrote:
> Hi,
>
> On Tue, 2009-08-18 at 15:57 +0530, Shriramana Sharma wrote:
>> Hello. This is w.r.t the mail at
>> http://lists.freedesktop.org/archives/harfbuzz/2009-August/000355.html :
>>
>> Behdad Esfahbod wrote:
>>
>> <quote>Well, parts of it, specially the Indic parts, are not documented.
>>   Worse, with Windows Vista, the Indic parts of the standard changed
>> completely.  We don't have any free software implementation of the new
>> Indic OpenType standard.</quote>
>>
>> Where can I get a copy of this standard (new *and* old) and what is the
>> status of Indic scripts' support in Harfbuzz with this new standard?
>> What is the status w.r.t the old standard?
>
> You will get a good idea of the major shift between the "old" and "new"
> 'standard' that affects all Indic scripts by reading PR-37:
> http://www.unicode.org/review/pr-37.pdf
>
> Once you understand the change, work out whether your script requires
> ZWJ processing during the GSUB lookup stage. If so, then:
>
> For Pango, check whether your script has the SF_PROCESS_ZWJ flag set:
>
> pango/modules/indic/indic-ot-class-tables.c
> -------------------------------------------------
> #define DEVA_SCRIPT_FLAGS (SF_EYELASH_RA | SF_NO_POST_BASE_LIMIT)
> #define BENG_SCRIPT_FLAGS (SF_REPH_AFTER_BELOW | SF_NO_POST_BASE_LIMIT)
> #define GURU_SCRIPT_FLAGS (SF_NO_POST_BASE_LIMIT)
> #define GUJR_SCRIPT_FLAGS (SF_NO_POST_BASE_LIMIT)
> #define ORYA_SCRIPT_FLAGS (SF_REPH_AFTER_BELOW | SF_NO_POST_BASE_LIMIT)
> #define TAML_SCRIPT_FLAGS (SF_MPRE_FIXUP | SF_NO_POST_BASE_LIMIT)
> #define TELU_SCRIPT_FLAGS (SF_MATRAS_AFTER_BASE | 3)
> #define KNDA_SCRIPT_FLAGS (SF_MATRAS_AFTER_BASE | 3)
> #define MLYM_SCRIPT_FLAGS (SF_MPRE_FIXUP | SF_NO_POST_BASE_LIMIT | SF_PROCESS_ZWJ)
> #define SINH_SCRIPT_FLAGS (SF_NO_POST_BASE_LIMIT | SF_PROCESS_ZWJ)
> -------------------------------------------------
>
> And for ICU, check whether your script does *not* have the
> SF_FILTER_ZERO_WIDTH flag set:
>
> icu/source/layout/IndicClassTables.cpp
> -------------------------------------------------
> #define DEVA_SCRIPT_FLAGS (SF_EYELASH_RA | SF_NO_POST_BASE_LIMIT | SF_FILTER_ZERO_WIDTH)
> #define BENG_SCRIPT_FLAGS (SF_REPH_AFTER_BELOW | SF_NO_POST_BASE_LIMIT | SF_FILTER_ZERO_WIDTH)
> #define PUNJ_SCRIPT_FLAGS (SF_NO_POST_BASE_LIMIT | SF_FILTER_ZERO_WIDTH)
> #define GUJR_SCRIPT_FLAGS (SF_NO_POST_BASE_LIMIT | SF_FILTER_ZERO_WIDTH)
> #define ORYA_SCRIPT_FLAGS (SF_REPH_AFTER_BELOW | SF_NO_POST_BASE_LIMIT | SF_FILTER_ZERO_WIDTH)
> #define TAML_SCRIPT_FLAGS (SF_MPRE_FIXUP | SF_NO_POST_BASE_LIMIT | SF_FILTER_ZERO_WIDTH)
> #define TELU_SCRIPT_FLAGS (SF_MATRAS_AFTER_BASE | SF_FILTER_ZERO_WIDTH | 3)
> #define KNDA_SCRIPT_FLAGS (SF_MATRAS_AFTER_BASE | SF_FILTER_ZERO_WIDTH | 3)
> #define MLYM_SCRIPT_FLAGS (SF_MPRE_FIXUP | SF_NO_POST_BASE_LIMIT /*| SF_FILTER_ZERO_WIDTH*/)
> #define SINH_SCRIPT_FLAGS (SF_NO_POST_BASE_LIMIT)
> -------------------------------------------------
>
> Note that SF_PROCESS_ZWJ (Pango) == !SF_FILTER_ZERO_WIDTH (ICU).
> Furthermore, the fonts also have to be updated to contain the relevant
> GSUB lookups. So this is not a simple overnight change.
>
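
A minimal sketch of that relationship in C, in case it helps anyone reading
along. The DEMO_ names and all bit values below are made up for illustration
and are not the real Pango or ICU definitions; only the flag names and the
per-script combinations are taken from the two listings quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, for illustration only. */
#define DEMO_SF_EYELASH_RA         0x20000000u
#define DEMO_SF_NO_POST_BASE_LIMIT 0x00007FFFu
#define DEMO_SF_PROCESS_ZWJ        0x08000000u /* Pango style: opt in  */
#define DEMO_SF_FILTER_ZERO_WIDTH  0x08000000u /* ICU style:   opt out */

/* Devanagari and Sinhala combinations as they appear in the listings. */
#define DEMO_PANGO_DEVA (DEMO_SF_EYELASH_RA | DEMO_SF_NO_POST_BASE_LIMIT)
#define DEMO_PANGO_SINH (DEMO_SF_NO_POST_BASE_LIMIT | DEMO_SF_PROCESS_ZWJ)
#define DEMO_ICU_DEVA   (DEMO_SF_EYELASH_RA | DEMO_SF_NO_POST_BASE_LIMIT | \
                         DEMO_SF_FILTER_ZERO_WIDTH)
#define DEMO_ICU_SINH   (DEMO_SF_NO_POST_BASE_LIMIT)

/* Pango-style flags: ZWJ reaches GSUB only if the "process" bit is set. */
static inline bool pango_keeps_zwj(uint32_t flags)
{
    return (flags & DEMO_SF_PROCESS_ZWJ) != 0;
}

/* ICU-style flags: ZWJ reaches GSUB only if the "filter" bit is clear. */
static inline bool icu_keeps_zwj(uint32_t flags)
{
    return (flags & DEMO_SF_FILTER_ZERO_WIDTH) == 0;
}

/* Both conventions should agree for each script in the listings. */
static inline void zwj_flags_self_check(void)
{
    assert(pango_keeps_zwj(DEMO_PANGO_SINH) == icu_keeps_zwj(DEMO_ICU_SINH));
    assert(pango_keeps_zwj(DEMO_PANGO_DEVA) == icu_keeps_zwj(DEMO_ICU_DEVA));
}
```

With these demo values, Sinhala keeps ZWJ under both conventions and
Devanagari drops it under both, which is the inversion described above.
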
> I was finally able to convince both the Pango and ICU maintainers back in
> 2004/2005 not to filter ZWJ out of the GSUB lookup stage for Sinhala
> (SINH). But I was unable to convince Indian script users to look at
> their scripts back then. I think that in 2006 Malayalam users got Pango
> and ICU to stop filtering ZWJ. There was some confusion because their
> fonts were not ready, but I suspect that is all sorted out now.
>
> In 2007, on the indic at unicode.org list (see "Which Indic scripts use ZWJ
> in the GSUB table?"), I was able to find out that the MS Kannada font,
> along with Sinhala & Malayalam, also used ZWJ in GSUB lookups. Private
> email to the relevant MS folks was the only way I could get confirmation.
> Another option is to check whether your script's font in MS Vista contains
> GSUB lookups involving ZWJ, if you have access to the OS.
>
> Good Luck! And feel free to contact me if you have any questions.
>
> I haven't looked at Harfbuzz yet, but time permitting, I can help the
> person working on Indic support.
>
> cya,
> #
>
> _______________________________________________
> HarfBuzz mailing list
> HarfBuzz at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/harfbuzz
>


