[HarfBuzz] New Indic standard?
gora at sarai.net
Wed Aug 19 13:33:30 PDT 2009
On Wed, 19 Aug 2009 12:38:38 -0400
Ed Trager <ed.trager at gmail.com> wrote:
> > What does such a test suite involve? In the past, we have
> > prepared a list of base characters, plus allowed conjuncts
> > (along with example words) for some Indic languages in ICU.
> > Along with these, we have prepared screenshots of the expected
> > rendering, which can be compared to Harfbuzz rendering. Does
> > that suffice?
> Is the pre-existing list of base characters plus allowed conjuncts
> that you prepared for ICU testing comprehensive?
For the languages that we covered at the time, namely Hindi
and Oriya, yes, the coverage was 100%, at least to the
extent of our linguistic capabilities. It should be possible
for us to make a concerted effort to do the same for other
> Does it cover ALL the "major" languages commonly written using
> one of the Indic scripts, or just some of them?
For a given script, there should only be small differences in
the conjunct list between languages. For a first cut, we can
go with the major language covered by the script, e.g., Hindi
for the Devanagari script.
> Are there annotations indicating, for example, conjuncts that are
> specifically allowed for some languages (say, perhaps classical
> Sanskrit) but not allowed or deprecated or considered
> old-fashioned etc. for some other languages (say, Modern Hindi)?
Not at the moment. This is a great idea, but I do think that it is
a second-level issue, and we should start with the basics.
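
As a rough sketch only, one entry in such a conjunct list, with room for
per-language annotations, might look like the following. The class and
field names (ConjunctCase, languages, cases_for) are made up for
illustration and are not what we used for ICU. The language tags are
placeholders rather than a vetted linguistic judgement.

```python
from dataclasses import dataclass

# DEVANAGARI SIGN VIRAMA (halant): placed between consonants to request a conjunct
VIRAMA = "\u094D"


@dataclass(frozen=True)
class ConjunctCase:
    """One entry in a conjunct list (hypothetical layout, for illustration)."""
    consonants: tuple      # consonants in logical (typing) order
    example_word: str      # a word containing the conjunct
    languages: frozenset   # placeholder language tags for the annotation idea

    @property
    def text(self) -> str:
        # The Unicode sequence that a shaping engine would be asked to render.
        return VIRAMA.join(self.consonants)


CASES = [
    # KA + VIRAMA + SSA, example word "kshama"
    ConjunctCase(("\u0915", "\u0937"),
                 "\u0915\u094D\u0937\u092E\u093E",
                 frozenset({"hi", "sa"})),
    # JA + VIRAMA + NYA, example word "gyan"
    ConjunctCase(("\u091C", "\u091E"),
                 "\u091C\u094D\u091E\u093E\u0928",
                 frozenset({"hi", "sa"})),
    # TA + VIRAMA + RA, example word "patra"
    ConjunctCase(("\u0924", "\u0930"),
                 "\u092A\u0924\u094D\u0930",
                 frozenset({"hi", "sa"})),
]


def cases_for(language: str) -> list:
    """Return the entries annotated with the given language tag."""
    return [case for case in CASES if language in case.languages]


if __name__ == "__main__":
    for case in cases_for("hi"):
        print(case.text, case.example_word)
```

Each entry keeps the input text separate from the annotations, so the
basic list can be prepared first and the language tags filled in later.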
> Ideally we need to know more details about what you or anyone
> else has available before the question of "sufficiency" can be
> settled ...
> Where is the URL for the ICU test suite that you mention? I would
> like to look at that, as I am sure others would too.
I recall it having been included in the ICU source code. I will
have to check out the code and find it. I will need to do that
tomorrow, as it is now quite late here.
> Having a
> test suite that is publicly available would be a great first
> step. Setting up such a resource so that people could
> contribute / edit / add additional test cases would be a great
> next step.
OK, we can do that at least as far as listing the conjuncts goes.
In fact, the rudiments are probably somewhere on the IndLinux Wiki.
Will find a pointer to that tomorrow, and we would be glad to set
up a resource such as the one you outline above. We would need to
discuss things further if you want to somehow set up automated test
Thanks for your prompt and detailed reply.