[HarfBuzz] Fwd: [Harfbuzz-indic] Hackfest report
Arjuna Rao Chavala
arjunaraoc at gmail.com
Thu May 31 22:30:31 PDT 2012
Hi Behdad and Jonathan
On Mon, May 28, 2012 at 7:04 AM, Behdad Esfahbod <behdad at behdad.org> wrote:
> Hello HarfBuzz lists,
> I promised to write a short report about the hackfest earlier this month.
> Here it is.
> Jonathan Kew (Mozilla) and I met at the Google Zurich offices on May 9..11 for
> the HarfBuzz Massala Hackfest. We got together for three days of 12+ hours of
> intense hacking on the new HarfBuzz Indic shaper, using a Wikipedia word list
> as a test suite.
Thanks for the update on the Indic shaper. I tried out the code on Ubuntu 12.04
and was able to experiment with the utility programs using Telugu strings. I am
delighted to see that the framework enables independent testing without native
knowledge of the language.
> We started with the Devanagari script, testing against Uniscribe (Windows
> implementation). Initially we were failing on 35% of the words in the
> Three days, 86 commits, and dozens of cups of coffee later, we got down to
> Out of the ~700,000 words, we disagree only on 560. Of those 560, many are
> invalid or meaningless Devanagari sequences, not character combinations that
> ever occur in correctly-spelled words. In these cases we are less concerned
> to precisely match Uniscribe's behavior.
> We discovered a number of bugs or peculiarities in Uniscribe. We can do
> better in some of those cases (and we do). But for testing purposes, we added
> a "uniscribe-bug-compatibility" mode to the Indic shaper. The numbers
> were in that mode.
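As a rough illustration of the kind of word-by-word comparison described above
(this is only a sketch, not the tooling used at the hackfest), the Python below
assumes the glyph sequence for each word has already been dumped from both
shapers into two dictionaries. The function name compare_shapers, the
dictionary layout, and the glyph names in the example are all hypothetical.

```python
from typing import Dict, List, Tuple

# A shaped word is represented here as a tuple of glyph names (an assumption;
# real shaper output may also carry positions and cluster indices).
GlyphSeq = Tuple[str, ...]


def compare_shapers(
    reference: Dict[str, GlyphSeq],
    candidate: Dict[str, GlyphSeq],
) -> Tuple[List[str], float]:
    """Return the words whose glyph sequences differ and the disagreement rate.

    A word missing from `candidate` counts as a disagreement.
    """
    mismatches = [
        word
        for word, glyphs in reference.items()
        if candidate.get(word) != glyphs
    ]
    rate = len(mismatches) / len(reference) if reference else 0.0
    return mismatches, rate


# Toy data with made-up glyph names, only to show the call shape.
reference_output = {"word1": ("gA", "gB"), "word2": ("gC",)}
candidate_output = {"word1": ("gA", "gB"), "word2": ("gD",)}
mismatched_words, disagreement_rate = compare_shapers(
    reference_output, candidate_output
)
```

With the figures quoted above, 560 disagreements out of roughly 700,000 words
works out to a rate of about 0.08%.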
Could you share more details on your testing, such as how Uniscribe was set up
to produce the glyph sequences and how the rendering results were compared,
along with a sample extract of the word list for one
Looking forward to seeing Telugu and other languages supported and
integrated into Pango soon.