[HarfBuzz] Indic Testing Team

Mon Sep 7 20:07:01 PDT 2009

Hi, everyone,

Pravin's draft Devanagari test case PDF prompted me to return to
working on my "indie" Indic test suite framework which I brought to
the attention of this list a couple of weeks or so ago.  Here's some
"eye candy" of a sample report for Devanagari in XHTML format:

    http://eyegene.ophthy.med.umich.edu/indic/

A few things to note:

(1) As mentioned previously on this list, the indie framework can
produce reports in XML, JSON, TEXT, and XHTML formats.  As a glance at
the above URL hopefully shows, the XHTML output is useful for human
viewing: the framework now automatically creates an image file for
each test case.  Presumably the XML or JSON format will be useful for
automated testing by Behdad and others once all the test cases have
been compiled.

(2) The PNG image files for each test case are generated using Pango+Cairo.

(3) I have now also added code to get the glyph indices from Pango.
(I also have access to the glyph geometry data now, but am not yet
printing out this information.)

(4) I also added a "description" field so that each test case can be
annotated with a description.

I am currently using the "Chandas" font for generating the PNG images.
So far the report contains some cases I decided to add myself plus
about half of Pravin's draft test cases, but this will change in
another day or two.

*** SOME QUESTIONS ESPECIALLY FOR BEHDAD: ***

(1) I'm currently printing out the GlyphIds in HEX.  Let me know if
there is a preference for decimal or hex numerals for such IDs.

(2) I'm *not yet* printing out the geometry data:  width, xoffset, and
yoffset.  I assume you would want to see these numbers in normal base
10 numerals, yes?  One question though -- Should I divide by
PANGO_SCALE or use the PANGO_PIXELS() macro?  Or just display the
numbers directly?
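
A minimal sketch of the two conversions asked about in (2), for
comparison.  It assumes Pango's documented convention of 1024 Pango
units per device unit (PANGO_SCALE) and mirrors the rounding done by
PANGO_PIXELS().  The constant and function names below are
hypothetical stand-ins so that the sketch compiles without the Pango
headers; it is not code from the test framework.

```cpp
// Assumption: same value as Pango's PANGO_SCALE (Pango units per device unit).
constexpr int kPangoScale = 1024;

// Round to the nearest whole pixel, mirroring PANGO_PIXELS(d),
// which Pango defines as (((int)(d) + 512) >> 10).
inline int to_pixels_rounded(int pango_units) {
    return (pango_units + kPangoScale / 2) >> 10;
}

// Keep the fractional part instead of rounding (plain division by the scale).
inline double to_pixels_exact(int pango_units) {
    return static_cast<double>(pango_units) / kPangoScale;
}
```

Printing the raw Pango units loses nothing; dividing by the scale
keeps sub-pixel detail as a fraction; the rounded form discards it.
Which of these is best for automated comparison is the open question.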

(3) Behdad, I believe you indicated something about not being sure how
to get Pango to use Uniscribe on Windows?  (I should track down the
email, but I'm lazy this evening :-).  Who should I talk to in order
to find out precisely how to get Pango to use Uniscribe on Windows?
(My knowledge of compiling things on Windows is somewhat lacking,
unfortunately, so I may need some help here ... ).

*** SOME QUESTIONS ESPECIALLY FOR THE INDIC EXPERTS ON THIS LIST: ***

(4) With this test framework, I can easily generate all 36x36 = 1,296
half-form consonant combinations for Devanagari.  Should I do this?
Should the same large set of test cases be likewise generated for
certain of the other major Indic scripts, like Bengali?  If so, which
other scripts?
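
The exhaustive generation described in (4) could be sketched roughly
as follows.  This is only an illustration: the function name is
hypothetical, the consonant list is passed in by the caller (the
choice of which 36 consonants to include is not made here), and each
case is assumed to be the sequence consonant + virama + consonant.

```cpp
#include <string>
#include <vector>

// U+094D DEVANAGARI SIGN VIRAMA
constexpr char32_t kVirama = 0x094D;

// Build one test string per ordered pair of consonants:
// first + virama + second.  N consonants yield N*N cases.
std::vector<std::u32string>
make_conjunct_cases(const std::vector<char32_t>& consonants) {
    std::vector<std::u32string> cases;
    cases.reserve(consonants.size() * consonants.size());
    for (char32_t first : consonants) {
        for (char32_t second : consonants) {
            std::u32string s;
            s += first;
            s += kVirama;
            s += second;
            cases.push_back(s);
        }
    }
    return cases;
}
```

With a list of 36 consonants this yields the 1,296 strings mentioned
above, in row-major order (all pairs for the first consonant, then
all pairs for the second, and so on).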

(5) Which scripts --after Devanagari-- remain "closest" to Devanagari
and thus would require a report that could be modeled on the report
for Devanagari?  Recall that the test framework is set up so that if I
want to test the same set of cases for Bengali as I have for
Devanagari, I need only add an offset to "jump" from the Devanagari
Unicode block to the Bengali Unicode block.  The code in the test
framework for this kind of thing looks like this:

         u32 += ( ka     + pData->offset );
         u32 += ( virama + pData->offset );
         u32 += ( ssa    + pData->offset );

But the question which is difficult for me to answer, because I am not
an expert in these scripts, is: Does it make sense to run essentially
the "same" test cases for Bengali as for Devanagari?  Or not?
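
For readers unfamiliar with the offset idea in (5), here is a
self-contained sketch.  It relies on the Devanagari block starting at
U+0900 and the Bengali block at U+0980, so the offset is 0x80.  The
function name and the choice to leave code points outside the
Devanagari block untouched are assumptions made for illustration, not
the framework's actual behaviour.

```cpp
#include <string>

// Block starts from the Unicode code charts.
constexpr char32_t kDevanagariBase = 0x0900;
constexpr char32_t kBengaliBase    = 0x0980;

// Shift every code point in the Devanagari block (U+0900..U+097F)
// by the distance between the two block starts; copy anything else as-is.
std::u32string shift_block(const std::u32string& devanagari_text,
                           char32_t target_base) {
    const char32_t offset = target_base - kDevanagariBase;
    std::u32string out;
    out.reserve(devanagari_text.size());
    for (char32_t c : devanagari_text) {
        const bool in_block = (c >= 0x0900 && c <= 0x097F);
        out += in_block ? static_cast<char32_t>(c + offset) : c;
    }
    return out;
}
```

For example, ka + virama + ssa (U+0915 U+094D U+0937) maps to the
Bengali sequence U+0995 U+09CD U+09B7.  Note that not every position
is assigned in every block, so a shifted test case may land on an
unassigned code point and would need to be filtered or reviewed.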

(6) Which scripts are "furthest" from Devanagari and thus will most
likely require customized reports which may differ significantly from
the Devanagari "template" set of test cases?

*** ONE OTHER QUESTION: ***

(7) Is there anyone on this list who understands CMake well?  I'm
getting strange errors trying to use CMake's generated "make" file for
the Pango and Cairo dependencies.  My manually-written makefile works
fine on Linux, but I think I will need CMake for the Windows platform.

Best - Ed

2009/9/4 प्रविण सातपुते <pravin.d.s at gmail.com>:
> Hi
>
> I have made a first draft for devanagari test cases, though not formatted
> very well
> see attachment /
> http://pravins.fedorapeople.org/first_draft_for_testing_harfbuzz.pdf
>
> please let me know your comment on this
>
> I used the Mangal font on Windows Vista.
>
> The only thing still missing is the glyph IDs.
>
> I used http://www.microsoft.com/typography/otfntdev/devanot/features.aspx
> to create these test cases.
>
> At the end of that page, MS$ gives a really good example of how Uniscribe
> reorders characters.
>
> That test case will also be useful, and I will add it in the next draft.
>
>
> Thanks,
> Pravin S
>
> 2009/8/25 Ed Trager <ed.trager at gmail.com>
>>
>> Hi, everyone,
>>
>> 2009/8/25 प्रविण सातपुते <pravin.d.s at gmail.com>:
>> > 2009/8/25 A S Alam <apreet.alam at gmail.com>
>> >>
>> >> On Saturday 22 August 2009 07:55 AM, Harshula wrote:
>> >>  Kannada :
>> >> >> Gurumukhi (Punjabi) :
>> >> >
>> >> >
>> >> Would like to work for Gurmukhi (Punjabi).
>> >
>> > cool, thats look nice
>> >
>> > as per written by Behdad earlier test data should be something like,
>> >
>> >>>INPUT: U+1234,U+5678 # some comment
>> > we will get unicode characters easily by python
>> >
>> > http://pravin-s.blogspot.com/2008/09/python-tricks.html
>> >
>> >>>FONT: Some Font Name 24
>> > I think it would be good to use the Lohit fonts, or at least a single
>> > font, for testing.
>> > I suggest Lohit since it is available for all Indic languages; also, I
>> > will be happy to quickly fix any problem on the font side.
>> >
>>
>> Because of the changes that Microsoft has implemented for Indic
>> rendering in OpenType, it is recommended to first test using MS fonts
>> from Vista/Windows 7.  We can presume that the Vista/Win7 fonts have
>> been designed to render properly on the latest version of Uniscribe.
>> Using these fonts, Behdad and others working on HarfBuzz can more
>> quickly achieve equivalent rendering results without having to worry
>> about bugs in the fonts themselves.  Of course there *may* be bugs
>> still present in the Vista-fonts--Uniscribe rendering pipeline, either
>> on the font or Uniscribe side of things, but those can be documented
>> from visual inspection of rendered output.
>>
>> >>OUTPUT: <1,0,0>,<5,10,30>
>> >
>> >>>Where the output tuples are glyph id, X and Y.
>> >>>Something like that can be processed into whatever format we end up
>> >>> adopting
>> >>>later.
>> >
>> > i dont know how to get this presently,
>>
>> PangoView.  Behdad has already talked about modifying PangoView to
>> produce this kind of output.  And PangoView is already cross-platform,
>> so we can use it to get both Windows and Mac rendering results, as
>> well as Linux of course.
>>
>> > i think pdf will be useful for test file format
>> >
>> > Second, since we want to make HarfBuzz compatible with the Windows Vista
>> > local renderer,
>> > should we generate the PDF from Windows Vista only? I don't know whether
>> > to use Lohit or only the Vista local fonts.
>>
>> Output should probably be individual PNG files -- one image file for
>> each test case.  PNG files can be easily embedded in a web-based
>> resource, right along with the textual input and output data.
>>
>> >
>> > behdad can you clarify my doubt little bit
>> >
>> > Thanks & Regards,
>> > ----------------------
>> > Pravin Satpute
>> >
>> > _______________________________________________
>> > HarfBuzz mailing list
>> > HarfBuzz at lists.freedesktop.org
>> > http://lists.freedesktop.org/mailman/listinfo/harfbuzz
>> >
>> >
>
>
>
> --
> Thanks & Regards,
> ----------------------
> Pravin Satpute
>
>