[Harfbuzz-indic] unicode -> glyph id resolver (Re: Malayalam rendering with latest Harfbuzz)
Pravin Satpute
psatpute at redhat.com
Wed Aug 3 04:31:50 PDT 2011
On Wednesday 03 August 2011 03:32 PM, mpsuzuki at hiroshima-u.ac.jp wrote:
> On Wed, 03 Aug 2011 17:22:48 +0900
> suzuki toshiya <mpsuzuki at hiroshima-u.ac.jp> wrote:
>
>> Hi,
>>
>> Bernard Massot wrote:
>>> On Wed, Aug 03, 2011 at 12:47:12PM +0530, Pravin Satpute wrote:
>>>> I think it would be good to add these test cases in test-complex-shape.c
>>>> like
>>>> { 0x0915, 0x094d, 0 }, -> Unicode
>>>> { 0x0080, 0x0051, 0 } -> Expected glyph ids from fonts
>>>>
>>>> Behdad, is there any quick trick to get glyph IDs from fonts?
>>> Here is my non-quick trick : dump font in XML format with the "ttx"
>>> program and look for your glyphs in "id" attributes of <GlyphID> tags in
>>> the generated .ttx file. Then you have to convert them to hexadecimal.
>>>
>>> I'm interested in a more productive way to achieve that.
>> Excuse me, what is required is a tool converting a Unicode text to
>> a series of glyph IDs for a given font, something like:
>>
>> $ ./get-gids-of-font-by-unicode-str.exe test_font.ttf < sample.utf8
>> gid128
>> gid81
>> ...
>>
>> # There might be some discussion if the input like "U+xxxx" is better
>> # or raw Unicode text is better.
>>
>> If the glyph IDs with no consideration of OpenType layout are sufficient,
>> it is not so difficult to make such tool with FreeType2. I will try.
> Like this... there might be some bugs in the UTF-8 parser.
>
> /*
> *
> * cc -o make-gids-from-font-and-utf8.exe make-gids-from-font-and-utf8.c \
> * `freetype-config --cflags` `freetype-config --libs`
> *
> * echo "Hello World" \
> * | make-gids-from-font-and-utf8.exe LiberationMono-Regular.ttf
> *
> * U+0048 -> gid43
> * U+0065 -> gid72
> * U+006C -> gid79
> * U+006C -> gid79
> * U+006F -> gid82
> * U+0020 -> gid3
> * U+0057 -> gid58
> * U+006F -> gid82
> * U+0072 -> gid85
> * U+006C -> gid79
> * U+0064 -> gid71
> * U+000A -> gid0
> *
> * written by mpsuzuki at hiroshima-u.ac.jp
> *
> */
I think this needs a little more work.
In Indic scripts such as Devanagari, the glyph IDs produced after shaping
are different from a simple Unicode-character-to-glyph-ID mapping.
If this program could use Pango and print the glyph IDs that Pango
returns after applying the font's OpenType tables, we could use its output
directly for adding test cases to test-shape-complex.c
Example:
1) { 0x0930, 0x094d, 0x0915, 0 }
2) { 0x0080, 0x005b, 0 }
The first row is the Unicode input and the second row is the glyph IDs the
font is supposed to output after the shaper has been applied. This kind of
program would definitely help in improving the HarfBuzz shaper test cases.
Please note that row 2 of the example changes according to the font's glyph map.
Regards,
Pravin S