[HarfBuzz] Normalization-aware font fallback (was Re: HarfBuzz 1.0 API; the message you were hoping would never come)
Behdad Esfahbod
behdad at behdad.org
Wed Aug 6 14:14:47 PDT 2014
On 14-01-03 05:07 AM, Jonathan Kew wrote:
>>
>> This makes me realize that I don't understand the big picture of how
>> this fallback process interacts with harfbuzz. In order to do fallback,
>> you need to do character to glyph mapping.
>
> Not necessarily. You need to know the character repertoire supported by the
> font, but you may not need to actually map to glyphs. In Firefox, for
> instance, font fallback is done based on a per-font *bit* map of supported
> Unicode codepoints. So at the font fallback stage, we know whether the
> character is present, but do not map it to a glyph.
When we had this discussion back in January I started putting a hack together,
but I only just got it working. I've pushed it to the hb-fc branch of my GitHub
repo:
https://github.com/behdad/harfbuzz/commits/hb-fc
It introduces a (not yet public) hb-fc.h header:
https://github.com/behdad/harfbuzz/blob/hb-fc/util/hb-fc.h
And a cmdline tool called hb-fc-list:
https://github.com/behdad/harfbuzz/blob/hb-fc/util/hb-fc-list.c
hb-fc-list lists (à la fc-list) all fonts that can render a given string
using hb_shape(). I.e., it takes HarfBuzz's normalization process into
account.
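To make "takes normalization into account" concrete, here is a rough Python
sketch of the idea. It is not what hb-fc.cc does (that goes through hb_shape());
it only models the coverage question with the standard unicodedata module. The
names covers() and can_render(), and the idea of passing the font's coverage as
a plain set of codepoints, are assumptions made up for this illustration.

```python
import unicodedata


def naive_can_render(text, supported):
    """Plain charset test: every codepoint must be mapped as-is."""
    return all(ord(ch) in supported for ch in text)


def covers(ch, supported):
    """True if `ch` is mapped directly or via its canonical decomposition."""
    if ord(ch) in supported:
        return True
    decomposed = unicodedata.normalize("NFD", ch)
    # Only accept the decomposition if it actually differs and is fully mapped.
    return decomposed != ch and all(ord(c) in supported for c in decomposed)


def can_render(text, supported):
    """Normalization-aware test (simplified model, not HarfBuzz's algorithm)."""
    # Compose first so a font with only precomposed glyphs still matches,
    # then fall back to decomposition one character at a time.
    composed = unicodedata.normalize("NFC", text)
    return all(covers(ch, supported) for ch in composed)


# Hypothetical coverage sets for three imaginary fonts.
only_precomposed = {0x00E9}            # has U+00E9 only
only_pieces = {0x0065, 0x0301}         # has 'e' and the combining acute
base_only = {0x0065}                   # has 'e' but no accent at all
```

With these, a font that has only the pieces can still render U+00E9, and a
font that has only the precomposed glyph can still render 'e' followed by
U+0301, while a plain per-codepoint test rejects both.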
I haven't tested it for tricky cases. The source code itself is the best
documentation at this point:
https://github.com/behdad/harfbuzz/blob/hb-fc/util/hb-fc.cc
(Just filed this bug re variation-selectors support in fontconfig:
https://bugs.freedesktop.org/show_bug.cgi?id=82266 )
Here's a run:
behdad:util 0$ time fc-list | wc -l
562
real 0m0.022s
user 0m0.014s
sys 0m0.008s
behdad:util 0$ time ./hb-fc-list حرفباز | wc -l
59
real 0m0.043s
user 0m0.030s
sys 0m0.017s
Note that there's a ZWNJ in that string. If I just query fc-list for fonts
that cover all the characters in that string, it doesn't list fonts that don't
map ZWNJ, even though they are perfectly fine for shaping:
0$ time fc-list :charset=062D,0631,0641,200C,0628,0627,0632 | wc -l
39
real 0m0.021s
user 0m0.010s
sys 0m0.008s
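The gap between those two counts can be modelled with a small Python sketch.
This is an illustration only, under the assumption that a shaper can cope with
an unmapped default-ignorable character such as ZWNJ. The set below is a small
hand-picked subset of Unicode's Default_Ignorable_Code_Point property, and the
function names and the sample font are made up for the example.

```python
# Partial, hand-picked subset of Default_Ignorable_Code_Point (assumption:
# enough for this example, not the full Unicode property).
DEFAULT_IGNORABLE_SUBSET = frozenset(
    {0x00AD, 0x200C, 0x200D, 0x200E, 0x200F, 0xFEFF}
)

# The codepoints from the fc-list :charset= query above.
QUERY = [0x062D, 0x0631, 0x0641, 0x200C, 0x0628, 0x0627, 0x0632]


def strict_charset_match(codepoints, supported):
    """Models the :charset= query: every codepoint must be in the font."""
    return all(cp in supported for cp in codepoints)


def shaping_aware_match(codepoints, supported):
    """Models the shaping view: unmapped default ignorables are tolerated."""
    return all(
        cp in supported or cp in DEFAULT_IGNORABLE_SUBSET for cp in codepoints
    )


# Hypothetical font that maps the Arabic letters but not ZWNJ (U+200C).
font_without_zwnj = {0x062D, 0x0631, 0x0641, 0x0628, 0x0627, 0x0632}
# Hypothetical font that is missing a real letter (U+0641).
font_missing_letter = {0x062D, 0x0631, 0x200C, 0x0628, 0x0627, 0x0632}
```

Here strict_charset_match(QUERY, font_without_zwnj) is False while
shaping_aware_match(QUERY, font_without_zwnj) is True. A font missing an
actual letter fails both tests.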
Thoughts?
--
behdad
http://behdad.org/