[HarfBuzz] Cluster value in hb_glyph_info_t when codepoint is zero

Jonathan Kew jfkthame at gmail.com
Wed Apr 30 02:50:55 PDT 2014


On 30/4/14 08:30, Preet wrote:
> Hi all,
>
> I'm new to i18n and to text rendering so please bear with me if I'm
> asking something odd or obvious.
>
> After shaping a run of text with harfbuzz, you can get a codepoint
> (font glyph index) and cluster back for each glyph that remains.
>
> I want to use said cluster info to help me do line breaking later
> on... but when the font doesn't have the codepoint in question (or any
> other condition that causes harfbuzz to set hb_glyph_info_t.codepoint
> = 0 occurs), the cluster value is invalid as well. For example, the
> post shaping clusters for the following text when a Korean language
> font isn't present:
>
> input: "안녕하세요"
> clusters: 32767,32767,32767,32767,32767
>
> Is it possible to figure out which cluster the missing glyph belonged
> to (aside from the example case which is obvious)? My use case is
> implementing a custom 'missing glyph' bitmap that isn't tied to a single
> font for redundancy, so I need reverse lookup even when harfbuzz
> doesn't shape a glyph.

AFAICS, hb-shape returns the expected cluster values even if all the
characters are missing from the font (and hence mapped to .notdef):

$ echo U+C548,U+B155,U+D558,U+C138,U+C694 |
    ./test/shaping/hb-unicode-encode |
    BUILD/util/hb-shape ../hb-test/fonts/sil/CharisSIL-R.ttf

returns:

[.notdef=0+1400|.notdef=1+1400|.notdef=2+1400|.notdef=3+1400|.notdef=4+1400]

with the clusters 0..4 as expected.
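
To make the reverse lookup concrete, here is a rough sketch (not part of
HarfBuzz) that parses a line of hb-shape output like the one above and maps
each .notdef glyph back to its input character. The names ShapedGlyph and
parse_hb_shape_line are made up for illustration. It assumes each item has the
form name=cluster+advance, optionally with an @x,y offset, and that the
cluster values are character indices into the input, as they appear to be in
this output (0..4 for five characters).

```python
import re
from typing import List, NamedTuple


class ShapedGlyph(NamedTuple):
    """One item of hb-shape output (hypothetical helper type)."""
    name: str       # glyph name as printed, e.g. ".notdef"
    cluster: int    # cluster value reported for this glyph
    x_advance: int  # horizontal advance; 0 if none was printed


# Assumed item syntax: name=cluster[@xoff,yoff][+xadv[,yadv]]
# Only the simple "name=cluster+advance" form appears in the output above;
# the optional parts are accepted so that such items do not raise an error.
_ITEM = re.compile(
    r"^(?P<name>[^=|]+)=(?P<cluster>\d+)"
    r"(?:@-?\d+,-?\d+)?"
    r"(?:\+(?P<adv>-?\d+)(?:,-?\d+)?)?$"
)


def parse_hb_shape_line(line: str) -> List[ShapedGlyph]:
    """Parse one bracketed line such as "[.notdef=0+1400|.notdef=1+1400]"."""
    line = line.strip()
    if not (line.startswith("[") and line.endswith("]")):
        raise ValueError("expected a line of the form [item|item|...]")
    body = line[1:-1]
    if not body:
        return []
    glyphs = []
    for item in body.split("|"):
        m = _ITEM.match(item)
        if m is None:
            raise ValueError("unrecognised item: %r" % item)
        adv = m.group("adv")
        glyphs.append(
            ShapedGlyph(
                name=m.group("name"),
                cluster=int(m.group("cluster")),
                x_advance=int(adv) if adv is not None else 0,
            )
        )
    return glyphs


# The five characters from the command above: U+C548 U+B155 U+D558 U+C138 U+C694
text = "\uC548\uB155\uD558\uC138\uC694"
output = "[.notdef=0+1400|.notdef=1+1400|.notdef=2+1400|.notdef=3+1400|.notdef=4+1400]"

glyphs = parse_hb_shape_line(output)
clusters = [g.cluster for g in glyphs]

# Reverse lookup: which input character does each missing glyph stand for?
# This indexing is only valid while clusters are character indices.
missing = [text[g.cluster] for g in glyphs if g.name == ".notdef"]
```

If the clusters were byte or UTF-16 offsets instead, the text[g.cluster]
lookup would need to be adjusted to match.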

So there must be something different about the way you're using harfbuzz 
from how hb-shape uses it... identify that difference, and you'll 
probably have a solution.

JK


