[HarfBuzz] question regarding cluster indices

Phil Race philip.race at oracle.com
Tue Sep 29 11:32:16 PDT 2015


hb_buffer_add_XXX allows you to specify a subset of the text to shape.
The remainder is used as context, but is not shaped itself and is
not part of the output.

This is useful in various cases, for example when you are using different
fonts for different parts of the text.

I want to make sure I understand correctly how this impacts the
assigned output cluster for the portion of the text being shaped.

The code below shows the initial assignment of clusters, based on
the offset of each code point within the full text.
So on output, the clusters of the text that was shaped will start
at the item's offset within the overall text.
I.e. if the full text is "ABCDEF" and we shape "DEF", then the
output cluster indices will start at 3, and I can always just
count characters from the start of the shaped run if I want to know
what the cluster index would have been without such context.
Is this interpretation correct?

hb_buffer_add_utf (hb_buffer_t  *buffer,
                   const typename utf_t::codepoint_t *text,
                   int           text_length,
                   unsigned int  item_offset,
                   int           item_length)
{
  .....
  while (next < end)
  {
    hb_codepoint_t u;
    const T *old_next = next;
    next = utf_t::next (next, end, &u, replacement);
    /* The cluster is this code point's offset from the start of text. */
    buffer->add (u, old_next - (const T *) text);
  }
  ...
}
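
To make the "ABCDEF" / "DEF" case concrete, here is a minimal sketch of how
I read that loop. It is a stand-alone model and not HarfBuzz code: the names
Item, add_item and cluster_without_context are hypothetical, and it assumes
ASCII input so that one code unit is one code point. It only models the
initial cluster assignment, not anything the shaper does afterwards.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one buffer entry: a code point and its cluster.
struct Item {
    char32_t codepoint;
    unsigned int cluster;
};

// Model of the quoted loop, assuming ASCII text (one code unit per code
// point). Only [item_offset, item_offset + item_length) is added; the rest
// of the text would serve as context only.
std::vector<Item> add_item(const std::string &text,
                           unsigned int item_offset,
                           unsigned int item_length)
{
    assert(item_offset <= text.size());
    assert(item_length <= text.size() - item_offset);

    std::vector<Item> out;
    const char *start = text.data();
    const char *next = start + item_offset;
    const char *end = next + item_length;
    while (next < end) {
        const char *old_next = next;
        char32_t u = static_cast<unsigned char>(*next++);
        // The cluster is measured from the start of the full text,
        // not from the start of the item.
        out.push_back({u, static_cast<unsigned int>(old_next - start)});
    }
    return out;
}

// Cluster the same entry would have had if the item were added on its own,
// under the interpretation described in this message.
unsigned int cluster_without_context(unsigned int cluster,
                                     unsigned int item_offset)
{
    return cluster - item_offset;
}
```

Under this model, add_item("ABCDEF", 3, 3) yields clusters 3, 4, 5 for
D, E, F, and subtracting item_offset gives 0, 1, 2.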

