[HarfBuzz] Trouble with clusters and accented latin characters

Lóránt Pintér lorant.pinter at prezi.com
Sun Oct 14 11:29:48 PDT 2012


Thanks for the quick reply. It works perfectly.  

--  
Lóci


On Sunday, October 14, 2012 at 7:38 PM, Behdad Esfahbod wrote:

> On 12-10-14 12:31 PM, Lóránt Pintér wrote:
> > Hi,
> >  
> > I'm trying to shape the word "tér" with HarfBuzz, and this is what I get back:
> >  
> > hb_buffer_get_glyph_infos() after calling hb_buffer_add_utf8():
> >  
> > Char #0: { codepoint: 116, mask: 1, cluster: 0, var1: 0, var2: 0 }
> > Char #1: { codepoint: 233, mask: 1, cluster: 1, var1: 0, var2: 0 }
> > Char #2: { codepoint: 114, mask: 1, cluster: 3, var1: 0, var2: 0 }
> >  
> > …and after calling hb_shape():
> >  
> > Glyph #0: { codepoint: 86, mask: 1, cluster: 0, var1: 2, var2: 5 }
> > Glyph #1: { codepoint: 156, mask: 1, cluster: 1, var1: 2, var2: 5 }
> > Glyph #2: { codepoint: 84, mask: 1, cluster: 3, var1: 2, var2: 5 }
> >  
> > I believed up to now that each cluster corresponded to a character in the
> > original string. Why is the letter "é" turned into two clusters here?
> >  
>  
>  
> When you use add_utf8, cluster values are set to UTF-8 byte indices into the
> original string. The precomposed letter "é" takes two bytes in UTF-8, which is
> why "r" gets cluster 3 rather than 2. If you prefer plain character indices
> instead, just loop over the buffer and set the cluster values before calling
> shape. This is from hb/util/options.hh, for example:
>  
>   if (!utf8_clusters) {
>     /* Reset cluster values to refer to Unicode character index
>      * instead of UTF-8 index. */
>     unsigned int num_glyphs = hb_buffer_get_length (buffer);
>     hb_glyph_info_t *info = hb_buffer_get_glyph_infos (buffer, NULL);
>     for (unsigned int i = 0; i < num_glyphs; i++)
>     {
>       info->cluster = i;
>       info++;
>     }
>   }
>  
> behdad  
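
If you would rather keep the byte-offset clusters in the buffer and translate
them only when needed, the mapping can be computed from the original string.
The following is a minimal sketch in plain C that does not call HarfBuzz at
all. The helper names (is_utf8_continuation, utf8_offset_to_char_index) are
hypothetical and not part of the HarfBuzz API, and the code assumes the input
is valid, NUL-terminated UTF-8 and that the offset points at the start of a
character.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns nonzero if b is a UTF-8 continuation byte (bit pattern 10xxxxxx). */
static int is_utf8_continuation(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/* Hypothetical helper (not a HarfBuzz function): converts a UTF-8 byte
 * offset, such as a cluster value assigned by hb_buffer_add_utf8(), into a
 * character index. Assumes valid UTF-8 and an offset that lies on a
 * character boundary. */
static size_t utf8_offset_to_char_index(const char *text, size_t byte_offset)
{
    size_t index = 0;

    assert(text != NULL);
    assert(byte_offset <= strlen(text));

    /* Every byte that is not a continuation byte starts a new character,
     * so counting those bytes before the offset gives the character index. */
    for (size_t i = 0; i < byte_offset; i++)
        if (!is_utf8_continuation((unsigned char) text[i]))
            index++;

    return index;
}
```

For "tér" (bytes 74 C3 A9 72 in hexadecimal), the byte offsets 0, 1 and 3 seen
in the cluster values map to character indices 0, 1 and 2. Each lookup walks
the string from the start, so for long texts it may be worth building the whole
offset-to-index table in a single pass instead.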
