<div>
Thanks for the quick reply. It works perfectly.
</div>
<div><div><br></div><div>-- </div><div>Lóci</div><div><br></div></div>
<p style="color: #A0A0A8;">On Sunday, October 14, 2012 at 7:38 PM, Behdad Esfahbod wrote:</p>
<blockquote type="cite" style="border-left-style:solid;border-width:1px;margin-left:0px;padding-left:10px;">
<span><div><div><div>On 12-10-14 12:31 PM, Lóránt Pintér wrote:</div><blockquote type="cite"><div><div>Hi,</div><div><br></div><div>I'm trying to shape the word "tér" with HarfBuzz, and this is what I get back:</div><div><br></div><div>hb_buffer_get_glyph_infos() after calling hb_buffer_add_utf8():</div><div><br></div><div>Char #0: { codepoint: 116, mask: 1, cluster: 0, var1: 0, var2: 0 }</div><div>Char #1: { codepoint: 233, mask: 1, cluster: 1, var1: 0, var2: 0 }</div><div>Char #2: { codepoint: 114, mask: 1, cluster: 3, var1: 0, var2: 0 }</div><div><br></div><div>…and after calling hb_shape():</div><div><br></div><div>Glyph #0: { codepoint: 86, mask: 1, cluster: 0, var1: 2, var2: 5 }</div><div>Glyph #1: { codepoint: 156, mask: 1, cluster: 1, var1: 2, var2: 5 }</div><div>Glyph #2: { codepoint: 84, mask: 1, cluster: 3, var1: 2, var2: 5 }</div><div><br></div><div>I believed up to now that each cluster corresponded to a character in the</div><div>original string. Why is the letter "é" turned into two clusters here?</div></div></blockquote><div><br></div><div>When you use add_utf8, cluster values are set to UTF-8 indices into the</div><div>original string. The precomposed "é" letter takes two bytes in UTF-8, that's</div><div>why you see what you see. If you prefer plain character-index instead, just</div><div>loop over and set the cluster values before calling shape. This is from</div><div>hb/util/options.hh for example:</div><div><br></div><div> if (!utf8_clusters) {</div><div> /* Reset cluster values to refer to Unicode character index</div><div> * instead of UTF-8 index. */</div><div> unsigned int num_glyphs = hb_buffer_get_length (buffer);</div><div> hb_glyph_info_t *info = hb_buffer_get_glyph_infos (buffer, NULL);</div><div> for (unsigned int i = 0; i < num_glyphs; i++)</div><div> {</div><div> info->cluster = i;</div><div> info++;</div><div> }</div><div> }</div><div><br></div><div>behdad</div></div></div></span>
</blockquote>
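<div>For anyone finding this thread later, here is a minimal sketch of why the clusters come out as 0, 1, 3. It does not call HarfBuzz at all. The helper name utf8_char_offsets is my own (hypothetical, not a HarfBuzz function), and it assumes the input is valid, NUL-terminated UTF-8. It records the byte offset at which each character starts, which is the value add_utf8 puts in the cluster field according to the explanation above.</div>

```c
/* Hypothetical helper, not part of HarfBuzz.
 * Writes the byte offset at which each code point of the UTF-8 string `s`
 * starts into `offsets` (at most `max` entries are written) and returns the
 * total number of code points found.
 * Assumption: `s` is valid, NUL-terminated UTF-8. */
static unsigned int
utf8_char_offsets (const char *s, unsigned int *offsets, unsigned int max)
{
  unsigned int count = 0;
  for (unsigned int i = 0; s[i] != '\0'; i++)
  {
    /* Continuation bytes have the bit pattern 10xxxxxx; any other byte
     * starts a new code point. */
    if (((unsigned char) s[i] & 0xC0) != 0x80)
    {
      if (count < max)
        offsets[count] = i;
      count++;
    }
  }
  return count;
}
```

<div>For "tér" encoded as the bytes 74 C3 A9 72, this finds three characters starting at byte offsets 0, 1 and 3, the same numbers as the cluster values in my original output. The position of an entry in the offsets array is the plain character index that the loop from options.hh assigns instead.</div>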