Hi,<div><br></div><div>hb_icu_unicode_decompose() uses ICU's u_strlen() to get the number of Unicode code points in the normalized buffer. However, u_strlen() returns the number of UChars in the buffer, and UChar is equivalent to uint16_t (one UTF-16 code unit). This means we get the wrong number of code points when the buffer contains surrogate pairs, which eventually causes an infinite loop during decomposition. For example, if the function is called like this:</div>
<div><br></div><div> hb_codepoint_t a, b;</div><div> hb_icu_unicode_decompose(0/*unused*/, 0x1f1ef /* REGIONAL INDICATOR SYMBOL LETTER J */, &a, &b, 0/*unused*/);</div><div><br></div><div>then it returns TRUE with *a == 0x1f1ef, i.e. the "decomposition" is the input code point itself. This leads to an infinite loop in decompose(). The attached patch should fix the problem.</div>
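<div><br></div><div>To illustrate the mismatch, here is a minimal standalone sketch (not the attached patch, and not HarfBuzz code) that counts code units and code points in a NUL-terminated UTF-16 buffer. The helper names count_code_units_utf16() and count_code_points_utf16() are made up for this example; the first mirrors what u_strlen() reports, the second treats a high surrogate followed by a low surrogate as a single code point. ICU's own u_countChar32() could presumably be used for the same purpose.</div>

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of UTF-16 code units before the terminating
 * NUL. This is the quantity u_strlen() reports for a UChar string. */
static size_t count_code_units_utf16(const uint16_t *s)
{
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}

/* Hypothetical helper: number of Unicode code points before the
 * terminating NUL. A high surrogate (0xD800..0xDBFF) immediately followed
 * by a low surrogate (0xDC00..0xDFFF) counts as one code point; an
 * unpaired surrogate is counted as one code point on its own. */
static size_t count_code_points_utf16(const uint16_t *s)
{
    size_t n = 0;
    while (*s != 0) {
        if (s[0] >= 0xD800 && s[0] <= 0xDBFF &&
            s[1] >= 0xDC00 && s[1] <= 0xDFFF)
            s += 2; /* surrogate pair: two code units, one code point */
        else
            s += 1; /* BMP character or unpaired surrogate */
        n++;
    }
    return n;
}
```

<div>For U+1F1EF, whose UTF-16 encoding is the surrogate pair 0xD83C 0xDDEF, the first helper yields 2 and the second yields 1, which is the difference that confuses the length check in hb_icu_unicode_decompose().</div>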
<div><br></div><div>Thanks,</div><div><br></div>