<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Thu, Jul 16, 2015 at 5:34 PM, aronsoyol <span dir="ltr">&lt;<a href="mailto:aronsoyol@gmail.com" target="_blank">aronsoyol@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><span style="font-size:14px">Thank you very much, I understand your explanation of HarfBuzz.</span><div style="font-size:14px">So, given that CJK punctuation and numbers, as well as Latin punctuation, all have the same script code USCRIPT_COMMON in ICU, what is the best way to classify them?</div></div></blockquote><div><br></div><div>I'd recommend using UTR#50 [1] to determine the orientation of each character, then splitting runs by orientation before passing them to HarfBuzz.</div><div><br></div><div>Unfortunately, there's no API that provides this data, so you need to look up the value for each character yourself. There was a request to add it to ICU, but it hasn't made much progress.</div><div><br></div><div>[1] <a href="http://www.unicode.org/reports/tr50/">http://www.unicode.org/reports/tr50/</a></div><div><br></div><div>/koji</div><div> </div></div></div></div>