[FriBidi] FriBidi and emoji modifiers

Romain Ouabdelkader romain.ouabdelkader at gmail.com
Fri Sep 18 11:41:35 PDT 2015


Thank you for your fast response,

LRI/PDI seems like the best solution, but I have already tried it without
success. I will try again with the patch applied (on the unicode63 branch).
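
For reference, the wrapping step can be sketched roughly as follows. This is
only a sketch under assumptions: is_emoji_like() and isolate_emoji_runs() are
hypothetical helpers, not part of FriBidi, and the code point ranges are a
rough approximation of "emoji-like" rather than a complete test. It works on
plain UTF-32 text before anything is handed to FriBidi.

```cpp
#include <string>

// Hypothetical predicate: a rough stand-in for "emoji-like" code points.
// It only covers a few supplementary-plane blocks and is not exhaustive.
bool is_emoji_like(char32_t c) {
  return (c >= 0x1F300 && c <= 0x1F64F) ||  // pictographs, modifiers, emoticons
         (c >= 0x1F680 && c <= 0x1F6FF) ||  // transport and map symbols
         (c >= 0x1F900 && c <= 0x1F9FF);    // supplemental symbols
}

// Hypothetical helper: wrap every maximal run of emoji-like code points
// in LRI ... PDI so the run is resolved as an isolated left-to-right unit.
std::u32string isolate_emoji_runs(const std::u32string& in) {
  const char32_t LRI = 0x2066;  // LEFT-TO-RIGHT ISOLATE
  const char32_t PDI = 0x2069;  // POP DIRECTIONAL ISOLATE
  std::u32string out;
  bool in_run = false;
  for (char32_t c : in) {
    const bool emoji = is_emoji_like(c);
    if (emoji && !in_run) out.push_back(LRI);  // a run starts here
    if (!emoji && in_run) out.push_back(PDI);  // the run ended before c
    out.push_back(c);
    in_run = emoji;
  }
  if (in_run) out.push_back(PDI);  // close a run that reaches the end
  return out;
}
```

The result is longer than the input by two code points per run, so the
buffers passed to FriBidi have to be sized for that, and the isolates only
have an effect with a FriBidi build that implements them. The inserted marks
would presumably also need to be stripped again after reordering.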

Regards,
Romain Ouabdelkader.

2015-09-17 15:08 GMT+02:00 Dov Grobgeld <dov.grobgeld at gmail.com>:

> It is correct according to the Bidi Algorithm: both the emoji and its
> modifiers are neutral characters, so they inherit their direction
> from the context, and in this case become RTL. Off the top of my head, I
> can think of a couple of options for resolving this:
>
> 1. Before doing the directional rearrangement, surround all runs of
> emoji-like characters with LRI/PDI (requires non-released fribidi).
> 2. Check the run level of the emoji after the bidi algorithm and raise it
> to the nearest higher even run level before calling
> fribidi_reorder_line().
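
Option 2 above could look roughly like the sketch below. It is an
illustration under assumptions, not FriBidi code: looks_like_emoji() and
raise_emoji_levels() are hypothetical helpers, Level is a stand-in for
FriBidiLevel, and the code point ranges are only a rough approximation. The
function would run on the levels array after the embedding levels have been
computed and before the line is reordered.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for FriBidiLevel (assumed here to be a small integer type).
using Level = signed char;

// Hypothetical predicate: rough approximation of "emoji-like" code points.
bool looks_like_emoji(char32_t c) {
  return (c >= 0x1F300 && c <= 0x1F64F) ||  // pictographs, modifiers, emoticons
         (c >= 0x1F680 && c <= 0x1F6FF) ||  // transport and map symbols
         (c >= 0x1F900 && c <= 0x1F9FF);    // supplemental symbols
}

// Hypothetical helper: raise every emoji-like character that sits on an
// odd (RTL) level to the next even (LTR) level. Characters that are
// already on an even level are left untouched.
void raise_emoji_levels(const std::vector<char32_t>& text,
                        std::vector<Level>& levels) {
  for (std::size_t i = 0; i < text.size() && i < levels.size(); ++i) {
    if (looks_like_emoji(text[i]) && (levels[i] % 2) != 0) {
      levels[i] = static_cast<Level>(levels[i] + 1);
    }
  }
}
```

With the example string, the baby and the tone modifier would go from level
1 to level 2, so the pair is reversed once as its own run and once more with
the whole line, which leaves the modifier after the emoji. The sketch does
not check whether the raised level exceeds the maximum depth FriBidi allows.
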
>
> I'm sure there are other options as well. Probably Behdad will come up
> with a simple solution that I haven't thought of. :-)
>
> Regards,
> Dov
>
>
>
>
> On Thu, Sep 17, 2015 at 3:54 PM, Romain Ouabdelkader <
> romain.ouabdelkader at gmail.com> wrote:
>
>> Hi,
>>
>> I'm having trouble getting emoji to work with FriBidi.
>> Basic emoji work fine, but when emoji modifiers are used in an RTL
>> language, the modifiers end up before the emoji.
>>
>> Here I have a string with Arabic text, the emoji U+1f476 (a baby) and
>> a tone modifier U+1f3fb:
>>
>>
>> const char utf8_input[] = u8"اختبار \U0001f476\U0001f3fb";
>> int utf8_len = sizeof(utf8_input) - 1;
>>
>> std::unique_ptr<FriBidiChar[]> unicode_str(new FriBidiChar[utf8_len]);
>>
>> FriBidiCharSet utf8_charset = fribidi_parse_charset("UTF-8");
>> int len_unicode = fribidi_charset_to_unicode(utf8_charset, utf8_input,
>>                                              utf8_len, unicode_str.get());
>>
>> std::unique_ptr<FriBidiCharType[]> bidi_types(
>>     new FriBidiCharType[len_unicode]);
>> std::unique_ptr<FriBidiLevel[]> levels(new FriBidiLevel[len_unicode]);
>> FriBidiParType base = FRIBIDI_PAR_ON;
>>
>> fribidi_get_bidi_types(unicode_str.get(), len_unicode, bidi_types.get());
>> fribidi_get_par_embedding_levels(bidi_types.get(), len_unicode, &base,
>> levels.get());
>>
>> fribidi_reorder_line(0, bidi_types.get(), len_unicode, 0, base,
>>                      levels.get(), unicode_str.get(), NULL);
>>
>> std::cout << std::hex;
>> for (int i = 0; i < len_unicode; ++i)
>>   {
>>     std::cout << "\\u" << unicode_str[i];
>>   }
>> std::cout << std::endl;
>>
>>
>> Output:
>> \u1f3fb\u1f476\u20\u631\u627\u628\u62a\u62e\u627
>>
>> As you can see, the tone modifier U+1f3fb comes first, followed by the
>> emoji U+1f476.
>> I've also tested this with log2vis().
>>
>> Is this a bug? If not, what is the correct way to handle emojis?
>>
>> Regards,
>> Romain Ouabdelkader.