[Libreoffice-bugs] [Bug 138591] Using Unicode conversion on a combined emoji results in only partial conversion

Wed Dec 2 09:54:38 UTC 2020

https://bugs.documentfoundation.org/show_bug.cgi?id=138591

--- Comment #4 from Justin L <jluth at mail.com> ---
This might be VERY complex to do completely correctly, since there does not
seem to be a single standard way of marking combined emoji sequences.

The latest version of the "Unicode Emoji" spec can be found at
http://www.unicode.org/reports/tr51/.

Having glanced through the spec, I imagine adding some kind of logic like:
const sal_uInt32 nZWJ = 0x200D; // ZERO WIDTH JOINER (U+FE0F is VARIATION SELECTOR-16)

if ( maInput.getLength() == 0 )
    bIsEmojiSequence = isEmoji(next);
if ( isEmoji_modifier_base(next) )
    bHaveEmoji_modifier_base = true;

if ( bIsEmojiSequence )
{
    if ( next == nZWJ || (isEmoji(next) && !bHaveEmoji_modifier_base) )
    {
        // continue to accept new characters
    }
}

It looks like this will require some low-level identification of emoji, since
there is no classification yet such as
::com::sun::star::i18n::UnicodeType::EMOJI
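
As a rough illustration of what that low-level identification might involve,
here is a standalone sketch in plain C++, not LibreOffice code. The helper
names (isEmoji, isEmojiModifier, allowMoreInput) are hypothetical, and the
code point ranges in isEmoji are a coarse approximation rather than the real
UTS #51 data tables. Treating every emoji as a possible modifier base is also
a simplification.

```cpp
#include <string>

constexpr char32_t ZWJ  = 0x200D; // ZERO WIDTH JOINER
constexpr char32_t VS16 = 0xFE0F; // VARIATION SELECTOR-16 (emoji presentation)

// Hypothetical helper: skin tone modifiers U+1F3FB..U+1F3FF.
bool isEmojiModifier(char32_t c)
{
    return c >= 0x1F3FB && c <= 0x1F3FF;
}

// Hypothetical helper: approximate check covering a few blocks that hold
// most pictographic emoji. Real code would need the Unicode emoji data.
bool isEmoji(char32_t c)
{
    return (c >= 0x1F300 && c <= 0x1FAFF) || (c >= 0x2600 && c <= 0x27BF);
}

// Returns true if `next` may extend the sequence collected so far.
bool allowMoreInput(const std::u32string& seq, char32_t next)
{
    if (seq.empty())
        return isEmoji(next);               // a sequence starts with an emoji

    const char32_t prev = seq.back();
    if (next == ZWJ)
        return prev != ZWJ;                 // no two joiners in a row
    if (prev == ZWJ)
        return isEmoji(next) && !isEmojiModifier(next);
    if (next == VS16 || isEmojiModifier(next))
        return isEmoji(prev) && !isEmojiModifier(prev);
    return false;                           // anything else ends the sequence
}
```

With this sketch, feeding U+1F468, U+200D, U+1F469 one at a time keeps
returning true, while a plain letter after U+1F468 returns false and so ends
the sequence. Flag, keycap, and tag sequences are not handled at all.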
