Charset management rework
Aleksander Morgado
aleksander at aleksander.es
Tue Feb 16 09:11:39 UTC 2021
Hey all,
The following merge request in gitlab contains a rework of how
charsets are managed in the MM daemon. Reviews and comments welcome!
https://gitlab.freedesktop.org/mobile-broadband/ModemManager/-/merge_requests/438
The charset APIs have been changed so that there are now two sets: one to
work with strings and one to work with bytearrays. When using UCS2, UTF-16
or GSM-7 (the charsets that support embedded NUL bytes), bytearrays
should be the default, and for all the others, strings should be
the default. GSM-7 is a bit of an exception because in some cases
we do allow it to be used as strings; the '@' character is the one
encoded as a NUL byte, so we try to take care of that as best as
possible. I've already reworked the whole codebase and plugins to use
the new APIs.
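To illustrate why embedded NUL bytes push these charsets towards bytearrays, here is a minimal Python sketch. It is not ModemManager code (the daemon is written in C), the helper names are made up for illustration, and the alphabet table is only a small subset of the GSM-7 default alphabet. It emulates a NUL-terminated C string by cutting the data at the first zero byte.

```python
# Hypothetical, partial GSM-7 default alphabet: only what the example needs.
# '@' is code 0x00; ASCII letters and '.' keep their ASCII values.
GSM7_SUBSET = {"@": 0x00}
for ch in "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ.":
    GSM7_SUBSET[ch] = ord(ch)


def gsm7_encode_unpacked(text):
    """Encode text as unpacked GSM-7 septets, one septet per byte."""
    return bytes(GSM7_SUBSET[ch] for ch in text)


def as_c_string(data):
    """Emulate what a NUL-terminated C string would keep of the data."""
    return data.split(b"\x00", 1)[0]


encoded = gsm7_encode_unpacked("user@example.org")
print(encoded)               # b'user\x00example.org'
print(as_c_string(encoded))  # b'user' (everything after '@' is lost)

# UTF-16 has the same problem: plain ASCII text starts with a zero byte.
utf16 = "Hi".encode("utf-16-be")
print(as_c_string(utf16))    # b''
```

Carrying an explicit length next to the data, as a bytearray does, avoids the truncation in both cases.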
The translit support, as well as the "best effort" fallbacks to get a
string decoded, are now applied only in very specific cases, like when
trying to get the network operator name. There is no transliteration
or "best effort" applied when e.g. decoding an SMS that is said to be
in UTF-16; if it's not UTF-16 we won't e.g. try to "see if it's ASCII"
instead, or things like that. I'm not totally sure whether
we're breaking anything else with this logic, because unfortunately
being stricter about how the current charset is used means we could fail
to decode strings that the old implementation did decode thanks
to those best-effort fallbacks.
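The difference between the two behaviours can be sketched roughly as follows. This is a Python illustration under assumptions, not the daemon's actual logic: the function names and the fallback list are hypothetical, and Python's codecs stand in for the charset conversion.

```python
def decode_strict(data, charset):
    """Decode using only the declared charset; raise if the data does not fit."""
    return data.decode(charset, errors="strict")


def decode_best_effort(data, charset, fallbacks=("ascii",)):
    """Try the declared charset first, then each fallback; None if all fail."""
    for candidate in (charset, *fallbacks):
        try:
            return data.decode(candidate, errors="strict")
        except UnicodeDecodeError:
            continue
    return None


# 5 bytes: not valid UTF-16 (odd length), but perfectly valid ASCII.
payload = b"Hello"

print(decode_best_effort(payload, "utf-16-be"))  # Hello

try:
    decode_strict(payload, "utf-16-be")
except UnicodeDecodeError as error:
    print("strict decoding failed:", error.reason)
```

With the strict variant, a payload that is mislabelled as UTF-16 is reported as an error instead of being silently reinterpreted, which is the trade-off described above.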
The other big change is that we remove the use of the //TRANSLIT
extension completely. This extension is specific to the GNU libc and
isn't available in other libc implementations like musl. This
change makes emojis (well, all UTF-16, but emojis seem to be what
matters most! :) ) work when receiving and sending SMS messages on e.g.
postmarketOS-based phones.
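As a side note on what "emojis in UTF-16" means at the byte level, here is a small Python sketch. It only shows the encoding itself and does not reproduce iconv or //TRANSLIT behaviour; the sample text is an arbitrary choice.

```python
text = "Hi \U0001F600"  # U+1F600, an emoji outside the Basic Multilingual Plane

encoded = text.encode("utf-16-be")
print(encoded.hex())  # 004800690020d83dde00

# The emoji takes two 16-bit code units (a surrogate pair: D83D DE00),
# so the message is 5 code units long for 4 characters.
code_units = len(encoded) // 2
print(len(text), code_units)  # 4 5

# Decoding as UTF-16 restores the original text.
assert encoded.decode("utf-16-be") == text
```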
The first tests with this branch on postmarketOS yield very good
results, but given it's a big change, I'm not totally sure we should
include it in the next MM 1.16. Or maybe it would make sense, because
bringing fixes to the users would be quicker if it's already in the
stable branch? What do you all think?
My plan was to release MM 1.16 without this branch, and then maybe tag
a new quick MM 1.17.1 from master once this branch is merged, so that
postmarketOS can use it.
--
Aleksander
https://aleksander.es