[Libreoffice-bugs] [Bug 140796] [UI]Writer: Wrong English string for U+2060 character
bugzilla-daemon at bugs.documentfoundation.org
Sat Mar 6 08:09:06 UTC 2021
https://bugs.documentfoundation.org/show_bug.cgi?id=140796
Ming Hua <ming.v.hua at qq.com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ming.v.hua at qq.com
--- Comment #11 from Ming Hua <ming.v.hua at qq.com> ---
(In reply to Julien Nabet from comment #8)
> I wonder if it's not the unicode which should be changed in these for
> u'\xFEFF'
>
> IMHO I think new variables should be created for u'\x2060'
In addition to Pierre-Yves's objection based on actual usage, here is some
background information:
According to Wikipedia [1] and Unicode's FAQ [2], before Unicode version 3.2
(released in 2002), U+FEFF was known as ZERO WIDTH NO-BREAK SPACE and was used
both at the beginning of a data stream to indicate byte order and in the middle
of a data stream to control line-breaking.
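Unicode 3.2 deprecated the latter usage of U+FEFF, leaving it to serve as the
BYTE ORDER MARK (strictly speaking that is an alias: the formal character name
is still ZERO WIDTH NO-BREAK SPACE, because Unicode never changes character
names). It also created U+2060 WORD JOINER for the latter usage and encouraged
people to use it instead of U+FEFF in the middle of a data stream.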
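To make the distinction concrete, here is a minimal Python sketch. It is not
LibreOffice code, and the helper name replace_inner_feff is made up for
illustration. It prints the names the Unicode character database reports for
the two code points, then applies the recommendation above: a leading U+FEFF
is kept as a byte order mark, and any later U+FEFF is replaced with U+2060.
Note that the database still reports the formal name ZERO WIDTH NO-BREAK SPACE
for U+FEFF; BYTE ORDER MARK is an alias.

```python
import unicodedata

BOM = "\ufeff"          # U+FEFF, byte order mark when at the start of a stream
WORD_JOINER = "\u2060"  # U+2060, introduced in Unicode 3.2


def replace_inner_feff(text: str) -> str:
    """Keep a leading U+FEFF and replace every later U+FEFF with U+2060.

    Hypothetical helper for illustration only.
    """
    if not text:
        return text
    head, tail = text[0], text[1:]
    return head + tail.replace(BOM, WORD_JOINER)


# Show the formal character names from the Unicode character database.
for ch in (BOM, WORD_JOINER):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# A stream with a BOM at the start and a legacy U+FEFF in the middle.
sample = BOM + "foo" + BOM + "bar"
fixed = replace_inner_feff(sample)
print([f"U+{ord(c):04X}" for c in fixed if not c.isalnum()])
```

The first character is treated specially because only there does U+FEFF act
as a byte order signature; anywhere else it is the deprecated usage.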
So it seems LibreOffice is just using a deprecated nomenclature in both code
and UI. Most likely the character inserted into documents was changed from
U+FEFF to U+2060 some time after 2002, but the variable names and UI string
were not.
1. https://en.wikipedia.org/wiki/Word_joiner and
https://en.wikipedia.org/wiki/Byte_order_mark
2. https://www.unicode.org/faq/utf_bom.html#bom6
--
You are receiving this mail because:
You are the assignee for the bug.