[Libreoffice-bugs] [Bug 134350] ICU locale assignment at word bounds for mixed CJK and Western text, wrong assignment for opening and closing the text run

bugzilla-daemon at bugs.documentfoundation.org
Wed May 26 15:01:22 UTC 2021


https://bugs.documentfoundation.org/show_bug.cgi?id=134350

Volga <shanshandehongxing at outlook.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hiunnhue108 at ymail.com,
                   |                            |marklh9 at gmail.com

--- Comment #6 from Volga <shanshandehongxing at outlook.com> ---
(In reply to Aron Budea from comment #2)
> Already the same in 3.3.0.
There are also some long-standing complaints about this:
https://yongweiwu.wordpress.com/2014/12/18/a-complaint-of-odfs-asian-language-support/
https://ask.libreoffice.org/en/question/19750/problem-with-full-width-asian-punctuation/

(In reply to V Stuart Foote from comment #5)
> (In reply to Volga from comment #4)
> > (In reply to V Stuart Foote from comment #3)
> > > Not sure this is incorrect.
> > > 
> > > We use ICU libs to change locale for text run at word bounds. Not sure we
> > > can then look back and change the locale for the opening word bound--here
> > > U+0020, U+2018 or U+201c--to match the locale assigned to the text run.  Or
> > > conversely change the closing word bound back when passing into the next run.
> > There is a solution posted at bug 66791, WDYT?
> 
> That does not look like a solution, to hardcode typical punctuation to
> paragraph language or document default language, and bypass ICU handling.
> But it might be more performant than capturing closing and opening
> punctuation around embedded different language text runs.
If the ODF specification gave guidance on this, it would be easy for every
developer to decide which locale and font to assign to such characters. From my
investigation, CJK fonts usually give these punctuation marks the same width as
CJK ideographs (i.e. full-width), and there are more characters to consider:
U+2013 (Chinese only), U+2014-16, U+2018-19, U+201C-1D, U+2022, U+2024-27,
U+2032-33, U+2035, U+203B.
In any case, we also need to investigate which characters MS Office prefers to
assign to CJK fonts in documents with mixed CJK and Western text, especially
in the CJK versions of MS Office.
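
To make the list above easier to discuss, here is a minimal Python sketch that
encodes those code points as a lookup. It is only an illustration, not
LibreOffice or ICU code: the function name, the BCP 47 language-tag parameter
and the rule "use the CJK font only when the surrounding language is zh, ja or
ko" are all assumptions of mine.

```python
# Hypothetical sketch: code points from the list above that CJK fonts
# usually draw full-width. Not taken from LibreOffice or ICU.
CJK_WIDTH_PUNCTUATION = [
    (0x2014, 0x2016),
    (0x2018, 0x2019),
    (0x201C, 0x201D),
    (0x2022, 0x2022),
    (0x2024, 0x2027),
    (0x2032, 0x2033),
    (0x2035, 0x2035),
    (0x203B, 0x203B),
]
CHINESE_ONLY = {0x2013}  # EN DASH, listed above as "Chinese only"
CJK_LANGUAGES = {"zh", "ja", "ko"}  # assumption: primary subtags treated as CJK


def wants_cjk_font(ch: str, lang_tag: str) -> bool:
    """Return True if `ch` should take the CJK font in text tagged `lang_tag`.

    `lang_tag` is assumed to be a BCP 47 tag such as "zh-CN" or "en-US".
    """
    primary = lang_tag.split("-")[0].lower()
    if primary not in CJK_LANGUAGES:
        return False
    cp = ord(ch)
    if cp in CHINESE_ONLY:
        return primary == "zh"
    return any(lo <= cp <= hi for lo, hi in CJK_WIDTH_PUNCTUATION)
```

For example, wants_cjk_font("\u201c", "zh-CN") is true, while
wants_cjk_font("\u2013", "ja-JP") is false because U+2013 is listed for Chinese
only. Whether the decision should follow the paragraph language, the document
default or the neighbouring run is exactly the open question in this bug.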
