[Libreoffice-bugs] [Bug 71329] No linebreak between Latin text and Ideographic punctuation
bugzilla-daemon at bugs.documentfoundation.org
Sat Dec 9 06:09:07 UTC 2017
https://bugs.documentfoundation.org/show_bug.cgi?id=71329
--- Comment #11 from Mark Hung <marklh9 at gmail.com> ---
(In reply to Mark Hung from comment #10)
> This is still an issue in 5.3 - Phrases separated by full-width comma or
> full-width dot are treated as one single word, hence it is put to the next
> line.
Correction:
In [1] we use the language of the last portion to choose the locale for the
break iterator. The test document has "en_US" there, and the Unicode break
iterator finds an incorrect word boundary.
There are a few issues:
1. The heuristic rule is wrong in this case.
2. The Unicode break iterator does not break before ideographic punctuation.
3. The word-breaking algorithm in UAX #29 [2] should work for us. Why do we
need break iterators for three scripts?
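
To illustrate the behaviour expected in item 2, here is a minimal Python
sketch. It is not the LibreOffice break iterator and not a full UAX #29
implementation: it assumes a single simplified rule (never break between two
alphanumeric characters, break everywhere else), and the function names
word_boundaries and split_words are hypothetical.

```python
from typing import List


def word_boundaries(text: str) -> List[int]:
    """Return the offsets at which a word boundary falls.

    Simplified assumption: the only "do not break" rule is between two
    alphanumeric characters (loosely modelled on WB5). Everything else,
    including the position before and after a full-width comma or an
    ideographic full stop, is treated as a boundary.
    """
    if not text:
        return []
    bounds = [0]
    for i in range(1, len(text)):
        # Keep runs of letters/digits together; break at any other position.
        if text[i - 1].isalnum() and text[i].isalnum():
            continue
        bounds.append(i)
    bounds.append(len(text))
    return bounds


def split_words(text: str) -> List[str]:
    """Cut the text at every boundary reported by word_boundaries."""
    bounds = word_boundaries(text)
    return [text[start:end] for start, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Latin words separated by U+FF0C and terminated by U+3002.
    print(split_words("Hello\uFF0Cworld\u3002"))
```

With this rule the full-width comma becomes a segment of its own, so the
Latin text on either side is no longer treated as one single word.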
[1]
https://cgit.freedesktop.org/libreoffice/core/tree/sw/source/core/text/guess.cxx#n355
[2] http://unicode.org/reports/tr29/#WB5