[Libreoffice-bugs] [Bug 114760] New: Word Count of Chinese mixed with English text

bugzilla-daemon at bugs.documentfoundation.org
Sat Dec 30 02:26:03 UTC 2017


https://bugs.documentfoundation.org/show_bug.cgi?id=114760

            Bug ID: 114760
           Summary: Word Count of Chinese mixed with English text
           Product: LibreOffice
           Version: 6.1.0.0.alpha0+ Master
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: normal
          Priority: medium
         Component: LibreOffice
          Assignee: libreoffice-bugs at lists.freedesktop.org
          Reporter: pswo10680 at gmail.com

Description:
The Word Count dialog has a "Words" field. For English text it counts words
and ignores symbols (punctuation), whereas for Chinese text it counts
characters AND symbols.

Chinese text is counted in one of two ways: one counts Chinese characters and
symbols, and the other counts only Chinese characters (no symbols). The
former method, which counts Chinese symbols, is much more popular in the
press.

So when we count a document that contains both Chinese and English text, we
add the English word count (not counting symbols) and the Chinese count
(either counting symbols or not) together.

The "Words" count in LibreOffice currently adds English "words" to "Chinese
characters and Chinese symbols", i.e. it uses the first method above for the
Chinese part. I think that is confusing because we regard one "phonogram
word" as equivalent to one "Chinese character."

The "Words" count should be split into two fields:
1. Words => corrected to count only words and Chinese characters.
2. Words and Chinese symbols => the method the "Words" count uses now.

Steps to Reproduce:
1. Open Writer
2. Copy and paste "Hello, world! 世界,你好!" into the document
3. Select Tools > Word Count to see the stats

Actual Results:  
1. Words: 8
2. Characters including spaces: 20
3. Characters excluding spaces: 18
4. Asian characters and Korean syllables: 6

Expected Results:
The sentence "Hello, world! 世界,你好!" contains 2 English words (Hello world),
4 Chinese characters (世界你好), 4 symbols (,!,!), 2 of which are Chinese
symbols (,!), and 2 spaces.

1. Words: 6 => Should be corrected so that "Words" does not include symbols
2. Words and Chinese symbols: 8 => What the "Words" count reports now
3. Words and symbols: 10
4. Characters including spaces: 20
5. Characters excluding spaces: 18
6. Asian characters and Korean syllables: 6
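
The expected numbers above can be reproduced with a small Python sketch. It
is only an illustration of the proposed counting rules, not LibreOffice's
actual implementation. The function names, the Unicode ranges used to detect
Chinese characters and Chinese symbols, and the assumption that the Chinese
punctuation in the sample is full-width (U+FF0C, U+FF01) are all mine. The
"Asian characters and Korean syllables" field is not modeled.

```python
import re
import unicodedata

# Assumption: the Chinese comma and exclamation mark are the full-width
# forms U+FF0C and U+FF01, written as escapes to make that explicit.
SAMPLE = "Hello, world! 世界\uff0c你好\uff01"


def is_han(ch):
    # Simplification: only the basic CJK Unified Ideographs block.
    return "\u4e00" <= ch <= "\u9fff"


def is_symbol(ch):
    # Unicode general categories P* (punctuation) and S* (symbols).
    return unicodedata.category(ch)[0] in ("P", "S")


def is_cjk_symbol(ch):
    # Symbols in "CJK Symbols and Punctuation" or
    # "Halfwidth and Fullwidth Forms" (hypothetical definition).
    in_cjk_block = "\u3000" <= ch <= "\u303f" or "\uff00" <= ch <= "\uffef"
    return is_symbol(ch) and in_cjk_block


def count_stats(text):
    # English-style words: runs of ASCII letters and digits.
    latin_words = len(re.findall(r"[A-Za-z0-9]+", text))
    han = sum(1 for ch in text if is_han(ch))
    symbols = sum(1 for ch in text if is_symbol(ch))
    cjk_symbols = sum(1 for ch in text if is_cjk_symbol(ch))
    words = latin_words + han
    return {
        "words": words,                                # proposed "Words"
        "words_and_cjk_symbols": words + cjk_symbols,  # current "Words"
        "words_and_symbols": words + symbols,
        "chars_with_spaces": len(text),
        "chars_without_spaces": sum(1 for ch in text if not ch.isspace()),
    }


if __name__ == "__main__":
    for name, value in count_stats(SAMPLE).items():
        print(f"{name}: {value}")
```

For the sample sentence this gives 6, 8, 10, 20 and 18, matching items 1 to 5
of the expected results.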


Reproducible: Always


User Profile Reset: No



Additional Info:


User-Agent: Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:57.0) Gecko/20100101
Firefox/57.0
