[Libreoffice] [PATCH] Fix for bug / feature request 30550 - Character count without spaces
jlc at mail2lee.com
Wed Oct 27 08:55:27 PDT 2010
I have looked at this problem and agree wholeheartedly with your basic fix.
I was preparing to patch something similar :-) I am *very* glad to see this
basic feature addition get in this week.
As to the question of (some!?) docs not showing the correct count upon open
- I had noticed that but was not able to reproduce it reliably. I too am
new to the LibreOffice code base and the question of where the word count is
done in the background is not blindingly obvious. Also, one of the
OpenOffice issue threads (i#100629) mentions the regression you saw in your
testing versus 3.2.
I think there are a couple of issues involved in the inaccuracies reported
by you and at OOo. The count changing after you touch the document looks
like a count initialization or background counting problem. I think this
problem could be due to changes made in LibreOffice related to timer fixes.
If it is a timer-related problem it may become more obvious with larger
test docs.
The SwScanner did have issues with leading spaces (it used to give an extra
word) and I am pretty sure that is why the length test (aScanner.GetLen() >
1) was placed there. I think leading spaces were counted as a word because
an initial empty string was returned by the scanner. I am not sure the test
is necessary any more, given that the scanner now internally skips leading
whitespace and returns false if the string is empty. Every time I tried to
clean up the code around the scanner the count went way off. My take is that
the scanner is not working well or it would not need the crutch code. The
SwScanner appears to be intended for a different purpose (the comments talk
about scripts a lot) and it is called in only one other place. It is working
way too hard to produce a simple count.
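
To illustrate the leading-space effect described above, here is a minimal
Python sketch. It is not the SwScanner code. The function names are
hypothetical, and the guard is a plain non-empty check, which is my
assumption rather than the exact length test quoted above. It only shows how
an initial empty token inflates a count and how either a guard or a scanner
that skips leading whitespace avoids it.

```python
def count_tokens_naive(text):
    # Splitting on a single space yields an empty token for each leading
    # space, so every token (including empty ones) is counted.
    return len(text.split(" "))


def count_tokens_guarded(text):
    # Same split, but empty tokens are ignored by a length guard
    # (hypothetical stand-in for a length test on the scanned token).
    return sum(1 for token in text.split(" ") if len(token) > 0)


def count_tokens_skipping(text):
    # str.split() with no argument skips leading and trailing whitespace
    # and returns an empty list for an empty string, so no guard is needed.
    return len(text.split())


if __name__ == "__main__":
    sample = " hello world"
    print(count_tokens_naive(sample))     # one extra "word" from the leading space
    print(count_tokens_guarded(sample))   # guard removes the empty token
    print(count_tokens_skipping(sample))  # tokenizer never produces it
```

If the tokenizer itself never returns an empty token, the guard becomes
redundant, which is the situation I suspect we are in now.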
The OOo counting mechanism had pre-existing problems with word count
(i#89042 and i#100629) related to leading quotation marks and special
characters. The quoted string "word" is counted as 2 words. AFAICT hidden
paragraphs (redlines?) and notes are not counted. This counting mechanism
has always counted isolated punctuation as a word.
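
The two symptoms above can be reproduced with a toy model. This is a sketch
under stated assumptions, not the actual OOo counting mechanism: the regular
expression and both function names are hypothetical, chosen only so that a
leading punctuation run becomes its own token. The second function shows one
possible alternative rule (count only tokens that contain a letter or
digit), which is my suggestion and not something the existing code does.

```python
import re

# Hypothetical tokenizer: a run of punctuation directly before a word
# character is split off as its own token; otherwise a token is any run
# of non-whitespace characters.
_TOKEN = re.compile(r"[^\w\s]+(?=\w)|\S+")


def count_words_reported(text):
    # Toy model of the reported behavior: every token counts as a word,
    # including a detached leading quote and isolated punctuation.
    return len(_TOKEN.findall(text))


def count_words_alnum(text):
    # Alternative rule (an assumption): split on whitespace and count only
    # tokens that contain at least one letter or digit.
    return sum(
        1 for token in text.split() if any(ch.isalnum() for ch in token)
    )


if __name__ == "__main__":
    for sample in ['"word"', "a - b"]:
        print(sample, count_words_reported(sample), count_words_alnum(sample))
```

Under the toy model the quoted string "word" comes out as 2 and "a - b" as
3, matching the reports. With the alternative rule they come out as 1 and 2.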