[Libreoffice] [PATCH] Fix for bug / feature request 30550 - Character count without spaces

LeMoyne jlc at mail2lee.com
Wed Oct 27 16:47:48 PDT 2010


Mattias, 

No problem at all.  I recompiled with your original, simpler if-then
statement, and I get the same counts for your reference OASIS Metadata
Examples .odt file:
http://nabble.documentfoundation.org/file/n1783515/07-08-22-MetaData-Examples.odt

>>> Just one quick test, but the counts agree exactly with and without the
>>> length test.

     So it really does seem that the len=1 strings are just the break char,
and one-char words must come through as char+break.
     You were correct to just slip in the minimal fix and then look at all
the other problems.  I really got bogged down in the greater context and in
the scanner weirdness.  I didn't get less confused until doing the simple
test of switching between your patch and Cedric's patch.
     On one hand, the current method will count the same in other languages
as long as their space char has a uint val of 32.  In other words, the
present counter can't tell an upside-down exclamation point from an A: it's
all not-a-space.  On the other hand, there is almost certainly implicit
casting involved in the whitespace tests (' ' == unicodeCharVar), and that
could really break it on a different code page.  On the gripping hand, I
don't really know.  It does still over-count a leading double quote (") as
its own word, and I'm pretty clueless about that pre-existing condition
except to strongly suspect the scanner ;-)  The double quote isn't in the
whitespace list at the top of the file.
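
     For illustration only, here is a minimal C++ sketch of the kind of
space test described above.  It is not the LibreOffice code: the function
name countWithoutAsciiSpaces and the use of std::u32string are assumptions
made up for this example.  It only shows that a bare (' ' == ch) comparison
treats every code point other than 32 as a character, so an upside-down
exclamation point counts exactly like an A, and a non-breaking space
(U+00A0) would be counted too.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not taken from the LibreOffice sources): counts the
// code points in `text` that are not the plain space character U+0020.
std::size_t countWithoutAsciiSpaces(const std::u32string& text)
{
    std::size_t count = 0;
    for (char32_t ch : text)
    {
        // The narrow literal ' ' is implicitly converted to the integer
        // value 32 before being compared with the 32-bit code point.
        if (' ' == ch)
            continue;
        ++count;
    }
    return count;
}
```

     Under these assumptions, countWithoutAsciiSpaces(U"A b") gives 2,
U"\u00A1Hola" gives 5 (the inverted exclamation point is just another
non-space), and U"a\u00A0b" gives 3 because the non-breaking space does not
have the value 32.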
     I will try to look closer to see what the scanner is actually starting
with and giving back as it expands and breaks up the node text.  I may not
get to that for a while, so please don't let me stop you.  For clarity and
completeness you may want to pull the numbering/bullets stuff into line with
your fix on the main node text and just re-submit your simpler test.
     The documentation folks will laugh if/when they find out we count
bullets as a word.  But only because they are in a good mood: they will be
happy with your patch. 
- LeMoyne

-- 
View this message in context: http://nabble.documentfoundation.org/PATCH-Fix-for-bug-feature-request-30550-Character-count-without-spaces-tp1778667p1783515.html
Sent from the Dev mailing list archive at Nabble.com.