[REVIEW 3-5] fdo#49208 ridiculous performance on certain .doc

Wed May 2 02:55:47 PDT 2012

On Tue, 2012-05-01 at 22:26 +0200, Muhammad Haggag wrote:
> Is it OK to break Unicode equivalence
> (http://en.wikipedia.org/wiki/Unicode_equivalence) by doing a memory
> comparison?

Yeah, because the existing code doesn't pay any attention to stuff like
that. Presumably we wave a magic wand somewhere and claim that
everything inside a given boundary is in some normalized form :-)

>  (I assume ICU does proper comparisons taking equivalence
> into account, hence why it's slow).

nah, it just loops over all the chars like so...

 do {
   result = ((int32_t)*(chars++) - (int32_t)*(srcChars++));
   if(result != 0) {
     return (int8_t)(result >> 15 | 1);
   }
 } while(--minLength > 0);

The new code is equivalent for the equal/non-equal case.
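
To illustrate that claim, here is a minimal sketch, not the actual patch. It assumes UTF-16 code units stored as uint16_t; the function names compare_loop and equal_memcmp are made up for this example. It puts a loop in the style of the snippet above next to a plain memory comparison and shows that they agree on "equal or not".

```c
#include <stdint.h>
#include <string.h>

/* Loop in the style of the ICU snippet quoted above.
 * Returns 0 if the first minLength code units match, otherwise -1 or 1.
 * Note: like the original, this relies on right-shifting a negative
 * int32_t being an arithmetic shift, which is implementation-defined in C
 * but is what common compilers do. */
static int8_t compare_loop(const uint16_t *chars, const uint16_t *srcChars,
                           int32_t minLength)
{
    int32_t result;

    if (minLength <= 0) {
        return 0; /* nothing to compare */
    }
    do {
        result = ((int32_t)*(chars++) - (int32_t)*(srcChars++));
        if (result != 0) {
            return (int8_t)(result >> 15 | 1);
        }
    } while (--minLength > 0);
    return 0;
}

/* Hypothetical memory-comparison replacement: only answers equal/non-equal.
 * The sign of memcmp's result depends on byte order, so it is deliberately
 * not used for ordering here. */
static int equal_memcmp(const uint16_t *a, const uint16_t *b, int32_t length)
{
    if (length <= 0) {
        return 1; /* two empty ranges are equal */
    }
    return memcmp(a, b, (size_t)length * sizeof(uint16_t)) == 0;
}
```

Neither function knows anything about Unicode equivalence: a precomposed U+00E9 and the sequence U+0065 U+0301 compare as different with both, so switching from one to the other does not change behaviour in that respect.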

C.