[Libreoffice-commits] core.git: sal/rtl

Norbert Thiebaud nthiebaud at gmail.com
Tue Jul 7 00:00:31 PDT 2015


 sal/rtl/ustring.cxx |   24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

New commits:
commit bb9d628552d7a91680ef04c08b1f49cee4ada6bf
Author: Norbert Thiebaud <nthiebaud at gmail.com>
Date:   Sat Jul 4 21:02:35 2015 -0500

    performance tuning of rtl_ustr_indexOfAscii_WithLength()
    
    lcov over make check showed
    
     98      4699997 : sal_Int32 rtl_ustr_indexOfAscii_WithLength(
     99              :     sal_Unicode const * str, sal_Int32 len,
     100             :     char const * subStr, sal_Int32 subLen) SAL_THROW_EXTERN_C()
     101             : {
     102     4699997 :     assert(len >= 0);
     103     4699997 :     assert(subLen >= 0);
     104     4699997 :     if (subLen > 0 && subLen <= len) {
     105             :         sal_Int32 i;
     106    54014537 :         for (i = 0; i <= len - subLen; ++i) {
     107    51036513 :             if (rtl_ustr_asciil_reverseEquals_WithLength(
     108    51036523 :                     str + i, subStr, subLen))
     109             :             {
     110      205482 :                 return i;
     111             :             }
     112             :         }
     113             :     }
     114     4494505 :     return -1;
     115             : }
    
    so
    1/ in about 95% of the calls (4494505 of 4699997) the result is
       not-found; _that_ is the hot path
    2/ we are calling rtl_ustr_asciil_reverseEquals_WithLength close to 11 times
       per call (51036513 / 4699997; on average ~ len - subLen, due to the
       high miss ratio)
    
    so let's first search for the first byte of the substring
    to optimize the 'miss' case, which is the most common one.
    
    Change-Id: I20ef0821db2ff0db5935dd562844a947a14aff64
    Reviewed-on: https://gerrit.libreoffice.org/16763
    Tested-by: Jenkins <ci at libreoffice.org>
    Reviewed-by: Stephan Bergmann <sbergman at redhat.com>

diff --git a/sal/rtl/ustring.cxx b/sal/rtl/ustring.cxx
index b31bc2f..9648fc6 100644
--- a/sal/rtl/ustring.cxx
+++ b/sal/rtl/ustring.cxx
@@ -27,6 +27,7 @@
 #include <cstdlib>
 #include <limits>
 #include <stdexcept>
+#include <string>
 
 #include <osl/diagnose.h>
 #include <osl/interlck.h>
@@ -101,14 +102,25 @@ sal_Int32 rtl_ustr_indexOfAscii_WithLength(
 {
     assert(len >= 0);
     assert(subLen >= 0);
-    if (subLen > 0 && subLen <= len) {
-        sal_Int32 i;
-        for (i = 0; i <= len - subLen; ++i) {
-            if (rtl_ustr_asciil_reverseEquals_WithLength(
-                    str + i, subStr, subLen))
+    if (subLen > 0 && subLen <= len)
+    {
+        sal_Unicode const* end = str + len;
+        sal_Unicode const* cursor = str;
+
+        while(cursor < end)
+        {
+            cursor = std::char_traits<sal_Unicode>::find(cursor, end - cursor, *subStr);
+            if(!cursor || (end - cursor < subLen))
             {
-                return i;
+                /* not enough left to actually have a match */
+                break;
+            }
+            /* now it is worth trying a full match */
+            if (rtl_ustr_asciil_reverseEquals_WithLength(cursor, subStr, subLen))
+            {
+                return cursor - str;
             }
+            cursor += 1;
         }
     }
     return -1;
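
For readers who want to try the idea outside the LibreOffice tree, here is a
minimal standalone sketch of the same "find the first character, then attempt
the full match" loop. It is not the committed code: indexOfAscii is a
hypothetical name, char16_t stands in for sal_Unicode, int for sal_Int32, and a
plain forward comparison loop replaces
rtl_ustr_asciil_reverseEquals_WithLength.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for rtl_ustr_indexOfAscii_WithLength.
// Assumptions: char16_t replaces sal_Unicode, int replaces sal_Int32.
int indexOfAscii(const char16_t* str, int len, const char* subStr, int subLen)
{
    if (subLen <= 0 || subLen > len)
        return -1;

    // Widen the first ASCII byte once so it can be compared with UTF-16 units.
    const char16_t first = static_cast<unsigned char>(subStr[0]);
    const char16_t* end = str + len;
    const char16_t* cursor = str;

    while (cursor < end)
    {
        // Cheap scan for the first character; returns nullptr when absent.
        cursor = std::char_traits<char16_t>::find(
            cursor, static_cast<std::size_t>(end - cursor), first);
        if (!cursor || end - cursor < subLen)
            break; // not found, or too little input left for a full match

        // Only now pay for the full comparison (index 0 already matched).
        bool match = true;
        for (int i = 1; i < subLen; ++i)
        {
            if (cursor[i] != static_cast<unsigned char>(subStr[i]))
            {
                match = false;
                break;
            }
        }
        if (match)
            return static_cast<int>(cursor - str);
        ++cursor;
    }
    return -1;
}
```

With this sketch, indexOfAscii(u"hello world", 11, "world", 5) yields 6, and a
search for a first character that never occurs ends after a single find() scan,
which is the "miss" case the commit message identifies as the hot path.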

