[SC] [Performance] Calc and SubTotal performance patch submission

Sun Sep 13 12:56:44 PDT 2015

Hi,

I would like to submit a set of patch i have written in order to improve
the performances of the SubTotal functionality.

1/ What is the context

Some people i work with are using the LO SubTotal functionality on very
large spreadsheets. Let say something like 50 columns and 100 000 rows.
We noticed that the performance of this function is really below
compared to the performances of other spreadsheets like Excel.

As an example, the computation of the subtotal for a single numeric
column on such a sheet takes more than two days on a modern box (quad
core and 8 gigs RAM) !

The code analysis i did have show that most of the performance is lost
because of useless data structure iteration, increasing  processing
time, and sometimes algorithm complexity. I have created two set of
patches to fix as much as i could. The first set applies on mdds library
and is already have been merged by Kohei (thanks !). Today i submit the
second set, applying on SC code.

2/ What are the benefits ?

Now speaking about numbers... here is a view of the execution time of
the different method before and after the patch was applied. Please keep
in mind that this is only relevant for SubTotal on very large sheets. Of
course it will benefits to others function, but since the gain is high
don't expect too much magic ;)

Inserting a row at the top of a document of about 100 000 rows
Method ScDocument::InsertRow

. current version    1206.8 ms
. patch mdds            561.8 ms
. patch mdds & lo     304.1 ms

The first benefits is clearly that it runs faster :)  There is another
benefit to the current situation, it is memory consumption. The mdds
patch reduce the memory footprint. For such a large files, memory used
goes from about 8gigs down to 1.6 gigs. Without tha patch, the SubTotal
cannot end because it keeps allocating memory during the computation.

I strongly suggest you give a try to this mdds for the next release.

On the LO side there is another issue that the second set of patches is
solving. The execution duration of several method is not constant and
depend on the number of rows stored in the structure. Which is not
really the issue in itself, method complexity is linear. The issue is
that the code was slow because of the data structures were fully
iterated in some method, even if there is nothing to do.

This may be negligible when working with small sheet, but when it comes
to a processing the insertion of tens of thousand lines in a very large
file, you can reach tens or hundreds of millions of vector iteration
only to check there is nothing to do.

I tried to remove the extra iterations, and here is the results. The
following numbers compares execution time between version with mdds
patches, and mdds + LO patches. Version without patch cannot do that
much iterations. Initial condition are still a 100 000 row documents.
Iteration number or "insertion count" refers to the number of iteration
in DoSubTotal method, and to the number of lines that have been added to
the original document.

It.            mdds            mdds + Lo

1             572 ms        304 ms
500         605 ms        306 ms
1 500      645 ms        312 ms
5 000      833 ms        362 ms
12 000    1246 ms      394 ms
20 000    1927 ms      451 ms
40 000    2685 ms      718 ms
70 000    4488 ms      830 ms

As you can notice, when the number of iterations increase, the execution
time is still linear to the number of iteration, but situation is really
better. There are two main method to optimize, and the remaining work
will be more tricky.

3/ So what the patch is doing ?

The analysis of time spent in each method show a few guilty methods. The
following durations show the time spent in several method called by the
ScDocument::InsertRow method (taken for the 50 000th row insertion, it
the middle of the processing and duration are close to the overall
average duration per row)

SetAutoCalc                                          0.004 ms
TestInsertRow loop                                118.759 ms
lcl_GetFirstTabRange                             0.054 ms
UpdateBroadcastAreas loop                 127.657 ms
Boucle while UpdateReference             1099.61 ms
SetNeedsListeningGroups                     0.048 ms
InsertRow loop on column                     708.855 ms
Boucle for UpdateDrawRef                    0.306 ms
StartNeededListeners                            500.723 ms
SetDirtyIfPostponed                               510.201 ms
BroadcastRecalcOnRefMoveHandler     509.798 ms
UpdateDirtyCharts : 0.004

Overall ScDocument::InsertRow execution time 3576.07 ms. More the 3.5
seconds to insert one row... looks like it needs a little rework here.

Having a look to every method i noticed that several of the method were
dealing with formula and iterating everytime and fully the list of cells
to apply formula processing. Even if the document has no formula... 

When close to iteration number 50 000, each column contains up to 30 000
blocks allocated in the mdds backend. Each time such a method is called,
it iterates the 30 000 blocks, for each row, for each columns, thats
millions of call to vector containing no formula at all.

So i tried to add a flag the the ScColumn used to prevent iteration on
the column data if there is no formula in this column. This seems to
work and the savings were really important. I write seems to, because i
really need some review on this patch :) Unit tests are ok, and my
personal tests are ok too, but it is such a large piece of software that
there may be cases i am not aware of.

Here are the new timings with the "formula flag" (only for long durations)

TestInsertRow loop                                118.759 ms    => no change
UpdateBroadcastAreas loop                 127.657 ms    => 0.027 ms
Boucle while UpdateReference             1099.61 ms    => 2.262 ms
InsertRow loop on column                     708.855 ms    => no change
StartNeededListeners                            500.723 ms    => 1.735 ms
SetDirtyIfPostponed                               510.201 ms    => 1.624 ms
BroadcastRecalcOnRefMoveHandler     509.798 ms    => 1.603 ms

The execution time goes down by about 2.7 secs for a single row (please
remeber there is about 100 000 to process...)

Talking about the implementation now... a flag has been added to the
ScColumn, and is called mbMayHaveFormula (please feel free to rename
it). It flags the fact that the column contains at least one cell with a
formula. First of all, i am aware that there is already an existing
method, but its implementation iterates the full sheet, testing each
cell to compare its type with the formula type. Which is exactly what we
want to get rid of.

This flag is set to true each time an operation of adding or cloning a
cell is done (same for cloning rows). And this flag is set to false when
a method that iterates all the formula cell of a column find no formula.
This switch to false is currently implemented only in StartListeners.
This is correct, since if the value is true even if there is no formula.
Having a true value instead of a false not an issue. The consequence
will only be that vectors will be iterated even if there is no formulas.

But....  Having a false value when the column contains formulas is
really bad. Functions won't be called.

And this is one of the subject i'd like a have a deep review please.

In the method implantation, the chance are really simple. I first check
the flag at the beginning, and return if needed. If and only if the
method does only formula processing. Which is the case of several
methods. This is done instead of iterating the full list of cells and
check cell's type against sc::element_type_formula.

4/ What's next

First of all i look forward for your review, comment, improvement, etc :)

Then if the general idea of this patch is accepted, i'll continue my
optimization, trying to reduce the execution time of the TestInsertRow
and InsertRow methods. The optimization is back to mdds, since it is
about the get_block_position and insert_empty methods.

5/ Gerrit references

Sorry i was not aware that all my local commits would become a distinct
gerrit entry... first time use :(  next time i'll group them

https://gerrit.libreoffice.org/18540
https://gerrit.libreoffice.org/18541
https://gerrit.libreoffice.org/18542
https://gerrit.libreoffice.org/18543
https://gerrit.libreoffice.org/18544
https://gerrit.libreoffice.org/18545
https://gerrit.libreoffice.org/18546
https://gerrit.libreoffice.org/18547
https://gerrit.libreoffice.org/18548
https://gerrit.libreoffice.org/18549
https://gerrit.libreoffice.org/18550
https://gerrit.libreoffice.org/18551
https://gerrit.libreoffice.org/18552
https://gerrit.libreoffice.org/18553

Cheers
W.

-- 
William                             http://www.wbonnet.net

http://france.debian.net            Association Debian France
http://www.opencsw.org              Community SoftWare for Solaris