I would like to report excessive memory consumption in Calc.

The issue happens with a spreadsheet containing a very high number of 
cells, when computing subtotals.

My document is a spreadsheet with about 100 columns and 100,000 rows. I
am trying to compute a single subtotal on column AA (numeric, using the
SUM function), grouped by a single text column (let's say H).

After loading the document, LibreOffice is using about 1.7 GB of memory.
When running the subtotal function, LibreOffice allocates more than
20 GB of additional RAM, and processing is extremely slow.

After analyzing the source code, it seems that the problem is located
in the MDDS template library. The FormulaCells are stored in
multi-type vectors whose internal blocks are resized during processing:
as new cells are instantiated, existing blocks are split into several
objects and their size decreases. (The call comes from
ScFormulaCell* ScColumn::SetFormulaCell in column3.cxx.)

Unfortunately, it relies on the STL's std::vector, which does not free
its memory when resized to a smaller size (presumably an optimization
that lets it grow again quickly without allocating memory from the OS).
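
To illustrate the std::vector behaviour described above, here is a
minimal standalone sketch. It is not MDDS or LibreOffice code; the
function names are made up for this example, and the element type
double is only an assumption to keep it simple.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: builds a vector with `rows` elements, resizes it
// down to a single element, and reports the capacity it still holds.
// resize() to a smaller size destroys elements but keeps the allocation.
std::size_t capacity_after_shrinking_resize(std::size_t rows)
{
    std::vector<double> block(rows);
    block.resize(1);
    return block.capacity();
}

// Same sequence, followed by shrink_to_fit(). This is a non-binding
// request to reduce capacity to size, so the result depends on the
// standard library implementation.
std::size_t capacity_after_shrink_to_fit(std::size_t rows)
{
    std::vector<double> block(rows);
    block.resize(1);
    block.shrink_to_fit();
    return block.capacity();
}
```

With rows = 100000, the first function still reports a capacity of at
least 100000 elements, even though only one element is in use.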

This could be fine, but the problem is that these objects are
initialized with a size equal to the number of rows (a bit more than
100,000 in my example) and then resized to 1. Since the memory is not
freed, each one holds on to about 800,000 bytes (100,000 elements *
sizeof(double), i.e. 8 bytes each).

This is really inefficient for this kind of algorithm, since each vector
resize ends up holding something like 800 KB of extra memory, which is
not released until the document is closed. Multiply this by the number
of times the processing loop iterates, and it reaches gigabytes of RAM
pretty fast :)
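
As a rough back-of-the-envelope check of these numbers, here is a small
sketch. The function name and the figure of 25,000 retained blocks are
hypothetical, chosen only to show the order of magnitude; I have not
measured the real iteration count.

```cpp
#include <cstdint>

// Hypothetical estimate: bytes kept alive if `blocks` vectors each
// retain a buffer sized for `rows` elements of type double.
constexpr std::uint64_t retained_bytes(std::uint64_t rows, std::uint64_t blocks)
{
    return rows * sizeof(double) * blocks;
}

// One block sized for 100,000 rows keeps 800,000 bytes
// (assuming sizeof(double) == 8, as on common platforms).
static_assert(sizeof(double) != 8 || retained_bytes(100000, 1) == 800000,
              "one retained block is about 800 KB");
```

Under that assumption, retained_bytes(100000, 25000) comes to
20,000,000,000 bytes, which is the same order of magnitude as the extra
20 GB observed.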

Even if it may look like a memory leak, it is not really one, since the
memory is released once the document is closed. The problem exists
in recent versions of LO, including master.

I am attaching to this bug entry a proposed patch that solves this
problem. A call to shrink_to_fit has been added to the resize_block
method. In order to limit the number of calls to this method, and to
avoid wasting too much time releasing memory, I only call it when the
current size is half of the capacity (real number of elements vs number of elements

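The idea behind the patch can be sketched roughly as follows. This is
not the attached patch nor the actual MDDS resize_block code; the helper
name is hypothetical, and the "at most half of the capacity" threshold
is my reading of the rule described above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper illustrating the heuristic: resize the block,
// then ask for unused memory to be released only when the block uses
// no more than half of its capacity. This avoids calling
// shrink_to_fit() (which may reallocate and copy) on every small resize.
template <typename T>
void resize_and_maybe_shrink(std::vector<T>& block, std::size_t new_size)
{
    block.resize(new_size);
    if (block.size() <= block.capacity() / 2)
        block.shrink_to_fit(); // non-binding request to the library
}
```

For example, a block created with 100,000 elements and resized to 1
would be shrunk, while a block of 100 elements resized to 80 would keep
its current allocation.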

