Patch to huge memory consumption in LO Calc
William Bonnet
wbonnet at linagora.com
Sat May 24 05:47:43 PDT 2014
Hi Kohei, hi all,
I would like to report very high memory consumption in Calc.
The issue occurs when computing subtotals on a spreadsheet containing a
very large number of cells.
My document is a spreadsheet with about 100 columns and 100,000 rows. I
am trying to compute a single subtotal on column AA (numeric type, using
the sum function), grouping on a single text column (let's say H).
After loading the document, LibreOffice uses about 1.7 GB of memory.
When running the subtotal function, LibreOffice allocates more than 20 GB
of additional RAM, and processing is extremely slow.
After analyzing the source code, it seems that the problem is located
in the MDDS template library. The formula cells are stored in
multi-type vectors which are resized during processing (their size
decreases as new cells are instantiated and some internal vector
blocks are split into several objects). The call comes from
ScFormulaCell* ScColumn::SetFormulaCell in column3.cxx.
Unfortunately, it relies on std::vector from the STL, which does not free
its memory when resized to a smaller size (this looks like an
optimization that lets it regrow quickly without requesting memory from
the OS again).
This could be fine, but the problem is that these objects are
initialized with a size equal to the number of rows (a bit more than
100 000 in my example) and then resized to 1. Since the memory is not
freed, each one holds on to about 800 000 bytes (100 000 elements *
sizeof(double), i.e. 8 bytes each).
This is really inefficient for this kind of algorithm, since each vector
resize leaves about 800 KB of extra memory allocated, which is not
released until the document is closed. Multiply this by the number of
times the processing loop iterates, and it reaches gigabytes of RAM
pretty fast :)
Even if it may look like a memory leak, it is not really one, since the
memory is released once the document is closed. The problem exists
in recent versions of LO, including master.
I am attaching a proposed patch which solves this problem. A call to
shrink_to_fit has been added in the resize_block method. To limit the
number of calls to this method, and to avoid wasting too much time
releasing memory, I only call it when the vector's current size is
half of its capacity (actual number of elements vs. number of elements
allocated).
Cheers
W.
--
William BONNET
Directeur Technique / CTO LINAGORA
Linagora 80 rue Roque de Fillol / Puteaux 92800 F
Tél. +33 (0)810 251 251
GSM +33 (0)689 376 977
Twitter @wbonnet
http://www.linagora.com/ | http://www.08000linux.com/
Discover OBM, the free (libre) messaging solution: http://www.obm.org/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: multi_type_vector_types.hpp.patch
Type: text/x-patch
Size: 715 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/libreoffice/attachments/20140524/13e70e4a/attachment.bin>