ScFormulaCell size foo ...

Michael Meeks michael.meeks at collabora.com
Wed Dec 31 02:48:48 PST 2014


Hi there,

	So - poking at some Massif data (kindly generated by Matus) for a large
Calc sheet - with ~300k rows containing ~11 (repeated) formulae in each
row - I was interested by the biggest (long term) allocation:

83.61% (1,732,208,505B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->20.54% (425,582,672B) 0x16CEE0F3: oox::xls::(anonymous namespace)::applyCellFormulas(ScDocumentImport&, oox::xls::(anonymous namespace)::CachedTokenArray&, SvNumberFormatter&, com::sun::star::uno::Sequence<com::sun::star::sheet::ExternalLinkInfo> const&, std::vector<oox::xls::FormulaBuffer::TokenAddressItem, std::allocator<oox::xls::FormulaBuffer::TokenAddressItem> > const&) (formulabuffer.cxx:210)
| ->20.54% (425,582,672B) 0x16CEE921: oox::xls::FormulaBuffer::finalizeImport() (formulabuffer.cxx:307)
|   ->20.54% (425,582,672B) 0x16D59131: oox::xls::WorkbookHelper::finalizeWorkbookImport() (workbookhelper.cxx:760)
|     ->20.54% (425,582,672B) 0x16D56250: oox::xls::WorkbookFragment::finalizeImport() (workbookfragment.cxx:494)

	Which is essentially 425MB of:

		new ScFormulaCell(...)

	I had a look at the size of ScFormulaCell which is 152 bytes
which breaks down thus (Linux / 64bit):

	ScFormulaCell		152 bytes
		SvtListener	 56 bytes
		ScFormulaResult	 16 bytes
		ScAddress	  8 bytes
		<fields>	 72 bytes.

	The SvtListener is over 1/3rd of that (56 of 152 bytes). Interestingly
we have ~2.8m of these ScFormulaCells in my sample.
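
	As a sanity check on those numbers, here is a small sketch that just
redoes the arithmetic at compile time. It measures nothing; the constants
are the sizes and the Massif byte count quoted above, and the names
(kCellSize, cellCount, listenerBytes) are made up for illustration. It
works out to 2,799,886 cells, with roughly 157MB of the total sitting in
the SvtListener part.

```cpp
#include <cassert>
#include <cstdint>

// Sizes quoted above for Linux / 64bit; copied from the mail, not measured here.
constexpr std::uint64_t kListener = 56;  // SvtListener
constexpr std::uint64_t kResult   = 16;  // ScFormulaResult
constexpr std::uint64_t kAddress  = 8;   // ScAddress
constexpr std::uint64_t kFields   = 72;  // remaining fields

constexpr std::uint64_t kCellSize = kListener + kResult + kAddress + kFields;
static_assert(kCellSize == 152, "parts should add up to the quoted size");

// Bytes attributed to 'new ScFormulaCell(...)' in the Massif output.
constexpr std::uint64_t kHeapBytes = 425582672;

// Number of cells implied by the allocation size.
constexpr std::uint64_t cellCount() { return kHeapBytes / kCellSize; }

// Bytes taken up by the SvtListener part across all of those cells.
constexpr std::uint64_t listenerBytes() { return cellCount() * kListener; }
```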

	I knocked up the attached patch - assuming that we don't really use
that Listener nearly as much with the new FormulaGroup listener magic -
but ... it turns out that this increases memory usage:

* before
000000000126d000 1758056K 1758056K 1758056K 1758056K      0K rw-p [heap]
0000000001254000 1757728K 1757728K 1757728K 1757728K      0K rw-p [heap]

* after
0000000001478000 1783812K 1783812K 1783812K 1783812K      0K rw-p [heap]
000000000167a000 1782268K 1782268K 1782268K 1782268K      0K rw-p [heap]

	Which is interesting. My hope is that as/when we use the formula-group
listener more effectively, that the ScFormulaCell listener will not
actually be used and this patch will start to have a useful effect :-) I
suspect that we are pushing an entry into it currently for every
single-cell reference.

	My hope is that having a single-cell group listener construct for this
would mean that my tweak might actually work & save a ton of that
duplication / memory allocation. But that's for the future I guess.
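
	To illustrate the trade-off, here is a hypothetical sketch of a
listener that allocates its storage lazily. This is not the attached
patch and not the real SvtListener; LazyListener and Broadcaster are
invented names. An idle listener costs a single pointer, but as soon as
one entry is pushed it pays for the pointer plus a separate heap block,
which is one way a change like this can end up using more memory when
nearly every cell listens to something.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Broadcaster {};  // hypothetical stand-in for whatever is listened to

// Hypothetical listener with lazily allocated storage (illustration only).
class LazyListener {
    // Null until the first startListening() call.
    std::unique_ptr<std::vector<Broadcaster*>> mpBroadcasters;

public:
    void startListening(Broadcaster& rBroadcaster) {
        if (!mpBroadcasters)
            mpBroadcasters = std::make_unique<std::vector<Broadcaster*>>();
        mpBroadcasters->push_back(&rBroadcaster);
    }

    // True once the extra heap block has been allocated.
    bool hasStorage() const { return static_cast<bool>(mpBroadcasters); }

    std::size_t count() const {
        return mpBroadcasters ? mpBroadcasters->size() : 0;
    }
};
```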

	Other things that look a bit wasteful are:

    ScFormulaCell*  pPrevious;
    ScFormulaCell*  pNext;
    ScFormulaCell*  pPreviousTrack;
    ScFormulaCell*  pNextTrack;

	At least one pair of these (IIRC the 'Track') are needed only
transiently during calculation as an append-only list and could be
reasonably easily replaced by a bool bit-field and a std::vector on the
ScDocument itself - which would save 48MB or so (on 64bit).
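
	A minimal sketch of that idea, assuming the track list really is
append-only and only drained at the end of a pass. Cell, Document,
bInTrack, maTrack, appendToTrack and drainTrack are hypothetical
stand-ins, not the real ScFormulaCell / ScDocument API:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for ScFormulaCell: one bit replaces the
// pPreviousTrack / pNextTrack pointer pair.
struct Cell {
    bool bInTrack : 1;
    Cell() : bInTrack(false) {}
};

// Hypothetical stand-in for ScDocument: owns the transient track list.
class Document {
    std::vector<Cell*> maTrack;  // append-only during a calculation pass

public:
    // Appends the cell unless it is already tracked; returns true if appended.
    bool appendToTrack(Cell& rCell) {
        if (rCell.bInTrack)
            return false;
        rCell.bInTrack = true;
        maTrack.push_back(&rCell);
        return true;
    }

    // Visits every tracked cell in insertion order, then empties the list.
    template <typename Func>
    void drainTrack(Func aFunc) {
        for (Cell* pCell : maTrack) {
            pCell->bInTrack = false;
            aFunc(*pCell);
        }
        maTrack.clear();
    }

    std::size_t trackSize() const { return maTrack.size(); }
};
```

	The vector only costs memory for cells that are actually tracked
during a pass. If anything needs to remove a cell from the middle of the
list then this simple form would not be enough.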

	Anyhow - just some thoughts =) I'm dropping this for now myself.

	ATB,

		Michael.

-- 
 michael.meeks at collabora.com  <><, Pseudo Engineer, itinerant idiot
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-svl-Make-SvtListener-more-efficient-when-it-has-no-l.patch
Type: text/x-patch
Size: 4720 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/libreoffice/attachments/20141231/bee48e04/attachment.bin>

