ScFormulaCell size foo ...
Michael Meeks
michael.meeks at collabora.com
Wed Dec 31 02:48:48 PST 2014
Hi there,
So - poking at some Massif data (kindly generated by Matus) for a large
Calc sheet - with ~300k rows containing ~11 (repeated) formulas in each
row - I was interested by the biggest (long-term) allocation:
83.61% (1,732,208,505B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->20.54% (425,582,672B) 0x16CEE0F3: oox::xls::(anonymous namespace)::applyCellFormulas(ScDocumentImport&, oox::xls::(anonymous namespace)::CachedTokenArray&, SvNumberFormatter&, com::sun::star::uno::Sequence<com::sun::star::sheet::ExternalLinkInfo> const&, std::vector<oox::xls::FormulaBuffer::TokenAddressItem, std::allocator<oox::xls::FormulaBuffer::TokenAddressItem> > const&) (formulabuffer.cxx:210)
| ->20.54% (425,582,672B) 0x16CEE921: oox::xls::FormulaBuffer::finalizeImport() (formulabuffer.cxx:307)
| ->20.54% (425,582,672B) 0x16D59131: oox::xls::WorkbookHelper::finalizeWorkbookImport() (workbookhelper.cxx:760)
| ->20.54% (425,582,672B) 0x16D56250: oox::xls::WorkbookFragment::finalizeImport() (workbookfragment.cxx:494)
Which is essentially 425MB of:
new ScFormulaCell(...)
I had a look at the size of ScFormulaCell which is 152 bytes
which breaks down thus (Linux / 64bit):
    ScFormulaCell        152 bytes
        SvtListener       56 bytes
        ScFormulaResult   16 bytes
        ScAddress          8 bytes
        <fields>          72 bytes
The SvtListener is over a third of that (56 of 152 bytes). Interestingly
we have ~2.8 million of these ScFormulaCells in my sample.
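As a back-of-the-envelope check, the figures above can be tied together in a few lines of C++. This is only a sketch: the constants are copied by hand from the breakdown and the Massif output rather than measured, the names are made up for illustration, and it assumes the 425,582,672 bytes are nothing but ScFormulaCell objects.

```cpp
#include <cassert>
#include <cstdint>

// Sizes quoted above for Linux / 64bit (hard-coded, not measured here).
constexpr std::uint64_t kCellBytes     = 152; // whole ScFormulaCell
constexpr std::uint64_t kListenerBytes = 56;  // SvtListener part
constexpr std::uint64_t kResultBytes   = 16;  // ScFormulaResult
constexpr std::uint64_t kAddressBytes  = 8;   // ScAddress
constexpr std::uint64_t kFieldBytes    = 72;  // remaining fields

// The parts should add up to the whole.
static_assert(kListenerBytes + kResultBytes + kAddressBytes + kFieldBytes
                  == kCellBytes,
              "breakdown does not sum to the object size");

// Bytes Massif attributes to the allocation in applyCellFormulas().
constexpr std::uint64_t kMassifBytes = 425582672;

// Number of cells implied by the Massif figure.
constexpr std::uint64_t cellCount() { return kMassifBytes / kCellBytes; }

// Total bytes spent on the SvtListener part across all those cells.
constexpr std::uint64_t listenerBytesTotal()
{
    return cellCount() * kListenerBytes;
}

// Sanity check: the Massif total should be a whole number of cells.
inline void checkFigures() { assert(kMassifBytes % kCellBytes == 0); }
```

Dividing the Massif figure by 152 gives about 2.8 million cells with no remainder, which matches the count above; the listener share then comes to roughly 157MB.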
I knocked up the attached patch - assuming that we don't really use
that Listener nearly as much with the new FormulaGroup listener magic -
but ... it turns out that this increases memory usage:
* before
000000000126d000 1758056K 1758056K 1758056K 1758056K 0K rw-p [heap]
0000000001254000 1757728K 1757728K 1757728K 1757728K 0K rw-p [heap]
* after
0000000001478000 1783812K 1783812K 1783812K 1783812K 0K rw-p [heap]
000000000167a000 1782268K 1782268K 1782268K 1782268K 0K rw-p [heap]
Which is interesting. My hope is that as/when we use the formula-group
listener more effectively, that the ScFormulaCell listener will not
actually be used and this patch will start to have a useful effect :-) I
suspect that we are pushing an entry into it currently for every
single-cell reference.
My hope is that having a single-cell group listener construct for this
would mean that my tweak might actually work & save a ton of that
duplication / memory allocation. But that's for the future I guess.
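A toy model shows how this kind of tweak can backfire. The sketch below is not the attached patch: it assumes the saving comes from allocating the listener's container only on first use, and EagerListener, LazyListener and Broadcaster are hypothetical stand-ins, with std::vector standing in for whatever container SvtListener really uses.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Broadcaster {}; // hypothetical stand-in

// Container lives inline: every listener pays for it, used or not.
struct EagerListener
{
    std::vector<Broadcaster*> broadcasters;

    void startListening(Broadcaster& b) { broadcasters.push_back(&b); }

    // Fixed per-object cost, ignoring the element storage.
    std::size_t fixedFootprint() const { return sizeof(*this); }
};

// Container allocated on first use: cheap while empty, but the
// pointer becomes pure overhead once an entry is registered.
struct LazyListener
{
    std::unique_ptr<std::vector<Broadcaster*>> broadcasters;

    void startListening(Broadcaster& b)
    {
        if (!broadcasters)
            broadcasters = std::make_unique<std::vector<Broadcaster*>>();
        assert(broadcasters);
        broadcasters->push_back(&b);
    }

    // Pointer plus the heap-allocated container header, if any.
    // Allocator bookkeeping would add to this and is not counted.
    std::size_t fixedFootprint() const
    {
        return sizeof(*this) + (broadcasters ? sizeof(*broadcasters) : 0);
    }
};
```

While a listener stays empty the lazy variant is smaller, but as soon as every cell registers even one entry it costs the extra pointer (plus an extra heap allocation) per cell, which is the direction of the before/after numbers above.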
Other things that look a bit wasteful are:
ScFormulaCell* pPrevious;
ScFormulaCell* pNext;
ScFormulaCell* pPreviousTrack;
ScFormulaCell* pNextTrack;
At least one pair of these (IIRC the 'Track' pair) is needed only
transiently during calculation as an append-only list, and could be
reasonably easily replaced by a bool bit-field and a std::vector on the
ScDocument itself - which would save 48MB or so (on 64bit).
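The bit-field plus vector idea could look roughly like this. It is a minimal sketch, not working LibreOffice code: Cell and Document are hypothetical stand-ins for ScFormulaCell and ScDocument, and all member names are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for ScFormulaCell: one bit replaces the
// pPreviousTrack / pNextTrack pointer pair.
struct Cell
{
    bool bInTrack : 1;
    Cell() : bInTrack(false) {}
};

// Stand-in for ScDocument: owns the transient, append-only track list.
class Document
{
    std::vector<Cell*> maTrack;

public:
    // Append a cell once; the flag makes repeat appends a no-op.
    void appendToTrack(Cell& rCell)
    {
        if (rCell.bInTrack)
            return;
        rCell.bInTrack = true;
        maTrack.push_back(&rCell);
    }

    // Visit every tracked cell, then leave the list empty. The list is
    // swapped out first so fn may safely append cells for a later pass.
    template <typename Fn> void processTrack(Fn fn)
    {
        std::vector<Cell*> aPending;
        aPending.swap(maTrack);
        for (Cell* pCell : aPending)
        {
            assert(pCell->bInTrack);
            pCell->bInTrack = false;
            fn(*pCell);
        }
    }

    std::size_t trackSize() const { return maTrack.size(); }
};

// Bytes saved by dropping one pointer pair from every cell.
constexpr unsigned long long pointerBytesSaved(unsigned long long nCells)
{
    return nCells * 2 * sizeof(void*);
}
```

With 8-byte pointers and ~2.8 million cells, dropping one pointer pair comes to roughly 45MB, in the same ballpark as the estimate above. The vector only costs memory while cells are actually being tracked.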
Anyhow - just some thoughts =) I'm dropping this for now myself.
ATB,
Michael.
--
michael.meeks at collabora.com <><, Pseudo Engineer, itinerant idiot
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-svl-Make-SvtListener-more-efficient-when-it-has-no-l.patch
Type: text/x-patch
Size: 4720 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/libreoffice/attachments/20141231/bee48e04/attachment.bin>