How to measure effects of OUString::intern ?
michael.meeks at suse.com
Mon Jul 29 09:53:21 PDT 2013
On Mon, 2013-07-29 at 18:35 +0200, Mark Wielaard wrote:
> I was looking at memory usage and noticed OUString::intern being used
> in ZipFile::readCEN. This was introduced a long time ago, so I was
> wondering if it is still beneficial:
> I couldn't immediately find the duplication of the names.
> In this case the strings are the full zip file entry paths. e.g.
Riight - that's interesting :-) IIRC in the past there were two chunks
of code in package/ that duplicated those names (I think). The fragment
from the (AMD) report from December 2006 shows:
'package' zip code
reading the large images.zip file creates a huge hash
table with lots of duplicated string stems – 3 days
Of course, I couldn't tell you if this is still the case; possibly
we're no longer duplicating those strings in that way. The problem was
around 'images.zip' - the archive that has all of our icons in it for
the UI - at least back ~7 years ago ;-)
> And as far as I can see all the full path names are unique, so no
> actual sharing is taking place here. But is there a place where these
> strings are reused (and also interned)?
Interesting; of course - we can dump the contents of the interned table
to see if they have ref-count 1 quite simply (?).
> Replacing the intern with a normal OUString constructor like:
> Seems to save ~200K of memory at least for a quick:
Nice :-) well - we should just do that then :-)
> But that might be too quick to see any effects of this intern action.
The reason it was added was for images.zip - if the package code has
improved then we should take & save that space/time.
> So I guess my general question is how to measure the effects of
I'd dump the ref-count + string contents of the intern table to see if
there is more wastage.
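As a rough illustration of what such a dump would tell us, here is a toy
intern pool in plain C++. It is only a sketch: InternPool, dump and
countUnshared are made-up names, and std::shared_ptr stands in for the
real ref-counting, so none of this is the actual sal/rtl code. The idea
is simply that an entry with a single user outside the table gained
nothing from being interned.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-in for an intern table (hypothetical, not the sal/rtl code).
class InternPool {
public:
    using Handle = std::shared_ptr<const std::string>;

    // Return the shared copy of s, creating it on first use.
    Handle intern(const std::string& s) {
        auto it = table_.find(s);
        if (it != table_.end())
            return it->second;
        Handle h = std::make_shared<std::string>(s);
        table_.emplace(s, h);
        return h;
    }

    // Print "<users>\t<string>" per entry and return how many entries
    // have at most one user outside the pool, i.e. no sharing happened.
    std::size_t dump(std::ostream& os) const {
        std::size_t unshared = 0;
        for (const auto& entry : table_) {
            assert(entry.second.use_count() >= 1);
            // Subtract the reference held by the pool itself.
            const long users = entry.second.use_count() - 1;
            os << users << '\t' << entry.first << '\n';
            if (users <= 1)
                ++unshared;
        }
        return unshared;
    }

private:
    std::unordered_map<std::string, Handle> table_;
};

// Intern every name, keep the handles alive, then dump the pool.
std::size_t countUnshared(const std::vector<std::string>& names,
                          std::ostream& os) {
    InternPool pool;
    std::vector<InternPool::Handle> handles;
    for (const auto& n : names)
        handles.push_back(pool.intern(n));
    return pool.dump(os);
}
```

Feeding countUnshared a list of all-unique entry paths reports every
entry as unshared, which is the situation described for readCEN above;
a list with repeated names shows a user count above 1 for the repeats.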
You saw the OUString debugging code: RTL_LOG_STRING_NEW /
_STRING_DELETE etc. that can produce a long but crunch-able set of
printfs on stdout: many of which are sadly not that useful due to
OUStringBuffer mutation (IIRC - but presumably some more work could
clean that up).
michael.meeks at suse.com <><, Pseudo Engineer, itinerant idiot