How to measure effects of OUString::intern ?

Michael Meeks michael.meeks at suse.com
Mon Jul 29 09:53:21 PDT 2013


Hi Mark,

On Mon, 2013-07-29 at 18:35 +0200, Mark Wielaard wrote:
> I was looking at memory usage and noticed OUString::intern being used
> in ZipFile::readCEN. This was introduced a long time ago, so I was
> wondering if it is still beneficial:

	:-)

> I couldn't immediately find the duplication of the names.
> In this case the strings are the full zip file entry paths. e.g.
> "sw/res/sidebar/pageproppanel/portraitcopy_24x24.png"

	Riight - that's interesting :-) IIRC in the past there were two chunks
of code in package/ that duplicated those names (I think). The fragment
from the (AMD) report from December 2006 shows:

	'package' zip code 
		-1022k
		+500k
	reading the large images.zip file creates a huge hash
	table with lots of duplicated string stems – 3 days

	Of course, I couldn't tell you if this is still the case; possibly
we're no longer duplicating those strings in that way. The problem was
around 'images.zip' - the archive that has all of our icons in it for
the UI - at least back ~7 years ago ;-)

> And as far as I can see all the full path names are unique, so no
> actual sharing is taking place here. But is there a place where these
> strings are reused (and also interned)?

	Interesting; of course - we can dump the contents of the interned table
to see if they have ref-count 1 quite simply (?). 

> Replacing the intern with a normal OUString constructor like:
...
> Seems to save ~200K of memory at least for a quick:

	Nice :-) well - we should just do that then :-)

> But that might be too quick to see any effects of this intern action.

	The reason it was added was for images.zip - if the package code has
improved then we should take & save that space/time.

> So I guess my general question is how to measure the effects of
> OUString::intern?

	I'd dump the ref-count + string contents of the intern table to see if
there is more wastage.
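
	As a rough illustration of what such a dump lets you compute, here is a
minimal sketch. It is only a model: InternStats and its methods are
hypothetical names, not part of sal/rtl, and it counts intern requests per
string rather than reading the real rtl_uString reference counts (which
also go up whenever an OUString is copied). It counts characters only and
ignores per-string header overhead.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <string>
#include <unordered_map>

// Hypothetical model of an intern table: string contents -> number of
// times that content was requested. Not the LibreOffice implementation.
class InternStats {
public:
    // Record one request to intern `s`.
    void intern(const std::string& s) { ++counts_[s]; }

    // Number of distinct strings in the table.
    std::size_t size() const { return counts_.size(); }

    // Entries requested exactly once: interning them shared nothing,
    // so the table entry is pure overhead.
    std::size_t unshared() const {
        std::size_t n = 0;
        for (const auto& kv : counts_)
            if (kv.second == 1) ++n;
        return n;
    }

    // Characters that did not need a second copy thanks to sharing:
    // (requests - 1) * length for every entry.
    std::size_t charsSaved() const {
        std::size_t saved = 0;
        for (const auto& kv : counts_) {
            assert(kv.second >= 1);  // every entry was requested at least once
            saved += (kv.second - 1) * kv.first.size();
        }
        return saved;
    }

    // Write "count<TAB>string" lines, in unspecified order, so the
    // result can be sorted and crunched with the usual text tools.
    void dump(std::ostream& os) const {
        for (const auto& kv : counts_)
            os << kv.second << '\t' << kv.first << '\n';
    }

private:
    std::unordered_map<std::string, std::size_t> counts_;
};
```

	If unshared() comes out close to size() and charsSaved() is small,
interning the names is not buying anything, which would match the
observation that the full entry paths are all unique.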

	You saw the OUString debugging code: RTL_LOG_STRING_NEW /
_STRING_DELETE etc. that can produce a long but crunch-able set of
printfs on stdout: many of which are sadly not that useful due to
OUStringBuffer mutation (IIRC - but presumably some more work could
clean that up).

	Thanks !

		Michael.

-- 
michael.meeks at suse.com  <><, Pseudo Engineer, itinerant idiot


