Performance issues with our internal memory allocator
markus.mohrhard at googlemail.com
Mon Sep 28 02:11:04 PDT 2015
On Mon, Sep 28, 2015 at 8:25 AM, Markus Mohrhard <
markus.mohrhard at googlemail.com> wrote:
> On Sat, Sep 26, 2015 at 11:39 PM, Michael Meeks <
> michael.meeks at collabora.com> wrote:
>> Hi Markus,
>> On Sat, 2015-09-26 at 20:51 +0200, Markus Mohrhard wrote:
>> > so we have been running our in-build performance tests now for a few
>> > weeks and recently discovered that our internal memory allocator is
>> > causing spikes in the runtime.
>> What fun =) the irony is that it was written to avoid exactly such
>> spikes (which were primarily on Windows) ;-) Thanks for finding this
>> one !
>> > It became even worse during the weekend with the tests taking 200
>> > times the instructions. Most of it seems to be spent in our memory
>> > handling code and not really in the actual code. (see for example
>> > with the annotated callgrind output at http://pastebin.com/ELC64s1n).
>> > We had a profile that showed the issue inside of the memory allocator
>> > much better but I have to find it again.
>> Interesting. We found a really silly one in the ::Interpret just
>> recently, and have a simple fix - that could cause some excessive
>> allocation - but I forget if Tor merged that to master (yet).
>> TBH - I find using kcachegrind -incredibly- more useful than the
>> annotated output above.
> Me too. The annotated output is what we currently get from the performance
> testing in jenkins. So they are much better than nothing as they allow us
> to look into the past by just inspecting the build logs. I'm already
> incredibly thankful to Norbert for making them available as it allows me to
> see what is going on in the VM.
>> > Is the internal memory allocator really still useful despite showing
>> > sometimes really bad behavior ?
>> I'd say not myself. My hope is that the Windows allocator has also had
>> some work done on it since ~2005? when the issue was worked around by
>> > Personally I would just fall back to the system memory allocator
>> > except for the few cases where we know that it makes a difference
>> > (small memory blocks in calc formula tokens, ...)
>> Right - it should be far quicker, particularly on Linux.
>> I'd love to see how a change like that impacts the profiles; worth
>> committing to master and a quick revert later if there is a visible issue
>> anywhere I guess =)
> I have just committed such a change. We only have "reliable" data for
> linux but if we see some huge improvement there we should at least consider
> keeping it enabled on linux. The other idea that I had is that it is
> related to using swap as we were surprisingly close to the RAM limit on the
> VM. However after discussing this idea with Norbert I'm no longer sure if
> using the swap would result in a changed callgrind IR count.
So it seems that the system allocator is actually even worse as can be seen
I'll leave it in for a bit as I want to see what happens if we increase the
memory for the VM but I fear that the numbers show that the internal
allocator is actually useful.
>> Then again - I think we're going to need a custom allocator of some
>> kind (though prolly rather slow & dumb) for LibreOfficeKit
>> pre-initialization for cloudy bits - so; perhaps that allocator may be
>> useful in the end temporarily for that.
> Of course there are a few places where we need a custom allocator but if
> it really performs badly we might want to limit these places.
>> michael.meeks at collabora.com <><, Pseudo Engineer, itinerant idiot