Performance issues with our internal memory allocator

Markus Mohrhard markus.mohrhard at googlemail.com
Sun Sep 27 23:25:05 PDT 2015


Hey,

On Sat, Sep 26, 2015 at 11:39 PM, Michael Meeks <michael.meeks at collabora.com> wrote:

> Hi Markus,
>
> On Sat, 2015-09-26 at 20:51 +0200, Markus Mohrhard wrote:
> > so we have been running our in-build performance tests now for a few
> > weeks and recently discovered that our internal memory allocator is
> > causing spikes in the runtime.
>
>         What fun =) the irony is that it was written to avoid exactly such
> spikes (which were primarily on Windows) ;-) Thanks for finding this
> one !
>
> >  It became even worse during the weekend with the tests taking 200
> > times the instructions. Most of it seems to be spent in our memory
> > handling code and not really in the actual code. (see for example
> > http://perf.libreoffice.org/perf_html/ftest_of_cppu_sc_on_vm139.details.html
> > with the annotated callgrind output at http://pastebin.com/ELC64s1n).
> > We had a profile that showed the issue inside of the memory allocator
> > much better but I have to find it again.
>
>         Interesting. We found a really silly one in the ::Interpret just
> recently, and have a simple fix - that could cause some excessive
> allocation - but I forget if Tor merged that to master (yet).
>
>         TBH - I find using kcachegrind -incredibly- more useful than the
> annotated output above.
>

Me too. The annotated output is what we currently get from the performance
testing in Jenkins. It is much better than nothing, as it allows us to look
into the past by just inspecting the build logs. I'm already incredibly
thankful to Norbert for making it available, as it lets me see what is
going on on the VM.


> >
> > Is the internal memory allocator really still useful despite showing
> > sometimes really bad behavior ?
>
>         I'd say not myself. My hope is that the Windows allocator has also
> had some work done on it since ~2005? when the issue was worked around by
> mhu.
>
> >  Personally I would just fall back to the system memory allocator
> > except for the few cases where we know that it makes a difference
> > (small memory blocks in calc formula tokens, ...)
>
>         Right - it should be far quicker, particularly on Linux.
>
>         I'd love to see how a change like that impacts the profiles; worth a
> commit to master and a quick revert later if there is a visible issue
> anywhere I guess =)
>

I have just committed such a change. We only have "reliable" data for Linux,
but if we see a huge improvement there we should at least consider keeping
it enabled on Linux. The other idea I had is that the spikes are related to
swapping, as we were surprisingly close to the RAM limit on the VM. However,
after discussing this idea with Norbert I'm no longer sure whether using
swap would change the callgrind Ir (instruction) count.


>
>         Then again - I think we're going to need a custom allocator of some
> kind (though prolly rather slow & dumb) for LibreOfficeKit
> pre-initialization for cloudy bits - so; perhaps that allocator may be
> useful in the end temporarily for that.
>

Of course there are a few places where we need a custom allocator but if it
really performs badly we might want to limit these places.
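
To make the "system allocator by default, custom allocator only for the
small-block cases" idea concrete, here is a rough sketch of what such a
limited allocator could look like. It is only an illustration and not our
actual rtl allocator code: the class name SmallBlockPool, its methods and
the chunk size are all made up for this mail. Requests up to one block size
are served from a free list carved out of larger chunks; anything bigger
goes straight to malloc/free.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical sketch: a fixed-size block pool for small objects that
// falls back to the system allocator for everything larger.
// Not thread-safe; a real version would need locking or per-thread pools.
class SmallBlockPool {
public:
    explicit SmallBlockPool(std::size_t blockSize, std::size_t blocksPerChunk = 256)
        : m_blockSize(roundUp(blockSize))
        , m_blocksPerChunk(blocksPerChunk ? blocksPerChunk : 1) {}

    SmallBlockPool(const SmallBlockPool&) = delete;
    SmallBlockPool& operator=(const SmallBlockPool&) = delete;

    ~SmallBlockPool() {
        for (void* chunk : m_chunks) std::free(chunk);
    }

    void* allocate(std::size_t size) {
        if (size > m_blockSize) return std::malloc(size); // large: system allocator
        if (!m_free) refill();
        assert(m_free != nullptr);
        Node* n = m_free;       // pop the first free block
        m_free = n->next;
        return n;
    }

    // The caller passes the size back so we know which path was used.
    void deallocate(void* p, std::size_t size) {
        if (!p) return;
        if (size > m_blockSize) { std::free(p); return; }
        m_free = new (p) Node{m_free}; // push the block back on the free list
    }

    std::size_t chunkCount() const { return m_chunks.size(); }

private:
    struct Node { Node* next; };

    // Blocks must be able to hold a free-list node and stay suitably aligned.
    static std::size_t roundUp(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        if (n < sizeof(Node)) n = sizeof(Node);
        return (n + a - 1) / a * a;
    }

    // Grab one more chunk from the system and thread its blocks onto the list.
    void refill() {
        m_chunks.reserve(m_chunks.size() + 1); // so push_back below cannot throw
        char* chunk = static_cast<char*>(std::malloc(m_blockSize * m_blocksPerChunk));
        if (!chunk) throw std::bad_alloc();
        m_chunks.push_back(chunk);
        for (std::size_t i = 0; i < m_blocksPerChunk; ++i)
            m_free = new (chunk + i * m_blockSize) Node{m_free};
    }

    std::size_t m_blockSize;
    std::size_t m_blocksPerChunk;
    Node* m_free = nullptr;
    std::vector<void*> m_chunks;
};
```

The point of the design is that allocate and deallocate are a couple of
pointer operations for small blocks, and the pool never touches large
allocations at all, so any bad behavior would be confined to the places
that opt in.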

Markus

>
>         ATB,
>
>                 Michael.
>
> --
>  michael.meeks at collabora.com  <><, Pseudo Engineer, itinerant idiot
>
>