<div dir="ltr">Hey,<br><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Sep 28, 2015 at 8:25 AM, Markus Mohrhard <span dir="ltr"><<a href="mailto:markus.mohrhard@googlemail.com" target="_blank">markus.mohrhard@googlemail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hey,<br><div class="gmail_extra"><br><div class="gmail_quote"><span class="">On Sat, Sep 26, 2015 at 11:39 PM, Michael Meeks <span dir="ltr"><<a href="mailto:michael.meeks@collabora.com" target="_blank">michael.meeks@collabora.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Markus,<br>
<span><br>
On Sat, 2015-09-26 at 20:51 +0200, Markus Mohrhard wrote:<br>
> so we have been running our in-build performance tests now for a few<br>
> weeks and recently discovered that our internal memory allocator is<br>
> causing spikes in the runtime.<br>
<br>
</span> What fun =) the irony is that it was written to avoid exactly such<br>
spikes (which were primarily on Windows) ;-) Thanks for finding this<br>
one !<br>
<span><br>
> It became even worse during the weekend with the tests taking 200<br>
> times the instructions. Most of it seems to be spent in our memory<br>
> handling code and not really in the actual code. (see for example<br>
> <a href="http://perf.libreoffice.org/perf_html/ftest_of_cppu_sc_on_vm139.details.html" rel="noreferrer" target="_blank">http://perf.libreoffice.org/perf_html/ftest_of_cppu_sc_on_vm139.details.html</a> with the annotated callgrind output at <a href="http://pastebin.com/ELC64s1n" rel="noreferrer" target="_blank">http://pastebin.com/ELC64s1n</a>).<br>
> We had a profile that showed the issue inside of the memory allocator<br>
> much better but I have to find it again.<br>
<br>
</span> Interesting. We found a really silly one in the ::Interpret just<br>
recently, and have a simple fix - that could cause some excessive<br>
allocation - but I forget if Tor merged that to master (yet).<br>
<br>
TBH - I find using kcachegrind -incredibly- more useful than the<br>
annotated output above.<br></blockquote><div><br></div></span><div>Me too. The annotated output is what we currently get from the performance testing in Jenkins. It is much better than nothing, as it lets us look into the past just by inspecting the build logs. I'm already incredibly thankful to Norbert for making it available, as it allows me to see what is going on on the VM.<br> <br></div><span class=""><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<span>><br>
> Is the internal memory allocator really still useful despite showing<br>
> sometimes really bad behavior ?<br>
<br>
</span> I'd say not myself. My hope is that the Windows allocator has also had<br>
some work done on it since ~2005? when the issue was worked around by<br>
mhu.<br>
<span><br>
> Personally I would just fall back to the system memory allocator<br>
> except for the few cases where we know that it makes a difference<br>
> (small memory blocks in calc formula tokens, ...)<br>
<br>
</span> Right - it should be far quicker, particularly on Linux.<br>
<br>
I'd love to see how a change like that impacts the profiles; worth a<br>
commit to master and a quick revert later if there is a visible issue<br>
anywhere I guess =)<br></blockquote><div><br></div></span><div>I have just committed such a change. We only have "reliable" data for Linux, but if we see a huge improvement there we should at least consider keeping it enabled on Linux. The other idea that I had is that the spikes are related to swapping, as we were surprisingly close to the RAM limit on the VM. However, after discussing this idea with Norbert, I'm no longer sure whether swapping would change the callgrind instruction (Ir) count.<br></div></div></div></div></blockquote><div><br><br></div><div>So it seems that the system allocator is actually even worse, as can be seen at <a href="http://perf.libreoffice.org/perf_html/suite_cppu_sc.html">http://perf.libreoffice.org/perf_html/suite_cppu_sc.html</a><br><br><br></div><div>I'll leave it in for a bit as I want to see what happens if we increase the memory for the VM, but I fear that the numbers show that the internal allocator is actually useful. <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> <br></div><span class=""><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
Then again - I think we're going to need a custom allocator of some<br>
kind (though prolly rather slow & dumb) for LibreOfficeKit<br>
pre-initialization for cloudy bits - so; perhaps that allocator may be<br>
useful in the end temporarily for that.<br></blockquote><div><br></div></span><div>Of course there are a few places where we need a custom allocator but if it really performs badly we might want to limit these places.<span class=""><font color="#888888"><br><br></font></span></div><span class=""><font color="#888888"><div>Markus <br></div></font></span><span class=""><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
ATB,<br>
<br>
Michael.<br>
<span><font color="#888888"><br>
--<br>
<a href="mailto:michael.meeks@collabora.com" target="_blank">michael.meeks@collabora.com</a> <><, Pseudo Engineer, itinerant idiot<br>
<br>
</font></span></blockquote></span></div><br></div></div>
</blockquote></div><br></div></div>