Exception cost on string allocations ...
sbergman at redhat.com
Fri Mar 2 01:14:07 PST 2012
On 02/29/2012 05:29 PM, Michael Meeks wrote:
> So - I did some more analysis, and re-compiled all of LibreOffice both
> with and without the attached patch - analysing all 287 shared libraries
> we get:
> $ du -b -c ../../without-stripped/program/*.so | tail
> 171373684 total
> $ du -b -c ../../with-stripped/program/*.so | tail
> 174692100 total
> ie. we have as a lower bound a 1.8% size saving. I expect that to grow
> as we start to LTO - as we can finally propagate the fact that methods
> don't throw exceptions across modules, and across libraries.
I wouldn't expect too much benefit from a potentially better nothrow
analysis. Looking at the C++ standard library, there are lots of
functions that can throw, not just new. And using exceptions instead of
error return codes can also improve performance, by removing the need
for explicit return code checks in the code. (Not to mention that using
exceptions, especially ones like std::bad_alloc that are typically
intended to just abort your application, can simplify the source code,
something that is IMO of equal importance to raw code size/performance.)
> Now - the immediate, and expected riposte is that ~2% of our size is
> insignificant ( to go along with the 3% of our CPU time at startup
> parsing labels we mostly don't need ;-). Personally, I don't think the
> approach of ignoring any efficiency win as insignificant if it is less
> than 10% across a code-base of our size is sustainable in the long
> term :-) the more we ignore the more it adds up.
> So here are the facts, correct me if I'm wrong:
> * often std::bad_alloc is not handled, making the program
> abort, the exception is not caught by main - there is no
> autosave and data is lost - ie. a SEGV would be preferable.
That's wrong (as we already discussed elsewhere), an uncaught exception
(leading to std::terminate -> std::abort -> SIGABRT) behaves virtually
the same as SIGSEGV wrt our crash handling.
> * often we fail to emit std::bad_alloc in out of memory
Yes, our code base contains errors. Lots of them. ;)
> conditions, cf. sal/inc/rtl/allocator.hxx etc. etc.
Cf. the comment at Allocator::allocate. That code probably predates the
invention of the #if defined EXCEPTIONS_OFF workaround used elsewhere.
> * it is unarguably the case that there is no consistent
> bad_alloc handling, it is random, ill thought through and
> anything less than aborting that complete document load/save/
> calculate operation will almost certainly lead to (silent)
> data loss / document errors.
Sure, the current code base, grown over decades and written by people
with substantially diverging viewpoints, is a mess. One can only hope
that exceptions like std::bad_alloc consistently make it to
std::terminate without "helpfully" being caught by some clueless catch
(...) before, so that they can lead to a nice SIGABRT and following
crash handling (see above).
So, pragmatically, reacting on rtl_allocateMemory() == 0 with std::abort
is probably ~just as good as reacting with std::bad_alloc (as I already
said in this thread). What I object to is simply ignoring the return
value of rtl_allocateMemory, and relying on some subsequent pointer
dereferencing leading to SIGSEGV (as it complicates crash analysis,
gives a wrong impression about our competence, etc.).
(IMO, the advantage of using std::bad_alloc over std::abort in low level
building blocks like rtl::OUString is that it /potentially/ allows the
call site to decide whether or not to react on it, cf. Caolán's example.
But, indeed, that's probably a minor point as long as our overall code
base has more pressing problems, and one that can eventually be corrected.)
> * the user experience of: "we're out of memory, by the way we
> prolly lost some of your document, please try to save,
> re-start compare the data etc." is insignificantly less
> painful than taking a crash.
Nobody suggested that OOM should lead to such a more specific user
experience than your garden-variety crash. (Again: the std::bad_alloc
is intended for consumption by clueful-enough layers -- and whether
there can be such, at least in current LO, is of course debatable. All
others shall let it pass, leading to SIGABRT and normal crash handling,
> * in cases where we have a reasonable expectation of an
> allocation failing and the ability to actually do something
> meaningful: loading images perhaps, where we could swapout
> other images, we can (and do) call the allocation ourself
> and handle return values.
> * IMHO for everything else, we just waste 2% of our size, and
> some CPU cost for no good reason at all, and doubly so for
> the case of OUString("foo") const-strings that we allocate
> internally that cannot be that large.
> Ergo, I suggest we discuss this at the ESC tomorrow, with a view to a
> pragmatic, top-to-bottom solution that actually has a chance of working
> I've added it to the agenda.
Ah, I should have read this before the call. ;)
More information about the LibreOffice