Exception cost on string allocations ...

Fri Mar 2 01:14:07 PST 2012

On 02/29/2012 05:29 PM, Michael Meeks wrote:
> 	So - I did some more analysis, and re-compiled all of LibreOffice both
> with and without the attached patch - analysing all 287 shared libraries
> we get:
>
> $ du -b -c ../../without-stripped/program/*.so  | tail
> ...
> 171373684	total
> $ du -b -c ../../with-stripped/program/*.so  | tail
> ...
> 174692100	total
> $
>
> 	ie. we have as a lower bound a 1.8% size saving. I expect that to grow
> as we start to LTO - as we can finally propagate the fact that methods
> don't throw exceptions across modules, and across libraries.

I wouldn't expect too much benefit from a potentially better nothrow 
analysis.  Looking at the C++ standard library, there are lots of 
functions that can throw, not just new.  And using exceptions instead of 
error return codes can also improve performance, by removing the need 
for explicit return code checks in the code.  (Not to mention that using 
exceptions, especially ones like std::bad_alloc that are typically 
intended to just abort your application, can simplify the source code, 
something that is IMO of equal importance to raw code size/performance.)

> 	Now - the immediate, and expected riposte is that ~2% of our size is
> insignificant ( to go along with the 3% of our CPU time at startup
> parsing labels we mostly don't need ;-). Personally, I don't think the
> approach of ignoring any efficiency win as insignificant if it is less
> than 10% across a code-base of our size is sustainable in the long
> term :-) the more we ignore the more it adds up.
>
> 	So here are the facts, correct me if I'm wrong:
>
> 	* often std::bad_alloc is not handled, making the program
> 	  abort, the exception is not caught by main - there is no
> 	  autosave and data is lost - ie. a SEGV would be preferable.

That's wrong (as we already discussed elsewhere), an uncaught exception 
(leading to std::terminate -> std::abort -> SIGABRT) behaves virtually 
the same as SIGSEGV wrt our crash handling.

> 	* often we fail to emit std::bad_alloc in out of memory

Yes, our code base contains errors.  Lots of them.  ;)

> 	  conditions, cf. sal/inc/rtl/allocator.hxx etc. etc.

Cf. the comment at Allocator::allocate.  That code probably predates the 
invention of the #if defined EXCEPTIONS_OFF workaround used elsewhere.

> 	* it is unarguably the case that there is no consistent
> 	  bad_alloc handling, it is random, ill thought through and
> 	  anything less than aborting that complete document load/save/
> 	  calculate operation will almost certainly lead to (silent)
> 	  data loss / document errors.

Sure, the current code base, grown over decades and written by people 
with substantially diverging viewpoints, is a mess.  One can only hope 
that exceptions like std::bad_alloc consistently make it to 
std::terminate without "helpfully" being caught by some clueless catch 
(...) before, so that they can lead to a nice SIGABRT and following 
crash handling (see above).

So, pragmatically, reacting on rtl_allocateMemory() == 0 with std::abort 
is probably ~just as good as reacting with std::bad_alloc (as I already 
said in this thread).  What I object to is simply ignoring the return 
value of rtl_allocateMemory, and relying on some subsequent pointer 
dereferencing leading to SIGSEGV (as it complicates crash analysis, 
gives a wrong impression about our competence, etc.).

(IMO, the advantage of using std::bad_alloc over std::abort in low level 
building blocks like rtl::OUString is that it /potentially/ allows the 
call site to decide whether or not to react on it, cf. Caolán's example. 
  But, indeed, that's probably a minor point as long as our overall code 
base has more pressing problems, and one that can eventually be corrected.)

> 	* the user experience of: "we're out of memory, by the way we
> 	  prolly lost some of your document, please try to save,
> 	  re-start compare the data etc." is insignificantly less
> 	  painful than taking a crash.

Nobody suggested that OOM should lead to such a more specific user 
experience than your garden-variety crash.  (Again: the std::bad_alloc 
is intended for consumption by clueful-enough layers -- and whether 
there can be such, at least in current LO, is of course debatable.  All 
others shall let it pass, leading to SIGABRT and normal crash handling, 
see above.)

> 	* in cases where we have a reasonable expectation of an
> 	  allocation failing and the ability to actually do something
> 	  meaningful: loading images perhaps, where we could swapout
> 	  other images, we can (and do) call the allocation ourself
> 	  and handle return values.
>
> 	* IMHO for everything else, we just waste 2% of our size, and
> 	  some CPU cost for no good reason at all, and doubly so for
> 	  the case of OUString("foo") const-strings that we allocate
> 	  internally that cannot be that large.
>
> 	Ergo, I suggest we discuss this at the ESC tomorrow, with a view to a
> pragmatic, top-to-bottom solution that actually has a chance of working
> well.
>
> 	I've added it to the agenda.

Ah, I should have read this before the call.  ;)

Stephan