Checking string allocations (was Re: String literals, ASCII vs UTF-8)

Mon Mar 5 09:48:55 PST 2012

On Mon, 2012-03-05 at 16:34 +0000, Caolán McNamara wrote:
> In the general case presumably you then need to run around sticking
> throw()/nothrow onto loads of things in order to tell the compiler that
> stuff isn't going to throw exceptions, and/or disabling exceptions to
> get the compiler to do something different.

	Sure - but LTO does that propagation for us automatically: it churns
through the code, annotating each function with the properties it
actually provides - eg. 'pure', 'nothrow' etc. Of course, if our most
basic, low-level classes look suspicious, and could do anything at any
moment, then it can't get very far really ;-)
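
	As a rough source-level analogy of that propagation (a sketch only:
LTO infers these properties itself, whereas here they are spelled out by
hand with C++11 noexcept, and all the function names are made up for
illustration), a caller is only provably non-throwing if everything
beneath it is:

```cpp
#include <cassert>

// Hypothetical helpers for illustration; not taken from the LibreOffice tree.
int add_one(int x) noexcept { return x + 1; }   // promises never to throw

int checked(int x)                              // may throw
{
    if (x < 0)
        throw x;
    return x;
}

// Each caller inherits the exception property of the function it calls.
int caller_clean(int x) noexcept(noexcept(add_one(x))) { return add_one(x); }
int caller_dirty(int x) noexcept(noexcept(checked(x))) { return checked(x); }

static_assert(noexcept(caller_clean(1)), "clean chain stays nothrow");
static_assert(!noexcept(caller_dirty(1)), "one throwing callee taints the caller");

// Small runtime check that the helpers behave as described.
void check_helpers()
{
    assert(caller_clean(1) == 2);
    assert(caller_dirty(3) == 3);
}
```

	One suspicious low-level function is enough to lose the property for
everything above it, which is the point about our basic classes.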

> Amusingly, for the trivial example

	:-) yep - unfortunately, contrived examples don't always give a
wonderful picture. Try the attached, which is just a bit more
complicated: ie. we actually have something to unwind, and we get the
opposite result - the code is smaller with std::nothrow.

	Having said that - there is something superficially fishy about using
the std::nothrow allocator:

	g++ -Os -DTESTNOTHROW -o /tmp/nothrow.s -S ~/a.cxx

	shows:

	.cfi_escape 0x10,0x5,0x2,0x75,0
	pushl	%ecx
	.cfi_escape 0xf,0x3,0x75,0x7c,0x6
	subl	$12, %esp
	.cfi_escape 0x2e,0x8
	pushl	$_ZSt7nothrow
	.cfi_escape 0x2e,0xc
	pushl	$4
	.cfi_escape 0x2e,0x10
	call	_ZnwjRKSt9nothrow_t
	addl	$16, %esp
	.cfi_escape 0x2e,0
	testl	%eax, %eax
	je	.L2
	movl	$0, (%eax)

	Which suggests everything is getting a little heated wrt. unwind
information in a method which should (surely) be exception free. Then
again, the code generated is as small as the same code built with
-fno-exceptions, and has no gcc_except_table ;-) so ... the unwind
information is presumably thrown away later.
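
	For anyone without the attachment to hand, a test of this shape can
be sketched as below. This is a hypothetical stand-in and not the
contents of a.cxx; only the TESTNOTHROW macro name comes from the
command line above, and Guard / make_int are invented names. The local
object with a destructor is what gives the compiler something to unwind
in the throwing variant:

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-in for the scrubbed attachment; the real a.cxx may differ.
struct Guard {
    int *counter;
    explicit Guard(int *c) : counter(c) { ++*counter; }
    ~Guard() { --*counter; }   // non-trivial destructor: must run during unwinding
};

int *make_int(int *live)
{
    assert(live != nullptr);
    Guard g(live);             // alive across the allocation below
#ifdef TESTNOTHROW
    int *p = new (std::nothrow) int(0);   // returns a null pointer on failure
#else
    int *p = new int(0);                  // throws std::bad_alloc on failure
#endif
    return p;                  // g is destroyed here on the normal path
}
```

	Building it with and without -DTESTNOTHROW and comparing the two .s
files should show whether a landing pad for Guard's destructor is
emitted around the allocation.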

> gives a 996 byte .o as well. Doesn't appear to be the case that gcc
> optimizes out the call to new regardless of what optimization level or
> -fno-exceptions that might let it know it could do it if it wanted.

	Too true; I guess new/delete is perhaps a bad example of methods that
could be optimised away by an intelligent compiler if they were
exception free.

> Looking at std::abort vs std::throw for new, assuming something is built
> with -fexceptions, and once there's *something* in a function called
> that might throw an exception I *guess* you pay a fixed penalty and it
> doesn't really matter how many things in that function might throw
> something ?, that's the way I internalized it anyway.

	Quite probably the bulky unwind information is created for everything
regardless of what methods could possibly throw something, but of course
if we can get more methods that don't call anything that throws,
presumably we save.

> Anyway, FWIW I'm not massively interested in the allocate, "oh we don't
> have enough, lets try and release some" unrealistic path

	:-) sure.

> new Foo[foo] where foo is approx MAX_SIZE, and new should immediately
> throw without even making any effort to allocate, given that the number
> to allocate would overflow and the .wmf/.doc or whatever import is
> aborted with a toplevel import-level catch without the app exiting.

	Currently we appear to crash in 99% of such cases, of course. I'm
not so optimistic that we can recover well enough from a std::bad_alloc
to not horribly leak resources - anything that isn't smart_ptr tracked,
any UNO reference counting cycles / snafus that don't have guard classes
to clean all that up as we go through the exception chain etc.
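
	To make the leak concern concrete, here is a minimal sketch (all
names hypothetical, and std::unique_ptr used as just one example of a
smart pointer) of two import steps hitting a simulated allocation
failure under a toplevel catch. Only the guarded one cleans up as the
exception passes through:

```cpp
#include <cassert>
#include <memory>
#include <new>

static int live_objects = 0;   // number of Resource instances currently alive

struct Resource {
    Resource() { ++live_objects; }
    ~Resource() { --live_objects; }
};

// Simulated allocation failure somewhere deep inside an import.
void simulate_failure() { throw std::bad_alloc(); }

void import_raw()
{
    Resource *r = new Resource;           // raw owner
    simulate_failure();
    delete r;                             // never reached: r leaks
}

void import_guarded()
{
    std::unique_ptr<Resource> r(new Resource);
    simulate_failure();                   // r is destroyed during unwinding
}

// Toplevel import-level catch; returns how many Resources the step leaked.
int leaked_after(void (*step)())
{
    const int before = live_objects;
    try {
        step();
    } catch (const std::bad_alloc &) {
        // the import is aborted, the application carries on
    }
    return live_objects - before;
}
```

	leaked_after(import_guarded) reports no leak, leaked_after(import_raw)
reports one, and nothing at the catch site can repair the latter.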

	Currently it seems like we are catching a std::bad_alloc in remarkably
few places, and at a very local scope eg.

svtools/source/filter/filter.cxx-            try
svtools/source/filter/filter.cxx-            {
svtools/source/filter/filter.cxx-                pBuf = new sal_uInt8[ nBufSize ];
svtools/source/filter/filter.cxx-            }
svtools/source/filter/filter.cxx:            catch (const std::bad_alloc&)
svtools/source/filter/filter.cxx-            {
svtools/source/filter/filter.cxx-                nStatus = GRFILTER_TOOBIG;
svtools/source/filter/filter.cxx-            }

	I tend to agree that the idea of turning exception throwing on and
off dynamically doesn't really save us anything much - since IMHO lots
of the bloat is in the unwind information, which we'd need to generate
anyway in order to have the option to turn this on for some invocations.
But - the above can at least be handled with a custom C / NULL returning
allocator, with the failure turned into the requisite error-code return
value in-situ - which is what we use.
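
	A minimal sketch of that shape, assuming a malloc-backed allocator
with an arbitrary size cap; Status, kMaxBufSize, alloc_or_null and
prepare_buffer are hypothetical names, and the status values are not the
real GRFILTER_* ones:

```cpp
#include <cassert>   // assert() is only needed by the callers' checks
#include <cstddef>
#include <cstdlib>

// Hypothetical status codes standing in for the filter's own error values.
enum Status { STATUS_OK, STATUS_TOOBIG };

// Assumed cap for illustration; a real limit would come from the format.
const std::size_t kMaxBufSize = 64u * 1024u * 1024u;

// C-style allocator: reports failure by returning NULL, never by throwing.
unsigned char *alloc_or_null(std::size_t n)
{
    if (n == 0 || n > kMaxBufSize)
        return nullptr;
    return static_cast<unsigned char *>(std::malloc(n));
}

// The failure becomes an error code right where it happens.
Status prepare_buffer(std::size_t nBufSize, unsigned char *&pBuf)
{
    pBuf = alloc_or_null(nBufSize);
    if (!pBuf)
        return STATUS_TOOBIG;
    return STATUS_OK;
}
```

	The caller checks the returned Status and releases a successful
buffer with std::free(); no try / catch is involved, so nothing here
needs a landing pad.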

	Of course, in -theory- we could use exception handling consistently
~everywhere instead of error codes :-) but - this wouldn't be a great
choice either since many trivial errors happen as a matter of course and
the wanton throwing of exceptions makes debugging and code-flow a total
nightmare, and performs pretty terribly too; so we're stuck with
returning error codes as well - surely.

>  If we didn't have a single-process-app none of this would matter much I
> suppose.

	I guess, though we'd have other problems I imagine :-)

> > 	I guess I now need to go and characterise how much of the saving is
> > from removing tens of thousands of in-lined 'if (baa) throw
> > std::bad_alloc();' calls vs. the exception unwind and knock-on optimiser
> > impact of having all that there.
> 
> hmm, initially I would have said that the savings must all be from the
> removal of the inlining of the "check for NULL and throw itself" given
> that the compiler can only see the body of rtl_uString2String when
> building one object file and not for any of the others, but I see now
> that rtl_uString2String is marked as nothrow so the compiler could make
> use of that I guess.

	Well, I have no idea - but I do know that ~10% of our code-size is
exception unwind information, which seems a lot to me. I should
re-generate that statistic I guess.

	If our substantial unused-code removal effort manages to save 1% of
size - by removing ~1% of methods as seems likely, that'd be a good
outcome.

	To save nearly 2% with a fairly tiny amount of code-change seems (to
me) rather pleasant, and making a dent in that 10% of unwind
information - almost all of which is utterly unnecessary and unused -
seems like something worth doing to me.

	Then again, much of the saving could quite possibly come from things
like the huge gobs of auto-generated code that things like oox produce.

	All the best,

		Michael.

-- 
michael.meeks at suse.com  <><, Pseudo Engineer, itinerant idiot
-------------- next part --------------
A non-text attachment was scrubbed...
Name: a.cxx
Type: text/x-c++src
Size: 442 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/libreoffice/attachments/20120305/17e5d2bd/attachment-0001.cxx>