Checking string allocations (was Re: String literals, ASCII vs UTF-8)

Lubos Lunak l.lunak at suse.cz
Wed Feb 29 08:11:58 PST 2012


On Wednesday 29 of February 2012, Stephan Bergmann wrote:
> On 02/29/2012 03:28 PM, Lubos Lunak wrote:
> > On Wednesday 29 of February 2012, Stephan Bergmann wrote:
> >> However, there are also situations where bad input (malicious or
> >> otherwise) would cause an application to request excessive amounts of
> >> memory to do a single task (e.g., open a document), and at least in
> >> theory the application should be able to cope with such
> >> externally-induced OOM conditions, by abandoning the bad operation,
> >> cleaning up after it, telling the user the operation failed, and
> >> carrying on.
> >
> >   The problem with this theory is that it is a theory. In practice the
> > code should check the size first, and if it doesn't, then it presumably
> > will not clean up properly anyway.
>
> But checking the size beforehand is not trivial.  Where exactly to draw
> the line between legitimate but large allocation requests and bogus
> ones?

 OK, I think you are right here. So if we want to check for this case, we need 
to check all allocations, which is the second of the problems I listed: we do 
not.

> What about cases where the allocation is not in one big chunk, 
> but piecemeal (say, ending up in a linked list) with an excessive number
> of innocent-looking small allocations?

 They may well fail somewhere other than in the OUString ctor - OUStringBuffer 
does not throw bad_alloc, OUString::concat() does not throw bad_alloc, etc. 
Which brings us back to my argument that the current way, having the checks 
in OUString ctors and nowhere else, does not solve much.

 The way I see it, there are these options:

- we leave it as it is. The least work; allocation failures that happen to 
occur first in OUString ctors will throw exceptions, which maybe will get 
handled and cleaned up properly, or maybe not. Failures anywhere else will 
make LO fall flat on its face somewhere (and, as you said, that somewhere 
will in these cases be the following line, so the actual problem will only 
be figuring out that the cause was an allocation failure and not a 
corruption elsewhere).

- we remove the throwing of std::bad_alloc. That will remove the invalid idea 
that we actually can recover from allocation failures. Failures will end up 
the same way as above.

- we add checking, std::bad_alloc throwing, catching it, cleaning up 
properly, and whatnot, everywhere. Or we can instead go fix something that is 
actually likely to happen.

- we replace the OSL_ENSURE usage in rtl_ustr* functions with a 
CHECK_ALLOCATION macro that does OSL_ENSURE, er, I mean OSL_FAIL, first, and 
then does abort() even in non-debug builds when an allocation fails (a rough 
sketch follows below). Or maybe we even do this just in rtl_allocateMemory() 
(or whatever function ultimately does the allocations), and don't bother with 
std::bad_alloc or anything like that. Do we actually have code that tries to 
gracefully handle running out of memory? Because if not, and I doubt we 
realistically do [*], then we might as well just admit this status quo and 
have a nice crash right after the allocation fails.
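
 To be concrete, a minimal sketch of what I mean; the name CHECK_ALLOCATION 
and the message are placeholders, not existing code:

#include <osl/diagnose.h>
#include <stdlib.h>

// Hypothetical macro, sketched for illustration: report the failure in
// debug builds via OSL_FAIL, then abort() unconditionally, so that even
// non-debug builds crash right at the failed allocation instead of on
// some later line.
#define CHECK_ALLOCATION( pointer ) \
    do { \
        if( (pointer) == NULL ) \
        { \
            OSL_FAIL( "allocation failed" ); \
            abort(); \
        } \
    } while( 0 )

// Usage in an rtl_ustr* function would then look like:
//     rtl_uString* pNew = /* ...allocate... */;
//     CHECK_ALLOCATION( pNew );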

[*] The word "realistically" is important. I can see there are actually 
places catching std::bad_alloc, but that's again just a false sense of 
safety. Check out e.g. sd/source/filter/ppt/propread.cxx, PropItem::Read(): 
that one handles it properly (almost - it leaks) and returns false on 
failure. However, when it's called, the return value is either ignored, or 
the property that was supposed to be read is ignored (the sketch below shows 
the general pattern). If I open a bigger file on a system that is short on 
memory, that can mean a silent loss of data. In that case just crashing is 
usually the better option, because that way the user at least knows.
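
 For illustration, the pattern looks roughly like this (a simplified sketch 
with made-up names, not the actual propread.cxx code):

#include <new>

// The callee dutifully catches std::bad_alloc and reports failure...
static bool readProperty( unsigned nSize, unsigned char*& rpBuf )
{
    try
    {
        rpBuf = new unsigned char[ nSize ]; // may throw std::bad_alloc
    }
    catch( const std::bad_alloc& )
    {
        return false;
    }
    // ... fill rpBuf from the document stream ...
    return true;
}

// ...but the caller never looks at the result, so on OOM the property
// is silently dropped instead of the load failing visibly.
static void readAllProperties( unsigned nSize )
{
    unsigned char* pBuf = 0;
    readProperty( nSize, pBuf ); // return value ignored
}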

-- 
 Lubos Lunak
 l.lunak at suse.cz

