[Spice-devel] [PATCH 12/22] Add exception handling classes

Thu Mar 8 09:56:55 UTC 2018

> On 7 Mar 2018, at 12:52, Frediano Ziglio <fziglio at redhat.com> wrote:
> 
>>> On 2 Mar 2018, at 18:13, Christophe Fergeau <cfergeau at redhat.com> wrote:
>>> 
>>> On Fri, Mar 02, 2018 at 02:00:23PM +0100, Christophe de Dinechin wrote:
>>>> 
>>>> 
>>>>> On 2 Mar 2018, at 09:03, Frediano Ziglio <fziglio at redhat.com> wrote:
>>>>> 
>>>>>>> 
>>>>>>> On 1 Mar 2018, at 13:13, Frediano Ziglio <fziglio at redhat.com> wrote:
>>>>>>> 
>>>>>>>> 
>>>>>>>> From: Christophe de Dinechin <dinechin at redhat.com>
>>>>>>>> 
>>>>>>>> Throwing 'runtime_error' directly should be reserved for the support
>>>>>>>> library.  Add an 'Error' class as a base class for all errors thrown
>>>>>>>> by the streaming agent, as well as subclasses used to discriminate
>>>>>>>> between categories of error.
>>>>>>>> 
>>>>>>>> The Error class offers:
>>>>>>>> - a 'format_message' member function that formats a message without
>>>>>>>> allocating memory, so that there is no risk of throwing another
>>>>>>>> exception at throw time.
>>>>>>>> 
>>>>>>> 
>>>>>>> why not formatting the message constructing the object so before the
>>>>>>> throw?
>>>>>> 
>>>>>> That requires dynamic allocation, which is entirely unnecessary and
>>>>>> potentially dangerous (replacing one exception kind with another).
>>>>>> 
>>>>> 
>>>>> As explained by my string.c_str example your interface is prone to
>>>>> errors.
>>>> 
>>>> The interface is not “prone to errors” at all. The scenario you
>>>> describe makes no sense, since the whole interface is designed to
>>>> avoid dynamic strings. This is now explicitlyi documented (see
>>>> https://github.com/c3d/spice-streaming-agent/blob/c%2B%2B-refactoring/include/spice-streaming-agent/errors.hpp,
>>>> not yet shared, I’m not entirely done with the reviews). The
>>>> constructor comment now says:
>>>> 
>>>> 	Constructor takes the base message returned by what()
>>>> 	The message must remain valid over the whole lifetime of the exception.
>>>> 	It is recommended to only use static C strings, since formatting of any
>>>> 	dynamic argument is done by 'format_message'
>>> 
>>> This behaviour seems inconsistent with std::runtime_error(const char *)
>>> which as far as I can tell does not have this lifetime expectation.
>> 
>> You are correct, but please note that std::runtime_error takes a string as
>> input argument, making it very clear that you are dealing with a ‘string’
>> underneath, not a const char*.
>> 
>>> 
>>>> 
>>>> Anyway, your example is misleading. Anybody that passes the result of
>>>> string::c_str() to a const char * without proper consideration for the
>>>> lifetime is at fault. The const char * interface itself cannot be
>>>> faulted for that.
>>> 
>>> For what it's worth, I usually expect const char * interfaces to make
>>> their own copy of the const char * arg if they need it after they
>>> return, and I've rarely seen exceptions to this expectation.
>> 
>> std::vector and all standard library containers? You do a push_back of a
>> const char * in an std::vector<string>, you get a copy. If you do that in an
>> std::vector<const char *>, you don’t.
>> 
>> In C++, raw pointers, including const char *, always have the explicit
>> meaning that you manage the lifetime. So if you have “rarely seen
>> exceptions”, you did not look hard enough IMHO ;-)
>> 
>> Let me give an example from *our own code*. ConcreteConfigureOption takes two
>> const char * arguments. Does it make copies? No. Same thing with the related
>> AddOption method. Anything particularly surprising there? I don’t think so.
>> 
>> Of course, we could switch to “strings everywhere” (which Lukas suggested at
>> a point). But that’s wrong too, it causes unnecessary allocations and code
>> bloat (see below what I’m talking about).
>> 
>>> 
>>> If I understood properly, things are designed this way to avoid running out
>>> of memory while building the exception. I'm definitely not in favour of
>>> having an error-prone API to handle a very rare case which the rest of the
>>> code is most likely not going to be able to cope with anyway.
>> 
>> Yet another iteration of the “bad_alloc” theory. You know that this is
>> starting to get on my nerves? As stated *REPEATEDLY* before, bad_alloc is
>> only one of the reasons for doing things that way.
>> 
>> To make this clear, let me elaborate a little on *one* other reason (I listed
>> a few others), code bloat and runtime cost. Let’s say we want to store a
>> write error with an errno. The two approaches under consideration are:
>> 
>> 	A. throw WriteError(message, operation, errno);
>> 	B. throw std::runtime_error(message + “ in “ + operation + “: “ +
>> 	strerror(errno))
>> 
>> I’ll briefly mention, as a gentle reminder of some of the “other reasons”,
>> that (A) is
>> 1. shorter to type
>> 2. easier toi read
>> 3. less error prone
>> 4. more likely to guarantee consistent messages
>> 
>> and I will instead focus on the code being generated in both cases.
>> 
>> In case A, the code is roughly equivalent to:
>> 
>> 	tmp = WriteError_constructor(message, operation, errno)
>> 	runtime_cxa_throw(tmp)
>> 
>> So roughy two calls, passing a total of four register args, no exception
>> landing pad.
>> 
>> In case B, the code is roughly equivalent to (where LPn is a “landing pad”, a
>> piece of code that cleans up if an exception is thrown):
>> 
>> 	tmp1 = string_constructor(message);		// LP1
>> 	tmp2 = string_constructor(“ in “);			// LP2
>> 	tmp3 = string_operator_plus(tmp1, tmp2);	// LP3
>> 	tmp4 = string_constructor(operation);		// LP4
>> 	tmp5 = string_operator_plus(tmp3, tmp4);	// LP5
>> 	tmp6 = strerror(errno);
>> 	tmp7 = string_constructor(tmp6);			// LP6
>> 	tmp8 = string_operator_plus(tmp5, tmp7);	// LP7
>> 	tmp9 = runtime_error_constructor(tmp8);	// LP8
>> 	string_delete(tmp8);
>> 	string_delete(tmp7);
>> 	string_delete(tmp5);
>> 	string_delete(tmp4);
>> 	string_delete(tmp3);
>> 	string_delete(tmp2);
>> 	string_delete(tmp1);
>> 	runtime_cxa_throw(tmp9)
>> 
>> LP8:	string_delete(tmp8);	// Chain to next landing pad
>> LP7: string_delete(tmp7);
>> LP6: string_delete(tmp6);
>> LP5: string_delete(tmp5);
>> LP4: string_delete(tmp4);
>> LP3: string_delete(tmp3);
>> LP2: string_delete(tmp2);
>> LP1: string_delete(tmp1);
>> 	runtime_rethrow();	// (__UnwindResume) to propagate bad_alloc
>> 
>> I omitted a few details for clarity, like the exception handling tables, etc.
>> But the gist of it is that now *at each throw point* you have something like
>> 25+ calls being generated (some are inlined), passing a total of at least 25
>> register args…
>> 
>> Well, 10+ times worse? That’s bad enough. But it gets better.
>> 
>> Let’s consider runtime cost now. What do these calls do?
>> 
>> In case A, three word-size copies in the constructor of WriteError, and I
>> think that’s about it.
>> 
>> In case B, each of the strings allocates memory and copies the string in it.
>> Let’s be optimistic and say that it’s one call per allocation, and that you
>> get it from a free list (let’s say something like 5 loads and three stores,
>> that’s rather optimistic). So now we have another 7 calls, and well over 50
>> load/store instructions just for the memory allocation itself. And then the
>> base message is copied at least four times. Cache, cache, cache, buy me more
>> cache!
>> 
>> In short, using B looks innocuous, but we replaced, roughly
>> 	A. two calls, four register args, three memory copies
>> with, at runtime,
>> 	B. At least 25 calls, at least 25 register args, and four memory copies of
>> 	the original message.
>> 
>> So here too, at least an order of magnitude worse, probably closer to two.
>> 
>> Guess what. After writing all this, I realized I had forgotten one
>> concatenation for “: “, so that all my numbers are understimated… Oh well…
>> 
>> 
>> Meta note: How long do you think it took me to write this reply, in a
>> probably futile attempt to explain that there is more to it than bad_alloc?
>> I’m sorry, but this is really grinding. Now, that is not an entire waste of
>> time if you take the time to really understand what I explained, and if I
>> get the benefit that you never mention bad_alloc ever again ;-)
>> 
>> Otherwise, if you only see that as some kind of “justification”, my time is
>> wasted, so is yours, and everybody walks out of it unhappy.
>> 
>> 
>>> 
>>> 
>> 
>>> Christophe
>> 
>> 
> 
> An improvement to the declaration to avoid some issues could be
> 
> template <std::size_t N>
> Error(const char (&msg)[N]) : ...
> {
>  …
> }

I actually considered doing something like that,  but I find it unnecessarily ugly.

> 
> but even so I find that being able to pass a dynamic string adding
> some information is really helpful.

It is. But I want to encourage people to pass that dynamic information in the exception object.

I illustrated that all over the place with IOError/WriteError/ReadError passing errno, etc.

What would be a case where you think that approach would not work?

Is the real problem that my proposal suggests you create a class for that category of dynamic information? If that’s the objection, I’d argue it’s a good thing. It forces you to find commonalities in your code, like I did for WriteError, IIOError and the likes. Among other things, it encourages the good practice of having a shared error message format.

> In many cases you want to add information to the exception and
> is quite common to catch one exception, add another informations
> and throw a more detailed exception.

It is useful, indeed. And when you need to do that, I strongly prefer something like:

catch (WriteError &we) {
	if (errno == EPERM) {
		collect_SELinuxData(&sedata);
		throw WriteErrorWithSELinuxData(we, sedata);
	}
}

which I find vastly superior to an alternative like:

catch (some_runtime_error &e) {
	if (parse_what_looking_for_errno(e.what()) == EPERM) {
		collect_SELinuxData(&sedata);
		throw some_runtime_error(e.what() + “ with SELinux information “ + format_se_linux_info_as_string(sedata));
}

> 
> Frediano