Clean up OSL_ASSERT, DBG_ASSERT, etc.

Fri Jan 16 04:26:52 PST 2015

With regards to fdo#43157 and the general state of assertions in the code,
I've made a proposal in fdo#88309 and submitted a working patch #13945
(without changing any production code).

Here is the contents of fdo#88309 for discussion purposes. Of note is that
I had known both the issues with OSL_ASSERT and the ambiguous usage of
'assertion' for logging as well as the push to move to using the system
assert where a programming error is involved. The proposal is to codify,
standardize, document, and further improve upon the above two use cases.

The proposal addresses these points and the actual implementation (in its
initial incarnation) is even richer than the following. I think we need to
gain more control over debugging and maintenance, and this proposal claims
to deliver those goals.

Asserting, when used carefully, is a very useful tool to document and
debug software.

There are numerous different assertion functions/macros used in the
Libreoffice codebase, each with its own behavior and implementation.
This is problematic in its own right, as there is no obvious way to
know what to expect when a certain precondition is broken (by new
code, modified code, or by a new external input,) which is vital for
the developer who plans to work on an issue. Furthermore, reading and
modifying the code is made more difficult by this unnecessary
variations.

This is a proposal to improve the situation by carefully designing an
assertion API that is standard to the project, cleaner in specs, and
provides more support to the development team than the current
available breed.

A quick grep returns the following assertion functions (those
containing case-insensitive "assert"): assert, DBG_ASSERT, OSL_ASSERT,
DBG_BF_ASSERT. This is excluding CPPUNIT_ASSERT and assertions in
3rdparty libraries (which are out of scope).

I know there has been a push to use the system assert, which at least
on Windows aborts, which has the nice side-effect that it can't be
ignored. However this behavior can, and indeed should, be defined by
the project-specific assert and furthermore be made configurable by
the developer at build time.

Names here are suggestive, not necessarily required to be verbatim (I
know naming can be contentious). However I do believe that a different
name should be used than the existing ones to help in tracking the
conversion progress, avoid collisions etc.

Four types of assertions:

1) Static assertion: Compile-time assertion that fails the build if
violated, otherwise compiles into nop in all configurations.
LO_ASSERT_STATIC(EXP);

2) Permanent assertion: Runtime assertion that is never compiled out
of the code, even in releases (see below). These are very cheap, but
critical, checks. Typically trivial bounds or invariant checks. The
overhead of such an assertion should be <5% (typically 2%) of the
function they are in.
LO_ASSERT_REL(EXP);

3) Debug assertion: Runtime assertion that validates pre- and
post-conditions, including some checks that require some nominal
amount of computation. These are the typical debug assertions and
should have an overhead of 5-20%. Enabled for non-Release builds only.
LO_ASSERT(EXP);

4) Sanity/Validation assertion: Extensive runtime assertions that
validate an algorithm by executing extensive checks and
recomputations. These checks have a too high a cost to be enabled in a
build used by even the wider development community. They are best
enabled when debugging a certain algorithm or tracking down a
particularly nasty issue. The overhead of these assertions are >20%.
Enabled only when LO_USE_ASSERT_CHECKS is defined.
LO_ASSERT_DEBUG(EXP);

Each type will come in a version that takes an optional second
argument which is a custom message (other versions that take variable
arguments to print upon failure may also be provided).

Failed assertions will not take any action on their own, rather, a
handler function will be called. For a start, this handler can be
defined by configure, but a better approach is to wrap it in a static
library to avoid rebuilding the complete code-base when a different
strategy is necessary to track an issue. In the latter case the static
library is rebuilt when the assertion handler is changed and only the
binaries are linked.

There will be 5 standard handler provided which can be chosen at
compile time. Developers may implement their own handlers if necessary
(possibly a future extension, not immediately necessary).

1) FailLog: Prints to stderr and a log file (optional). (Ideally will
include the stack-trace as well.)
2) FailAbort: Calls FailLog then terminate (which might invoke
CrashRep when functional).
3) FailBreak: Calls FailLog then breaks into the debugger (typically
by issuing 'int 3' on x86/x64). (This might not be useful on
non-Windows platforms.)
4) FailSpin: Calls FailLog then spins (with short sleeps) to allow for
hooking a debugger.
5) FailThrow: Calls FailLog then throws std::logic_error. (This
assumes there are handlers, or a debugger is set to break on C++
exceptions.)

When a proper CrashRep is implemented, FailLog will dump the
stack-trace as well.
LO_ASSERT_REL would nominally FailLog and potentially FailAbort (to be decided).

Once the support code is created, this is solidly an easy hack, one
that beginners should be encouraged to work on.

Some ideas are influenced by Bloomberg's BSL library (see
https://github.com/bloomberg/bde/blob/master/groups/bsl/bsls/bsls_assert.h).

Please also refer to the patch for actual code (it's rather simple).

Thanks for reading, and comments are more than welcome.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/libreoffice/attachments/20150116/c98b4f1f/attachment.html>