Clean up OSL_ASSERT, DBG_ASSERT, etc.

Fri Jan 16 07:42:30 PST 2015

On Fri, Jan 16, 2015 at 9:14 AM, Stephan Bergmann <sbergman at redhat.com> wrote:
> On 01/16/2015 01:26 PM, Ashod Nakashian wrote:
> [...]
>>
>> There are numerous different assertion functions/macros used in the
>> Libreoffice codebase, each with its own behavior and implementation.
>> This is problematic in its own right, as there is no obvious way to
>> know what to expect when a certain precondition is broken (by new
>
>
> What do you mean with "what to expect" here?

As in "would a piece of code abort or just log?" Of course we can find
out what individual macros do, but the code mixes logging and aborting
asserts, which means if it doesn't abort doesn't mean that no
invariant is violated.

>
>> code, modified code, or by a new external input,) which is vital for
>
>
> Not sure I understand you here, but note that input data in itself should
> never be the reason for a failed precondition.  A failed precondition is
> always an indicator of a programming error.

Assertions are not limited to preconditions. They check invariants.
Meaning, assumptions either in the algorithm and/or data that should
not change (without corresponding changes to mitigate them).
As such, assertions can and should be used to validate said assumptions.
My point above that if you are testing some feature, should you expect
said violations to abort or log? Again, the ambiguity is in that there
is inconsistency in the usage of the word 'assert' in the existing
macros.

>
> [...]
>>
>> A quick grep returns the following assertion functions (those
>> containing case-insensitive "assert"): assert, DBG_ASSERT, OSL_ASSERT,
>> DBG_BF_ASSERT. This is excluding CPPUNIT_ASSERT and assertions in
>> 3rdparty libraries (which are out of scope).
>
>
> History shows that moving from the legacy zoo of ill-specified,
> inconsistently-used functionality to a supposedly sane set is by no means
> trivial and is taking a long long time to accomplish.
>
> It is the inconsistent usage of the legacy functionality that makes it hard
> to properly map to one of the three to four advocated mechanisms du jour
> (static_assert, assert, SAL_LOG_IF, SAL_WARN_IF), and that apparently makes
> easy hackers shy away from it.

Agreed on all points.

>
>> I know there has been a push to use the system assert, which at least
>> on Windows aborts, which has the nice side-effect that it can't be
>
>
> It aborts on all platforms (unless disabled with NDEBUG, of course).

Better than silent logging, but not the most useful to a developer. (See below.)

>
> [...]
>>
>> 1) Static assertion: Compile-time assertion that fails the build if
>> violated, otherwise compiles into nop in all configurations.
>> LO_ASSERT_STATIC(EXP);
>
>
> I see no real benefit in wrapping a macro around C++11 static_assert.

Do we not have C code? Also, we might have to support a non-conforming
compiler in some port (at a future date).
This was for completeness sake, not to be redundant and to support C code.

>
>> 2) Permanent assertion: Runtime assertion that is never compiled out
>> of the code, even in releases (see below). These are very cheap, but
>> critical, checks. Typically trivial bounds or invariant checks. The
>> overhead of such an assertion should be <5% (typically 2%) of the
>> function they are in.
>> LO_ASSERT_REL(EXP);
>>
>> 3) Debug assertion: Runtime assertion that validates pre- and
>> post-conditions, including some checks that require some nominal
>> amount of computation. These are the typical debug assertions and
>> should have an overhead of 5-20%. Enabled for non-Release builds only.
>> LO_ASSERT(EXP);
>>
>> 4) Sanity/Validation assertion: Extensive runtime assertions that
>> validate an algorithm by executing extensive checks and
>> recomputations. These checks have a too high a cost to be enabled in a
>> build used by even the wider development community. They are best
>> enabled when debugging a certain algorithm or tracking down a
>> particularly nasty issue. The overhead of these assertions are >20%.
>> Enabled only when LO_USE_ASSERT_CHECKS is defined.
>> LO_ASSERT_DEBUG(EXP);
>
>
> I am not sure we would manage to maintain a useful discrimination of
> assert-uses based on their (perceived) runtime impact.
>
> (We have OSL_DEBUG_LEVEL to disable---and probably let bit-rot---the
> occasional assertion code that would be prohibitively expensive in general,
> and that appears to work, kind of.)

The ability to do so is an advantage. We don't have to use it. (The
"higher cost" version actually is only enabled when OSL_DEBUG_LEVEL >
1. This is useful for the costly checks in common code, such as
sorted_vector checking if the data is sorted on every access, that is
only useful in certain cases, but removing the assertions would be
wasteful when one needs them in the future. You can surely think of
other cases that can benefit form this type of selection.)

But you don't comment on the more important points!

The above will allow us to have assertions in release/production
builds -- we won't be limited to only 2 options: assert enabled or
disabled. The behavior will be fully controllable (at the very least
we'd like to log and dump stacks for our benefit, at worse we can
core-dump).
In addition, we'll be able to break into the debugger (or help the dev
to do so) on *all* platforms, and do so consistently and with the
control of the developer in question.

Aborting is a good start (it makes noise where noise is valuable). But
for tracking down issues, one needs more, and not just breaking into a
debugger.
The above macros all support formatting arbitrary messages with
arbitrary arguments[*]. This allows developers to dump variables and
add useful feedback to the developer maintaining the project,
modifying some code, or tracking a new-found issue in an old codebase.
There is support for invoking the system assert, btw.
Also, not less importantly, often one needs to go past some asserts to
see how the algorithm behaves and how the data evolves. Aborting until
one can deal with the issue is not helpful in finding or solving the
issue at all, which is what I'm concerned with a developer. (Suppose
the assert is no longer valid, or we'd like to run the algorithm
further to debug the issue.)

As a developer, I want to have the assertions to be self-documenting,
allow me to hook a debugger, or just let them log for my inspection
later, allow me to disable them selectively (the 3 levels, including a
release-mode one,) allow me to decide how they behave and even write
my own handler that could dump other useful data (say the state of an
allocator/pool or some shared object).

I know some significant work has already been done in cleaning up
assertions. This in no way invalidates them at all.
In fact, changing assert to LO_ASSERT is trivial. The effort to
convert the OSL_ and DBG_ ones remains the same (with or without the
above proposal).

However, LO_ASSERT will support the above advantages and give full
control over assertions in the project. The cost is very low (surely
searching and replacing assert with LO_ASSERT is not a high cost). The
variants will allow more fine-grained control and will make the
behavior more homogeneous.

I hope you agree that this is an improvement over, and not a
competition with, the current effort to replace OSL_ and DBG_ versions
with assert.

[*] Example: LO_ASSERT(limit > size, "Blob's size (%d) must always be
smaller than the limit (%d).", size, limit);

>
> [...]
>>
>> 1) FailLog: Prints to stderr and a log file (optional). (Ideally will
>> include the stack-trace as well.)
>
>
> What is the benefit of not aborting the out-of-control process immediately?
>
> [...]
>>
>> 3) FailBreak: Calls FailLog then breaks into the debugger (typically
>> by issuing 'int 3' on x86/x64). (This might not be useful on
>> non-Windows platforms.)
>
>
> My understanding is that Windows alrady offers this "break into the
> debugger" for plain abort?
>
>> 5) FailThrow: Calls FailLog then throws std::logic_error. (This
>> assumes there are handlers, or a debugger is set to break on C++
>> exceptions.)
>
>
> std::logic_error implying a programming error, I see no better approach
> anyway for LO than to abort in the event of a thrown logic_error.
>
>> Once the support code is created, this is solidly an easy hack, one
>> that beginners should be encouraged to work on.
>
>
> see above
> _______________________________________________
> LibreOffice mailing list
> LibreOffice at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/libreoffice