[Libreoffice] Assertions and Logging
Stephan Bergmann
sbergman at redhat.com
Fri Nov 18 06:25:16 PST 2011
Hi all,
The current situation with assertions and logging of "interesting
events" in the LO code base is not very satisfying.
For one, there are two sets of functionality that cater for the same
needs. One is osl/diagnose.h, the other is tools/debug.hxx.
For another, the distinction between true assertions (that assert
invariants of the program code, and that indicate programming errors
when not met) and the warning about unusual program states (that still
need to be handled properly by the code, like the occurrence of
malformed input) is not clear-cut with the existing macros. Some code
is careful to use OSL_ASSERT only for true assertions and OSL_TRACE for
warnings about unusual program states, while other cdoee uses OSL_ASSERT
for both cases.
The downside of that mixture is that the useful debugging technique of
aborting upon detection of a violated invariant is not available.---If
you make OSL_ASSERT (and OSL_ENSURE, OSL_FAIL, etc.) abort, it will
abort far too often for mundane warnings for it to be useful.
And for a third, the combinatorial space of --enable-debug,
--enable-dbgutil, OSL_DEBUG_LEVEL, DBG_UTIL, etc. is rather confusing.
While --enable-debug (prepare the build for easy debugging, by letting
the compiler emit extra debug information and disable optimizations,
etc.) and --enable-dbgutil (enable additional checks in the code base)
ought to be rather orthogonal, in some sense they ultimately converge on
the same OSL_DEBUG_LEVEL macro, trying to control disparate needs with a
mere three values, 0, 1, 2. (--enable-debug and --enable-dbgutil both
set OSL_DEBUG_LEVEL to 1; setting it to 2 requires passing a dbglevel=2
switch to make; additionally, --enable-dbgutil defines the DBG_UTIL
macro.) Also, confusion keeps arising as to whether enabling various
debugging mechanisms may render the code for which they have been
enabled incompatible (i.e., whether they have to be enabled for the
complete build, or can be enabled selectively). The general idea is
that --enable-debug does not cause incompatibility, while
--enable-dbgutil potentially does (and even uses another $INPATH,
removing the ".pro" extension).
That being the status quo, an analysis of what would actually be useful
to have brings us to the following three points:
1 There is demand for "true" assertions (that identify errors in
program logic and abort). I suggest to use standard <assert.h> for
them. It is available everywhere (in every module, in C as well as
C++), and no home-brown solution is needed. It might be considered a
disadvantage that assert does not allow to supply an additional error
message. However, as a failed assert promptly aborts program execution
with an appropriate message (supplied by the compiler, containing file
and line information), the missing message is not much of a
nuisance---you will need to look at the specified program location right
away anyway, to debug the problem. (On Unix-like OSs, assert generally
outputs to stderr; for Windows GUI applications, it outputs to a message
box and allows attaching a debugger, see the MSDN documentation for
"assert" at
<http://msdn.microsoft.com/en-us/library/9sb57dw4%28v=vs.71%29.aspx>.)
Whether assertions are enabled (and abort when not met) is ontrolled via
the standard NDEBUG macro. It is already taken care of in our build
environment, being left undefined (i.e., assertions are enabled) for
--enable-dbgutil and for --enable-debug. (In the old build system, it
is only left undefined for --enable-dbgutil, and always defined---thus
disabling assertions---otherwise.) As enabling assertions should not
affect compatibility, enabling them for booth --enable-dbgutil and
--enable-debug appears appropriate. (But we can leave the behaviour of
the old build system as it is, given that it is going away, anyway.)
2 There is also demand for logging of "interesting events" during code
execution. For this, I propose new functionality below. Whether or not
logging is enabled should not affect compatibility.
The rationale for replacing the current mechanisms for both (1) and (2)
with something else each, is to allow for a smooth transition. Existing
occurrences of OSL_ASSERT etc. can be inspected individually, replacing
them either with assert (see above) or SAL_WARN (see below). If we
would instead keep OSL_ASSERT for one of those two cases, it would be
unclear for an occurrences of OSL_ASSERT in the code whether it has
already been visited and reclassified or not.
3 Furthermore, there is sometimes demand for additional, debug-only
code. In general, that would be safety-check additions that are
considered too expensive to be included in production builds (e.g., code
that iterates over a data structure to check invariants, or additional,
redundant fields within data structures). Enabling such additional code
potentially affects compatibility.
Such additional code is currently controlled via OSL_DEBUG_LEVEL et al.
OSL_DEBUG_LEVEL==1 is generally used for additions that do not affect
compatibility (as it is enabled by both --enable-debug and
--enable-dbgutil). OSL_DEBUG_LEVEL==2, DBG_UTIL (defined upon
--enable-dbguitl) and privately invented defines in certain parts of the
code (to be set manually by knowledgeable developers) are used for
additions that affect compatibility or that are considered too specific
for general inclusion. Either because they are too expensive even for
every --enable-dbgutil build, or because they produce excessive log
information (for which case the below new log functionality offers a
better solution).
This can probably be reduced to three cases:
#if OSL_DEBUG_LEVEL != 0 for additional code that does not cause
incompatibilities (if there is still demand for such code; the new log
functionality will remove the need for such code in many cases). This
effectively reduces OSL_DEBUG_LEVEL to a binary switch (so the make
dbglevel=x feature can be removed; and OSL_DEBUG_LEVEL could potentially
be renamed---but that would probably not be worth it).
#if defined DBG_UTIL for additional code that causes incompatibilities.
#if defined MY_SPECIAL_DEBUG (replaced with actual defines, varying
across the different code modules) for those special cases where always
enabling the additional code is deemed to expensive in general.
(However, for those special cases where the additional code produces
excess log information, see below.)
Which brings us to the new log functionality.
Attached is a patch against current (towards LO 3.5) master (actually
two patches, an extra one for binfilter). It introduces sal/log.h and
makes osl/diagnose.h and tools/debug.h divert to it internally. It also
replaces a few example occurrences of the old osl/diagnoes.h and
tools/debug.h macros with their new counterparts (and I encourage you to
take a look at those replacements, to get a feel for what is possible
with the new functionality).
For easy reference, I copy the relevant documentation from sal/log.h here:
> SAL_INFO(char const * area, char const * format, ...),
> SAL_INFO_IF(bool condition, char const * area, char const * format, ...),
> SAL_WARN(char const * area, char const * format, ...), and
> SAL_WARN_IF(bool condition, char const * area, char const * format, ...)
> produce an info resp. warning log entry with a printf-style message. The
> given format argument and any following arguments must be so that that
> sequence of arguments would be appropriate for a call to printf.
>
> SAL_INFO_S(char const * area, expr),
> SAL_INFO_IF_S(bool condition, char const * area, expr),
> SAL_WARN_S(char const * area, expr), and
> SAL_WARN_IF_S(bool condition, char const * area, expr) produce an info resp.
> warning log entry with a message produced by piping items into a C++
> std::ostringstream (and are only available in C++). The given expr must be
> so that the full expression "stream << expr" is valid, where stream is a
> variable of type std::ostringstream.
>
> SAL_INFO_S("foo", "string " << s << " of length " << n)
>
> would be an example of such a call; if the given s is of type rtl::OUString,
>
> #include "rtl/oustringostreaminserter.hxx"
>
> would make sure that an appropriate operator << is available.
>
> For the _IF variants, log output is only generated if the given condition is
> true (in addition to the other conditions that have to be met).
>
> For all these macros, the given area argument must be non-null and must
> match the regular expression
>
> <area> ::= <subarea>("."<subarea>)*
>
> with
>
> <subarea> ::= [0-9a-z]+
>
> Whether these macros generate any log output is controlled in a two-stage
> process.
>
> First, at compile time the macro SAL_LOG_LEVEL controls whether these macros
> expand to actual code, or to no-ops. SAL_LOG_LEVEL must expand to an
> integral value 0, 1, or 2.
>
> If SAL_LOG_LEVEL is 0, neither the INFO nor the WARN macros produce code.
> If SAL_LOG_LEVEL is 1, only the WARN macros produce code. If SAL_LOG_LEVEL
> is 2, both the INFO and the WARN macros produce code.
>
> Second, at runtime the environment variable SAL_LOG further limits which
> macro calls actually generate log output. The environment varialbe SAL_LOG
> must either be unset or must match the regular expression
>
> <env> ::= <switch>*
>
> with
>
> <switch> ::= <sense><level><area>?
> <sense> ::= "+"|"-"
> <level> ::= "INFO"|"WARN"
>
> If the environment variable is unset, "+WARN" is used instead (which results
> in all warnings being output but no infos). If the given value does not
> match the regular expression, "+INFO+WARN" is used instead (which in turn
> results in everything being output).
>
> A given macro call's level (INFO or WARN) and area is matched against the
> given switches as follows: Only those switches for which the level matches
> the given level and for which the area is a prefix (including both empty and
> full prefixes) of the given area are considered. Log output is generated if
> and only if among the longest such switches (if any), there is at least one
> that has a sense of "+". (That is, if both +WARN.foo and -WARN.foo are
> present, +WARN.foo wins.)
>
> For example, if SAL_LOG is "+INFO-INFO.foo+INFO.foo.bar", then calls like
> SAL_INFO("foo.bar", ...), SAL_INFO("foo.bar.baz", ...), or
> SAL_INFO("other", ...) generate output, while calls like
> SAL_INFO("foo", ...) or SAL_INFO("foo.barzzz", ...) do not.
>
> The generated log output consists of the given level ("info" or "warn"), the
> given area, the process ID, the thread ID, the source file, and the source
> line number, each followed by a colon, followed by a space, the given
> message, and a newline. The given message should contain no vertical
> formatting characters and no null characters. The precise format of the log
> output is subject to change. The log output is printed to stderr.
Some further points:
- Whether or not both SAL_INFO and SAL_WARN produce actual code is
currently effectively controlled via OSL_DEBUG_LEVEL: With either
--enable-dbgutil or --enable-debug, the full log functionality is
enabled. Whether or not a more fine-grained control (producing code for
only SAL_WARN) is necessary needs to be seen. (If not, SAL_LOG_LEVEL
can be folded into OSL_DEBUG_LEVEL.) Having SAL_INFO compiled into each
--enable-dbgutil build makes it possible to selectively enable
interesting log information via SAL_LOG=+INFO.foo without having to
recompile code.
- Assigning area codes to the macro invocations, I have mostly stuck to
single-segment area codes that match the names of the code modules
("sal", "binfilter", "sfx2"). For the replacements within
osl/diangose.h and tools/debug.hxx (that will in turn be called from
various places, so it would not make much sense to attribute them to
"sal" or "tools", respectively) I used "legacy.osl" and "legacy.tools",
respectively. For a few cases where old code was OSL_DEBUG_LEVEL>=2
conditional, I used two-segment area codes "canvas.level2", "jfw.level2"
(and "jfw.level1"), "sw.level2".
- Replacing all existing calls to deprecated osl/diagnose.h and
tools/debug.hxx macros will need to happen over time (an easy hack).
- There is further functionality in tools/debug.hxx (DBG_MEMTEST,
DBG_CTOR, etc.) that has not yet been addressed.
- The recently introduced OSL_FORMAT is superseded by SAL_STREAM (which
uses C++-stream-style composition instead of C-printf-format-style).
OSL_FORMAT and all its uses have been removed.
- The implementation is somewhat careful to produce as little code as
possible at the call-site of the new macros, and to keep the code path
for suppressed logs as short as possible. However, the C++-stream-style
macros will potentially always be more expensive than the
C-printf-format-style ones. But the former are more useful, so I would
encourage you to nevertheless use them where appropriate.
- unotest/oustringostreaminserter.hxx is moved to
rtl/oustringostreaminserter.hxx (and duplications of it below unotest
are removed); it is often needed in combination with sal/log.h.
- In a few places, the patches replace occurrences of tools String with
rtl::OUString, so that streaming works. Otherwise, it is necessary to
wrap the tools String in an OUString ctor.
- sal/log.h contains a TODO to enable a GCC __attribute__((format)) on
the C funtion underneath the C-printf-format-style macros. Enabling it
would produce tons of -Werror=format from old OSL_TRACE uses. These
need to be cleaned up first (an easy hack).
One open question is which set of macros (the C-printf-format-style ones
or the C++-stream-style ones) to give the "natural" names without
additional suffix. For now, I reserve the natural names for the C-style
ones, as they are more universal (not only available in C++) and produce
smaller call-side code, especially for the simple, common case where the
message consists of just a string literal, as in SAL_INFO("foo",
"message"). However, reserving the natural names for the C++-style
macros (and giving the other ones an _F suffix, say) would be justified
by their better usability (and thus more frequent usage) and the fact
that they avoid the pitfall of having to escape any literal "%" within
the format string of a C-style macro usage---although enabling GCC's
__attribute__((format)) check would also help detect misuses like
SAL_INFO("foo", "100%").
If there is no objection, I will push the patches to master next week.
Due to removing OSL_FORMAT and osl_detail_formatString again, I would
prefer to get this still into LO 3.5.
Stephan
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: log.patch
URL: <http://lists.freedesktop.org/archives/libreoffice/attachments/20111118/2f0902d0/attachment-0001.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: log-binfilter.patch
URL: <http://lists.freedesktop.org/archives/libreoffice/attachments/20111118/2f0902d0/attachment-0001.asc>
More information about the LibreOffice
mailing list