[PATCH weston v2] Refactored simple unit/integration test framework and corresponding test program.

Jon A. Cruz jonc@osg.samsung.com
Fri May 29 19:17:39 PDT 2015


On 05/06/2015 03:49 AM, Daniel Stone wrote:
> Hi,
> 
> In general though, as a rule - just look at the existing codebase and
> copy the style of what's already there.

Yup... tried that. A bit sparse on the info.

I might try to collect up some in a wiki page later on.


>> @@ -0,0 +1,60 @@
>> +digraph toolarch_new {
>> +    rankdir = "TB";
>> +
>> +    node[shape=record]
>> +
>> +    subgraph cluster_0 {
>> +        label = "./tests";
>> +
>> +        keyboard_test_c [label = "{keyboard-test.c|tests\l}"]
>> +        text_test_c [label = "{text-test.c|tests\l}"]
>> +        vertex_clip_test_c [label = "{vertex-clip-test.c|tests\l}"]
> 
> Hmm, does this need to be updated for every single test then? Can this
> be autogenerated?

This was mainly to give a 'sketch' of the general process. I wasn't
expecting it to be a comprehensive list of all tests, though I guess it
could be if the end result warranted that. Creating some
auto-generation might not be *too* hard.
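Something like this toy generator, run over the files under tests/,
could emit those nodes. Purely a sketch; the program name and
invocation are illustrative only:

/* gen-gv.c -- toy sketch; a real version would hang off the build
 * system. Hypothetical usage: ./gen-gv keyboard-test.c text-test.c
 */
#include <ctype.h>
#include <stdio.h>

int
main(int argc, char *argv[])
{
	int i;

	printf("digraph toolarch_new {\n");
	for (i = 1; i < argc; i++) {
		char node[128];
		char *p;

		snprintf(node, sizeof(node), "%s", argv[i]);
		/* e.g. "keyboard-test.c" -> node id "keyboard_test_c" */
		for (p = node; *p; p++)
			if (!isalnum((unsigned char)*p))
				*p = '_';

		printf("    %s [label = \"{%s|tests\\l}\"]\n",
		       node, argv[i]);
	}
	printf("}\n");
	return 0;
}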


>> --- /dev/null
>> +++ b/tools/tools_arch_old.gv
> 
> What's the difference between old and new?

In general the 'old' reflects the structure of things before any
changes, and the 'new' has things split into different layers/folders.

[SNIP]

> Yikes, so very many magic numbers. It would be great if these could be
> calculated from constants (e.g. ((1.0 * alpha_size) << alpha_shift) |
> ((0.7 * red_size) << red_shift), etc). This makes it easier to test
> 16bpp, 10bpc/32bpp, etc. If you had pixel as a structure, you could
> then have helpers to determine the correct pixel value at fill time,
> e.g.:
> struct pixel grey70 = { .a = 1.0, .r = 0.7, .g = 0.7, .b = 0.7 };
> 
> and then have the various fill*() functions normalise to an actual
> pixel value for your desired format.

OK. Got those in. However, I did not actually label it "SMPTE RP
219:2002", although those were the values I was following; I just
wasn't sure how picky it might get to technically call it that.
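A sketch of the approach, assuming 8bpc/32bpp ARGB (the struct follows
your suggestion; the helper name is illustrative):

#include <stdint.h>

/* Normalized color, each channel in [0.0, 1.0]. */
struct pixel {
	double a, r, g, b;
};

/* Pack a normalized pixel into ARGB8888. A format-aware version
 * would take per-channel size/shift values so that 16bpp,
 * 10bpc/32bpp, etc. fall out of the same code. */
static uint32_t
pixel_to_argb8888(const struct pixel *p)
{
	return ((uint32_t)(p->a * 255.0) << 24) |
	       ((uint32_t)(p->r * 255.0) << 16) |
	       ((uint32_t)(p->g * 255.0) << 8) |
	       ((uint32_t)(p->b * 255.0));
}

So the fill*() functions can take a struct pixel like the grey70 above
and normalize it at fill time.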

>> +               double barSpan = (width * 0.75);
>> +               size_t barCount = sizeof(bars) / sizeof(bars[0]);
>> +               size_t i = 0;
> 
> No variable declarations mid-function please. The compiler should've
> warned about this ... and also, camelCasing.

And here is the most interesting point. Out of habit, I put
declarations as close to first use as possible, since most
code-checking and metrics tools flag declaration-to-use distance as a
negative. However... I did not get the expected compiler warnings.

I then thought they might show up if I explicitly set -std=c90, but
that didn't work either. All sorts of other things in the codebase
were flagged, but not any mid-function declarations.

So... I tried to spot all my uses, but may have missed some. Just let me
know if I did.
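For the archives, here is what finally explained it: plain -std=c90
accepts mixed declarations silently as a GNU extension; GCC only
diagnoses them with -Wdeclaration-after-statement (or with -pedantic
added to -std=c90). A minimal reproducer:

/* decl-after-stmt.c
 * gcc -std=c90 -c decl-after-stmt.c                      -> silent
 * gcc -std=c90 -Wdeclaration-after-statement -c ...      -> warns
 */
int
mixed(int width)
{
	double bar_span = width * 0.75;	/* fine: start of block */

	bar_span += 1.0;		/* a statement */

	int i = 0;			/* declaration after statement */

	return i + (int)bar_span;
}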

[SNIP]

>> +#ifdef __cplusplus
>> +extern "C" {
>> +#endif
> 
> Are we expecting C++ code to call this?

Yes. Aside from some of the existing Wayland/weston headers already
using that, I'd actually been running some of these under Google Test
and Boost Test to get them exercised a bit.
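For reference, the complete guard pattern, with a placeholder
declaration standing in for the real ones:

#ifdef __cplusplus
extern "C" {
#endif

int zuc_run_tests(void);	/* placeholder declaration */

#ifdef __cplusplus
} /* extern "C" */
#endif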

>> +       struct wl_display *display = wl_display_connect(NULL);
>> +
>> +       ZUC_EXPECT_TRUE(display != NULL);
> 
> Should this be asserting that $WAYLAND_SOCKET or $WAYLAND_DISPLAY is
> set or similar?

Especially in light of some other discussions, this might be something
to look into. However, at the moment this was just intended to be the
minimal low-level requirement.

Currently my default environment has neither of those set. This does
look like something to follow up on, but just as an independent issue.
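(For what it's worth, wl_display_connect(NULL) falls back to the
socket name "wayland-0" when $WAYLAND_DISPLAY is unset, which is
presumably why this still passes here.) A follow-up could choose to
skip rather than fail when nothing is reachable; a sketch, with an
illustrative helper name:

#include <stdbool.h>
#include <stdlib.h>

static bool
wayland_env_present(void)
{
	return getenv("WAYLAND_DISPLAY") != NULL ||
	       getenv("WAYLAND_SOCKET") != NULL;
}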

[SNIP]

>> diff --git a/tools/waycheck/waycheck.dox b/tools/waycheck/waycheck.dox
...
>> +/**
>> +@page waycheck
>> +
>> +A simple integration/acceptance tool to exercise Wayland compositors.
>> +
>> +Tests use @ref zunitc for their infrastructure, and most include use of a common wtst_ctx test fixture.
>> +*/
> 
> Could this not be inline as part of a header?

Standard Doxygen practice is to place content that does not pertain to
items already declared in headers into a stand-alone .dox file.
Overviews and design notes such as this are prime examples.


[SNIP]

> Again with machine-parseable output. Also, I feel this would be much
> better off as a series of macros which test the log level themselves,
> i.e. rather than if (logle > somemagicnumber) { printf() } at every
> caller, have a series of log_crit(), log_err(), log_warn(), ...
> macros. We already have precedent for this in weston_log, which also
> allows you to log to files instead.

Yes. Logging is something else I've worked on in the past, and fitting
in with existing systems is a very good idea. That does seem like a
separate point to me, though; a follow-up set of patches can remove
and unify any logging in a more comprehensive manner.
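For that follow-up, I'd expect something in this direction (a minimal
sketch; the names and levels are assumptions, and hooking into
weston_log would replace the fprintf):

#include <stdio.h>

enum zuc_log_level {
	ZUC_LOG_CRIT,
	ZUC_LOG_ERR,
	ZUC_LOG_WARN,
	ZUC_LOG_INFO
};

static enum zuc_log_level zuc_cur_log_level = ZUC_LOG_WARN;

#define zuc_log(lvl, ...) \
	do { \
		if ((lvl) <= zuc_cur_log_level) \
			fprintf(stderr, __VA_ARGS__); \
	} while (0)

#define log_crit(...) zuc_log(ZUC_LOG_CRIT, __VA_ARGS__)
#define log_err(...)  zuc_log(ZUC_LOG_ERR,  __VA_ARGS__)
#define log_warn(...) zuc_log(ZUC_LOG_WARN, __VA_ARGS__)

Callers then just write log_warn("...") without testing the level
themselves.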


> This should really be an assertion, as there can only be one
> wl_compositor object; if we get two, we've screwed up by having
> multiple registries on the same context, and anything which was
> previously using ctx->compositor is now deep into undefined behaviour.

Ooooh, very nice additional information; this is now captured in
assertions.
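Roughly this shape, though the wtst_ctx fields here are trimmed-down
assumptions rather than the actual patch code:

#include <assert.h>
#include <string.h>
#include <wayland-client.h>

struct wtst_ctx {
	struct wl_compositor *compositor;
};

static void
handle_global(void *data, struct wl_registry *registry, uint32_t id,
	      const char *interface, uint32_t version)
{
	struct wtst_ctx *ctx = data;

	if (strcmp(interface, "wl_compositor") == 0) {
		/* Only one wl_compositor global can exist; seeing two
		 * means we have multiple registries on this context. */
		assert(ctx->compositor == NULL);
		ctx->compositor = wl_registry_bind(registry, id,
						   &wl_compositor_interface,
						   1);
	}
}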

> Global advertisements don't change. They're added or removed; if we
> get multiple advertisements, it's because there are multiple - and
> separate - live objects.

More good information; also captured now.


>> +To get started, one or more tests should be defined via ZUC_TEST(), which sets up
>> +automatic test registration via gcc extensions. To actually execute tests,
>> +ZUC_RUN_TESTS() should be called.
> 
> I like this bit!

Thanks. Once I gave up and conceded to some gcc specifics, it turned
out to be quite easy.
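For anyone curious, the general trick looks something like this; it is
illustrative rather than the exact zunitc macro, and leans on GCC's
constructor attribute:

typedef void (*zuc_fn)(void);
void zuc_register_test(const char *suite, const char *name, zuc_fn fn);

#define ZUC_TEST(suite, name)                                        \
	static void zuc_##suite##_##name(void);                      \
	__attribute__((constructor)) static void                     \
	zuc_register_##suite##_##name(void)                          \
	{                                                            \
		zuc_register_test(#suite, #name,                     \
				  zuc_##suite##_##name);             \
	}                                                            \
	static void zuc_##suite##_##name(void)

Each ZUC_TEST() body then registers itself before main() runs, and
ZUC_RUN_TESTS() just walks the registered list.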

>> +Tests can use several ZUC_EXPECT_* or ZUC_ASSERT_* checks to validate
>> +conditions. The ZUC_EXPECT_* ones upon failure will mark the current test
>> +as failing, but allow execution to continue. On the other hand, the
>> +ZUC_ASSERT_* tests will mark the current test as failed and then immediately
>> +terminate the test.
> 
> What's the usecase for marked-fail but allowing execution to continue?

Very common. In fact, Boost Test has a third level just for warning on
items; I was going with the camp that says if it doesn't make the test
fail, then we shouldn't bother warning.

One of the use cases is cleanup of C-style resources. With C++, dtors
will do the cleanup, but in C we need to be explicit so that tools
such as valgrind won't swamp us with noisy output.
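A small example of the pattern, reusing the macros from the patch:

ZUC_TEST(DisplayTest, ConnectAndCleanup)
{
	struct wl_display *display = wl_display_connect(NULL);

	/* EXPECT: marks the test failed but keeps executing... */
	ZUC_EXPECT_TRUE(display != NULL);

	if (display) {
		/* ...so the explicit C cleanup still runs and
		 * valgrind stays quiet. */
		wl_display_disconnect(display);
	}
}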


> Is REALTIME guaranteed to be monotonic?

No, but it can be a close enough fallback. The main differences would
show up during NTP adjustments or manual time changes.
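The fallback looks roughly like this (the helper name is
illustrative):

#include <time.h>

static void
zuc_now(struct timespec *ts)
{
#ifdef CLOCK_MONOTONIC
	if (clock_gettime(CLOCK_MONOTONIC, ts) == 0)
		return;
#endif
	/* REALTIME can jump under NTP or manual clock changes, but
	 * is close enough as a fallback. */
	clock_gettime(CLOCK_REALTIME, ts);
}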


>> +// Data structure simple, but should suffice for smaller numbers of tests.
>> +static int caseListCount = 0;
>> +static ZucCase *caseListHead = NULL;
> 
> wl_list.

It is used elsewhere, but this section sits at a layer below the one
where wl_list is defined. (Personally, I'd like to see that refactored
so that, among other things, it would be easier to get some unit tests
on it.)
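For context, it is just a hand-rolled singly-linked list along these
lines (the ZucCase fields shown are assumptions):

struct ZucCase {
	struct ZucCase *next;
	const char *name;
};

static int caseListCount = 0;
static struct ZucCase *caseListHead = NULL;

static void
case_list_add(struct ZucCase *test_case)
{
	test_case->next = caseListHead;
	caseListHead = test_case;
	caseListCount++;
}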


[SNIP]
>> +}
> 
> Can we not generate these functions through macros perhaps, to cut
> down on duplication a bit? Though I have to say I do really like that
> it prints out not only the condition (in C terms), but the actual live
> values. That's very useful.
> 

Done.
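Something along these lines is one way to do it (a sketch; the names
are illustrative, not the actual zunitc macros):

#include <stdbool.h>
#include <stdio.h>

#define ZUC_DEFINE_CMP(suffix, type, fmt, op)                        \
	static bool                                                  \
	zuc_check_##suffix(const char *lhs_str, const char *rhs_str, \
			   type lhs, type rhs)                       \
	{                                                            \
		if (lhs op rhs)                                      \
			return true;                                 \
		fprintf(stderr, "failed: %s " #op " %s "             \
			"(" fmt " vs " fmt ")\n",                    \
			lhs_str, rhs_str, lhs, rhs);                 \
		return false;                                        \
	}

ZUC_DEFINE_CMP(eq_i, int, "%d", ==)
ZUC_DEFINE_CMP(eq_p, void *, "%p", ==)

That keeps the failure output showing both the condition and the live
values.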


>> +ZUC_TEST(BaseTest, MathIsMoreHard)
>> +{
>> +#ifdef ENABLE_FAIL_TESTS
>> +       ZUC_ASSERT_EQ(5, 2 + 2);
>> +       ZUC_ASSERT_EQ(7, 9);
>> +#endif
>> +}
> 
> It's not immediately obvious to me where ENABLE_FAIL_TESTS gets
> defined, or what it does.

I'll add some more comments. If it is enabled, then tests will start
failing and you'll get a bad build, which is not a happy thing to have
happen by default.
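Something along these lines, perhaps:

/*
 * ENABLE_FAIL_TESTS turns on tests that are *supposed* to fail, to
 * exercise the framework's own failure-reporting paths. Leave it
 * undefined for normal builds; otherwise the build will (correctly)
 * report test failures.
 */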

> 
>> +int main( int argc, const char* argv[] )
> 
> No padded parentheses please.
> 

That was actually copied from some other weston code. Looks like I still
need to fix that.
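So it becomes:

int main(int argc, const char *argv[])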


> Like I said, this isn't a substantive review of the test suite itself,
> just a whole host of style issues that will become important as the
> code comes to actual review/submission; we keep a consistent style to
> make it easier on everyone. Most of these are cosmetic issues, but in
> general I do find the lifetime and defensive-program issues very
> concerning: I worry that in your quest to plough on and try to recover
> no matter what the circumstances, you are creating a whole world of
> undefined behaviour which will be rarely hit, but extremely confusing
> when it is.
> 
> Not only is being unpredictable about the worst thing we can do in a
> test suite, but the added complexity makes it extremely difficult to
> discern which behaviour is actually desired. How long do the Wayland
> objects live? Do they always have a tracking structure associated with
> them? Do they get destroyed with the tracking structure, or do they
> outlive them, or vice-versa? I'm not really sure from the first pass,
> and I find that pretty troubling.
> 
> I liked a distinction Havoc Pennington once drew between 'programming
> errors' and transient errors, in regards to when you should do
> checking. If you're passing in a NULL context or a NULL interface
> name, or the tracking structure has has disappeared, then that's a
> sign of something seriously troubling that you want to understand and
> fix immediately. Trying to paper over the issue, and changing the
> behaviour in the process, means we can only be _less_ certain that our
> tests are testing what they're supposed to. And that's not good. But
> things like allocation failure, etc, are expected to occur, and that's
> OK: we can return failure and punt those up the chain.


Yes, those are good considerations. However, what I've seen cause
developers even more headaches is when the test framework itself has
problems and they can't tell whether their own code is at fault. One
of the main cleanup passes I did was to help ensure that cases which
hit issues properly note errors and then gracefully back out. Silently
dropping them is normally the problem, so we want to be sure not to be
silent.

For example, if a child test crashes, we can note the fact that
something went wrong, but logging of critical items might have been
lost. On the other hand, if errors are logged and the test then exits
gracefully with the failure recorded, developers can normally fix
things more quickly.

-- 
Jon A. Cruz - Senior Open Source Developer
Samsung Open Source Group
jonc@osg.samsung.com

