tb3 -- tinderbox coordinator

Wed Jun 5 08:56:23 PDT 2013

Hi all,

I just pushed:

https://gerrit.libreoffice.org/#/c/4166/

with the ominous description "robust asyncronous tinderbox coodinator". It's
still WiP, but this is what it is intended to do in the end:
- Coordinate one or more tinderboxes running a build scenario or test suite
- Distribute the work to all tinderboxes
- Without blocking, if a tinderbox does not come back with a result
- In fact, there is no discrete 'tinderbox timeout' -- if a build is scheduled
  for one scenario, the builder gives an estimate of when it will come back.
  This pushes the scheduling away from that specific area. This 'pushing away'
  fades with time, so the longer the tinderbox takes, the less the coordinator
  will divert others from that commit area.
- tb3 does building of new commits as well as bisection, using scoring. That
  is: If the branch is broken on a build/test scenario, it will try to bisect
  to the commit that introduced the trouble, but it will also keep an eye on new
  commits. As a rule of thumb: If there are more new commits on the head than
  commits remaining to be bisected, it will schedule a head commit.
- The scoring of the bisection or head can be adjusted by the builder

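To make the scoring idea above a bit more concrete, here is a minimal sketch in
Python. It is not the code from the gerrit change: the function names, the
weight parameter and the exponential fade are all assumptions picked for
illustration. Only the head-versus-bisect rule of thumb is taken from the
description above.

```python
import math


def diversion_penalty(elapsed_s, estimated_s, weight=1.0):
    """Penalty a running build puts on its commit area (hypothetical).

    Starts at `weight` when the build is claimed and fades as time passes.
    The exponential shape is an assumption; the text only says it fades.
    """
    if estimated_s <= 0:
        return 0.0
    return weight * math.exp(-elapsed_s / estimated_s)


def pick_target(new_head_commits, remaining_bisect_commits):
    """Rule of thumb from the text: build head if more new commits piled up
    on the head than there are commits left to bisect."""
    if new_head_commits == 0 and remaining_bisect_commits == 0:
        return None  # nothing to do
    if new_head_commits > remaining_bisect_commits:
        return "head"
    return "bisect"
```

With this shape, a builder that is long overdue contributes almost no penalty
any more, so other builders drift back into that commit area on their own and
no hard timeout is needed.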
I want to put this thing on a Jenkins instance, where the builder will be able
to interact easily with it via the REST API:

 https://wiki.jenkins-ci.org/display/JENKINS/Remote+access+API

This should allow the build clients to be dumb, flexible in configuration and
coordinated, and also allow the current state of a build or test scenario to
be displayed there.

A usual scenario would look like this:
1/ a tinderbox picks up suggestions for what to build with tb3-show-proposals
2/ the tinderbox notifies tb3 with tb3-set-commit-running to keep other
   builders out of that area
3/ the tinderbox reports its experience with tb3-set-commit-finished (success
   or failure)

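The three steps above can be sketched as one round of a builder loop. This is
a toy in-memory stand-in, not the real tools: the class, the method signatures
and the way proposals are ordered are assumptions, only the three step names
mirror the tb3-* commands.

```python
class ToyCoordinator:
    """Hypothetical in-memory stand-in for tb3 (real state lives in git notes)."""

    def __init__(self, commits):
        self.state = {commit: "UNKNOWN" for commit in commits}

    def show_proposals(self):
        # Real tb3 would rank by score; here we just list untested commits.
        return [c for c, s in self.state.items() if s == "UNKNOWN"]

    def set_commit_running(self, commit, builder):
        self.state[commit] = "RUNNING"

    def set_commit_finished(self, commit, builder, ok):
        self.state[commit] = "GOOD" if ok else "BAD"


def builder_round(coordinator, builder, build):
    """One round: pick a proposal, claim it, build it, report the result."""
    proposals = coordinator.show_proposals()
    if not proposals:
        return None
    commit = proposals[0]
    coordinator.set_commit_running(commit, builder)        # step 2
    ok = build(commit)                                     # the actual work
    coordinator.set_commit_finished(commit, builder, ok)   # step 3
    return commit
```

The builder only needs to know how to build one commit and report back; all
decisions about what to build next stay with the coordinator.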
Currently the state of the repo is shown old-school, as ASCII only. Here is an example:
d2815bc4ea20346f882203112e3edba1cb65089e started on 2013-06-05 17:11:53.669911 with builder testbuilder and finished on 2013-06-05 17:11:54.327493 -- artifacts at foo, state: GOOD (took 0:00:00.657582)
6b7f0cbe2e906bdea6ed03ca11fe5bfb85689598 started on None with builder None and finished on None -- artifacts at None, state: POSSIBLY_FIXING
1bb49ab50b4a6163f24dd12f419fd7e12e96b811 started on 2013-06-05 17:11:53.658136 with builder testbuilder and finished on 2013-06-05 17:11:54.003934 -- artifacts at foo, state: BREAKING (took 0:00:00.345798)
57c33ea1aabd80923f3376147209b00b6fd22117 started on None with builder None and finished on None -- artifacts at None, state: POSSIBLY_BREAKING
d03a3c72f045bc33f0b567aa046f133d49c8e7ba started on 2013-06-05 17:11:53.646604 with builder testbuilder and finished on 2013-06-05 17:11:53.764743 -- artifacts at foo, state: GOOD (took 0:00:00.118139)
270875ec824dbaa547e7c564dc50e60134206bf6 started on None with builder None and finished on None -- artifacts at None, state: ASSUMED_GOOD
3e62f0111dda629d227c840e5d961ad25b62006f started on None with builder None and finished on None -- artifacts at None, state: ASSUMED_GOOD
650e7290474ab86c36afba98d2e9cfac72d226f6 started on None with builder None and finished on None -- artifacts at None, state: ASSUMED_GOOD
82f3b5fe1ce00e708546f37cd52aa7687521f392 started on None with builder None and finished on None -- artifacts at None, state: ASSUMED_GOOD
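If you want to post-process that output before the JSON export is wired up,
the lines are regular enough to parse. A small sketch follows; the regular
expression is derived only from the example lines above and assumes that
builder names and artifact locations contain no whitespace. The function name
is made up.

```python
import re

# Pattern reverse-engineered from the example output; not an official format.
LINE_RE = re.compile(
    r"^(?P<commit>[0-9a-f]{40}) started on (?P<started>.+?)"
    r" with builder (?P<builder>\S+)"
    r" and finished on (?P<finished>.+?)"
    r" -- artifacts at (?P<artifacts>\S+),"
    r" state: (?P<state>[A-Z_]+)"
    r"(?: \(took (?P<took>[^)]+)\))?$"
)


def parse_state_line(line):
    """Split one state line into a dict; the text 'None' becomes Python None."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognized state line: %r" % line)
    fields = match.groupdict()
    for key, value in fields.items():
        if value == "None":
            fields[key] = None
    return fields
```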

The states of a commit are:
GOOD              a builder was happy with it
BAD               a builder was unhappy with it
ASSUMED_GOOD      not tested, but the last build before and the first build after are good
ASSUMED_BAD       not tested, but the last build before and the first build after are bad
POSSIBLY_BREAKING not tested, but the last build before was ok and first build after was not
POSSIBLY_FIXING   not tested, but the last build before was bad and first build after was not
BREAKING          a builder was unhappy with it, but the direct parent was ok
RUNNING           a builder claims to be testing this one right now
UNKNOWN           duh
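
How the untested states follow from their tested neighbours can be sketched
like this. It assumes a linear history given oldest first and a made-up
function name; BREAKING and RUNNING are left out on purpose, so tested commits
simply come back as GOOD or BAD.

```python
def infer_states(results):
    """Derive a state name per commit.

    `results` is a list, oldest commit first, with True (tested good),
    False (tested bad) or None (not tested).
    """
    states = []
    for i, result in enumerate(results):
        if result is True:
            states.append("GOOD")
            continue
        if result is False:
            states.append("BAD")
            continue
        # Nearest tested neighbours before and after this commit.
        before = next((r for r in reversed(results[:i]) if r is not None), None)
        after = next((r for r in results[i + 1:] if r is not None), None)
        if before is None or after is None:
            states.append("UNKNOWN")
        elif before and after:
            states.append("ASSUMED_GOOD")
        elif not before and not after:
            states.append("ASSUMED_BAD")
        elif before and not after:
            states.append("POSSIBLY_BREAKING")
        else:
            states.append("POSSIBLY_FIXING")
    return states
```

For example, an untested commit sitting between a good and a bad build comes
out as POSSIBLY_BREAKING, which is exactly the range bisection would narrow
down next.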

Since all of this is exportable as JSON, some nicer visualization should be
possible later. Note that the storage is completely in git itself (via git
notes), so it should also be possible to distribute that data rather easily if
needed.

With the scoring logic, we can extend this to add other things to build than
just churning away on the head of a branch -- namely verification on gerrit.
That's not implemented yet, but the logic is pretty obvious: give a score
depending on gerrit build-queue length, and the tinderboxes will care about
gerrit when needed, just as well as about master.

Finally note that this should not only be useful for our current tinderbox
needs. Especially with the bisection, this is helpful in a lot of scenarios
that were too slow for tinderboxing up to now, e.g. ARM builds, the
bugzilla-document tester, subsequentchecks, any type of unit test. And we can
reduce a lot of tinderbox spammage, e.g. by starting to spam only once the
breakage has been bisected below a certain threshold, preventing the "one of
you (200) people broke something" mails.

Eh, this text got a lot longer than originally intended. Anyway, please feel
free to play with it. It's still notoriously underdocumented, but has almost as
many tests as production code, so maybe those can give you an idea (you can run
those tests with a simple call to "make").

Best,

Bjoern