why a master-tested branch is important and incentives for CI over no-CI (was: test infrastructure ideas appreciated ...)
Bjoern Michaelsen
bjoern.michaelsen at canonical.com
Wed Jun 10 13:40:29 PDT 2015
Hi,
On Wed, Jun 10, 2015 at 02:22:53PM -0500, Norbert Thiebaud wrote:
> We can get all that merely with git-notes I think.
> iow instead of a separate brnch, just annotate or even maiybe maintain
> a tag on master to indicate the last 'green master' in the sens you gave.
Yes, and I actually consider having tinderboxes git-note master commits one
sensible way of implementing this. Once that is done, a bot that looks at all
the git-notes and fast-forwards a 'master-tested' branch to the last commit
known good on all platforms, with whatever level of testing we want, should be
a trivial cronjob.
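
A minimal sketch of such a bot, just to illustrate how little is needed. The
notes ref name ('tinderbox'), the note format (one green platform name per
line) and the platform list are assumptions made up for this example, not
anything the tinderboxes do today:

```python
import subprocess

# Hypothetical names for illustration only; not taken from the real setup.
REQUIRED_PLATFORMS = {"linux", "windows", "macos"}
NOTES_REF = "tinderbox"


def green_platforms(commit):
    """Return the platforms noted green for a commit.

    Assumes each tinderbox appends its platform name, one per line,
    to the commit's note when the build is green.
    """
    result = subprocess.run(
        ["git", "notes", "--ref=" + NOTES_REF, "show", commit],
        capture_output=True, text=True,
    )
    if result.returncode != 0:  # no note on this commit yet
        return set()
    return {line.strip() for line in result.stdout.splitlines() if line.strip()}


def last_known_good(commits_newest_first, lookup=green_platforms,
                    required=REQUIRED_PLATFORMS):
    """Return the newest commit that is green on every required platform."""
    for commit in commits_newest_first:
        if required <= lookup(commit):
            return commit
    return None


def forward_master_tested():
    """Point 'master-tested' at the last known-good commit on master."""
    commits = subprocess.run(
        ["git", "rev-list", "master"],
        capture_output=True, text=True, check=True,
    ).stdout.split()
    good = last_known_good(commits)
    if good is not None:
        subprocess.run(
            ["git", "update-ref", "refs/heads/master-tested", good],
            check=True,
        )
    return good
```

The selection logic lives in last_known_good() so it can be checked without a
repository; forward_master_tested() is the part a cronjob would call.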
Of course, you can collect the data of last-known-good manually -- but pulling
from a branch is one of the most basic and simple operations of git. If we want
to enable and encourage users to use this, we should make it as accessible as
possible.
> Today the jenkins tinderbox operate like their ancestor: they jump
> around moving forward.. but not every commit get built.
> and since they are not all in sync it is hard to garantee that you
> will find a given commit that has been validated for all conf.
> _but_ with more hardware comming online, I want to move to a more
> 'bibisect build model' where _every commit get built.
I'm quite concerned that this actually encourages further unhealthy behaviour:
It might encourage users to 'just push to master, if it breaks I get a bibisect
for free'. In other words it encourages sloppiness with direct untested pushes
to master.
> Linux box will have to ramp up too.. but that is usually not that much
> a problem.. cloud based stuff are fairly competitive for that need.
> So we can be more reactive with resource capacity for linux.
If we are aiming for having a commit tested on all platforms, it might make
sense to have a tinderbox jump to the latest commit of master only if there is
no commit that is already tested on other platforms but not yet on its own
platform. E.g. if a non-Windows tinderbox sees:
master commit: untested on all platforms
master^ commit: untested on all platforms
master^^ commit: untested on all platforms
master^^^ commit: green on windows, untested elsewhere
it would test-build master^^^ to help establish the newest commit that builds
everywhere.
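
As a rough sketch of that selection rule: the data layout (a dict mapping
commit to per-platform status) and the function name are hypothetical, and I
assume here that only a 'green' result on another platform makes a commit
worth catching up on:

```python
def pick_commit_to_build(commits_newest_first, results, my_platform):
    """Choose which commit a tinderbox should build next.

    results maps commit -> {platform: status}; a missing entry means
    the commit is untested on that platform.
    """
    if not commits_newest_first:
        return None
    for commit in commits_newest_first:
        tested = results.get(commit, {})
        if my_platform in tested:
            continue  # already built here, nothing to gain
        green_elsewhere = any(
            status == "green"
            for platform, status in tested.items()
            if platform != my_platform
        )
        if green_elsewhere:
            return commit  # catch up on a commit others already validated
    # Nothing to catch up on: jump to the latest commit of master.
    return commits_newest_first[0]
```

With the history from the example above, a Linux tinderbox calling
pick_commit_to_build(["master", "master^", "master^^", "master^^^"],
{"master^^^": {"windows": "green"}}, "linux") would pick master^^^ rather
than the head of master.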
The important thing about the master-tested proposal is the following:
If you base your change on the head of the master-tested branch, you should
always be able to get reliable and pain-free per-commit/per-branch premerge testing
of your work. If you use per-commit/per-branch premerge testing of your work,
you should ~never be responsible for breaking CI for everyone[1]. You are in a
safe sandbox, you are never annoyed by others breaking master and you never
have to race to fix stuff because your change broke master for everyone.
If you are a wild buckaroo who loves the excitement, you can still pull from master
and push to master without any CI. However, when you break master doing that,
_you_ are responsible for master-tested being stuck, _you_ will be in firefighting
mode and _you_ are responsible for any mess that follows from that, e.g. by
follow-up cherry-picks. You will be on your own in this and those who prefer to
have their stuff tested (who are using master-tested) are shielded from the
damage you have done. The first to break master is responsible for all
follow-ups until it is fixed again on all platforms (and thus for CI).
In this thread, we are discussing getting more tests, while we in general fail
at a much more fundamental level: namely running the tests we already have
regularly. The above two models allow you to pick your poison, but unlike now,
it shields those who run proper tests from the damage done by those who don't. It thus
creates a balance: if those working directly on master get too reckless, more
people will switch to just base on master-tested and gerrit and leave the
madness to those few who prefer it. As such, this balance should create a virtuous
circle improving the quality for those working on master directly and for those
working on master-tested. That is a stark contrast to the vicious circle we
experience right now[2].
> All that being said, none of that matter if the culture does not
> follow. no amount of CI can make people care.. what set the tone is
> the core developer group, the rest of us looks around how it is done
> and emulate the behavior.
As discussed above, I think there are some technical incentives to nudge the
culture in the right direction. Having an easy-to-pull and CI-verified-only
pull branch called 'master-tested' should help set these.
Best,
Bjoern
[1] Modulo a rare cherry-pick to master that applied cleanly, but broke in a
subtle way without causing a merge conflict.
[2] Which is that 20-50% of the time you put in the extra effort of running
premerge CI tests, you are punished and frustrated by them failing because
of preexisting problems on master. There thus is ~no incentive to use
premerge CI.