[Intel-gfx] [QUERY] How many CI mails is too many?
Chris Wilson
chris at chris-wilson.co.uk
Wed Nov 29 09:48:09 UTC 2017
Quoting Chris Wilson (2017-11-28 10:16:17)
> Quoting Daniel Vetter (2017-11-28 10:08:56)
> > On Tue, Nov 28, 2017 at 11:06 AM, Chris Wilson <chris at chris-wilson.co.uk> wrote:
> > > Quoting Joonas Lahtinen (2017-11-28 08:15:13)
> > >> On Mon, 2017-11-27 at 16:54 +0200, Arkadiusz Hiler wrote:
> > >> > Hey all,
> > >> >
> > >> > For some time already CI sends out 1-2 mails per series per (re)run, i.e. BAT
> > >> > results and "full IGT" results (if BAT has not failed).
> > >> >
> > >> > Recently we have added 32bit build check, and if that fails it sends out
> > >> > additional mail In-Reply-To the series.
> > >> >
> > >> > I am working on adding some static checks to the CI (sparse and checkpatch at the
> > >> > moment, more may come in the future), which may generate even more commotion on
> > >> > the mailing list.
> > >> >
> > >> > How much CI noise is too much, and how would you like to have the results
> > >> > grouped?
> > >> >
> > >> > Couple of options to start the discussion:
> > >> >
> > >> > 1. Group all static checks (and the 32bit build?) into one mail:
> > >> > - just one additional mail,
> > >> > - may be hard to read in case of catastrophic failure,
> > >> > - we can send it only when something actually fails.
> > >> >
> > >> > 2. Send out the results as a part of BAT results:
> > >> > - even less noise than (1),
> > >> > - BAT results already feel cluttered, this may decrease readability.
> > >> >
> > >> > 3. Have each check as a separate mail, but send it only if the check fails:
> > >> > - noisy: may result in many mails, depending how many checks fail,
> > >> > - easier to read and easier to follow on patchwork.
> > >>
> > >> The best user experience I could think of:
> > >>
> > >> 1. If all CI checks succeed, delay and only send one mail with all the
> > >> results. This would indicate it's good to merge, go do it.
> > >> 2. When a CI check fails, immediately send that out so the developer
> > >> gets to work on the fix.
> > >>
> > >> The above requires that all the checks complete rather quickly and that
> > >> trust in the system is gained, so that the absence of e-mail always means
> > >> the series is doing well, not that the system is clogged in some way :)
> > >
> > > Or just 2 mails. The first being the compilation report; saying we
> > > have received your patch and it compiles fine, it will be queued to the
> > > farm currently in slot N (or it doesn't even compile!). The second being
> > > the success or failure of the CI run.
> > >
> > > From the user pov, we can't do anything until the CI report so
> > > intermediate emails saying congrats are just fluff. Useful simply to
> > > know the patch hasn't fallen out of the system, but not supplying any
> > > actionable information.
> >
> > BAT was meant to be that mail, with the added benefit that if a series
> > fails the basic sanity check you can ignore it for review and
> > everything. Still not quite there yet (and the recently undone change
> > of ratelimiting didn't help).
>
> The compile check should take 30s(?) on the build host with all the
> distcc/ccache. It's going to be rare to develop a long queue and
> significant latency; whereas one developer can flood the system with 5
> different series they happen to have queued, repeat for everyone getting
> to work in the morning.
One thing I forgot to mention is that trybot has a large latency for
that single BAT success email. Having a quick "patch received; compiles"
response from the system would give a nice bit of reassurance.
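
To make the two-mail scheme concrete, here is a rough Python sketch of the
decision logic. Everything in it is hypothetical: SeriesState, mails_to_send
and the message strings are made up for illustration and are not part of the
actual CI. It assumes one quick build mail and one CI verdict mail per
series, and nothing else.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class SeriesState:
    """Hypothetical snapshot of one patch series moving through CI."""
    compiled: Optional[bool] = None   # None: build has not finished yet
    ci_passed: Optional[bool] = None  # None: CI run has not finished yet
    queue_slot: Optional[int] = None  # position in the farm queue, if known


def mails_to_send(state: SeriesState, already_sent: Set[str]) -> List[Tuple[str, str]]:
    """Return the (kind, text) mails due now; at most two kinds per series."""
    due: List[Tuple[str, str]] = []

    # Mail 1: quick acknowledgement as soon as the build result is known.
    if state.compiled is not None and "build" not in already_sent:
        if state.compiled:
            text = "patch received; compiles"
            if state.queue_slot is not None:
                text += f"; queued in slot {state.queue_slot}"
        else:
            text = "patch received; build FAILED"
        due.append(("build", text))

    # Mail 2: the CI verdict, only for a series that actually built.
    if state.compiled and state.ci_passed is not None and "ci" not in already_sent:
        verdict = "success" if state.ci_passed else "failure"
        due.append(("ci", f"CI run: {verdict}"))

    return due
```

The caller would record each kind it has sent in already_sent, so a re-check
of the same series never produces a duplicate mail.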
-Chris