git and Marge troubles this week

Connor Abbott cwabbott0 at gmail.com
Fri Jan 7 18:34:03 UTC 2022


On Fri, Jan 7, 2022 at 6:32 PM Emma Anholt <emma at anholt.net> wrote:
>
> On Fri, Jan 7, 2022 at 6:18 AM Connor Abbott <cwabbott0 at gmail.com> wrote:
> >
> > Unfortunately batch mode has only made it *worse* - I'm sure it's not
> > intentional, but it seems that it's still running the CI pipelines
> > individually after the batch pipeline passes and not merging them
> > right away, which completely defeats the point. See, for example,
> > !14213 which has gone through 8 cycles being batched with earlier MRs,
> > 5 of those passing only to have an earlier job in the batch spuriously
> > fail when actually merging and Marge seemingly giving up on merging it
> > (???). As I type it was "lucky" enough to be the first job in a batch
> > which passed and is currently running its pipeline and is blocked on
> > iris-whl-traces-performance (I have !14453 to disable that broken job,
> > but who knows with the Marge chaos when it's going to get merged...).
> >
> > Stepping back, I think it was a bad idea to push a "I think this might
> > help" type change like this without first carefully monitoring things
> > afterwards. An hour or so of babysitting Marge would've caught that
> > this wasn't working, and would've prevented many hours of backlog and
> > perception of general CI instability.
>
> I spent the day watching marge, like I do every day.  Looking at the
> logs, we got 0 MRs in during my work hours PST, out of about 14 or so
> marge assignments that day.  Leaving marge broken for the night would
> have been indistinguishable from the status quo, was my assessment.

Yikes, that's awful - and I know it's definitely not easy keeping
everything running!

But unfortunately it seems like the problems that day were transient,
and as I said earlier there were at least 6 MRs that succeeded that
would've been merged if they weren't batch MRs, so enabling batch mode
did wind up causing some damage compared to doing nothing. So doing it
the next day, when there was some possibility to follow up on any
problems, would've been better. Not to take away from what you guys
are doing, just a lesson for next time.

Connor

