[Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services
Nuritzi Sanchez
nsanchez at gitlab.com
Fri Feb 28 21:37:03 UTC 2020
Hi All,
I know there's been a lot of discussion already, but I wanted to respond to
Daniel's original post.
I joined GitLab earlier this month as their new Open Source Program Manager
[1] and wanted to introduce myself here since I’ll be involved from the
GitLab side as we work together to problem-solve the financial situation
here. My role at GitLab is to help make it easier for Open Source
organizations to migrate (by helping to smooth out some of the current pain
points), and to help advocate internally for changes to the product and our
workflows to make GitLab better for Open Source orgs. We want to make sure
that our Open Source community feels supported beyond just migration. As
such, I’ll be running the GitLab Open Source Program [2].
My background is that I’m the former President and Chairperson of the GNOME
Foundation, which is one of the earliest Free Software projects to migrate
to GitLab. GNOME initially faced some limitations with CI runner costs
too, but thanks to generous support from donors, it has not experienced
those issues recently. I know there's already a working relationship
between our communities, but it could be good to examine what GNOME and KDE
have done and see if there's anything we can apply here. We've reached out
to Daniel Stone, our main contact for the freedesktop.org migration, and he
has gotten us in touch with Daniel V. and the X.Org Foundation Board to
learn more about what's already been done and what we can do next.
Please bear with me as I continue to get ramped up in my new job, but I’d
like to offer as much support as possible with this issue. We’ll be
exploring ways for GitLab to help make sure there isn’t a gap in coverage
during the time that freedesktop looks for sponsors. I know that on
GitLab’s side, supporting our Open Source user community is a priority.
Best,
Nuritzi
[1] https://about.gitlab.com/company/team/#nuritzi
[2] https://about.gitlab.com/handbook/marketing/community-relations/opensource-program/
On Fri, Feb 28, 2020 at 1:22 PM Daniel Vetter <daniel.vetter at ffwll.ch>
wrote:
> On Fri, Feb 28, 2020 at 9:31 PM Dave Airlie <airlied at gmail.com> wrote:
> >
> > On Sat, 29 Feb 2020 at 05:34, Eric Anholt <eric at anholt.net> wrote:
> > >
> > > On Fri, Feb 28, 2020 at 12:48 AM Dave Airlie <airlied at gmail.com> wrote:
> > > >
> > > > On Fri, 28 Feb 2020 at 18:18, Daniel Stone <daniel at fooishbar.org> wrote:
> > > > >
> > > > > On Fri, 28 Feb 2020 at 03:38, Dave Airlie <airlied at gmail.com> wrote:
> > > > > > b) we probably need to take a large step back here.
> > > > > >
> > > > > > Look at this from a sponsor POV, why would I give X.org/fd.o
> > > > > > sponsorship money that they are just giving straight to google to pay
> > > > > > for hosting credits? Google are profiting in some minor way from these
> > > > > > hosting credits being bought by us, and I assume we aren't getting any
> > > > > > sort of discounts here. Having google sponsor the credits costs google
> > > > > > substantially less than having any other company give us money to do
> > > > > > it.
> > > > >
> > > > > The last I looked, Google GCP / Amazon AWS / Azure were all pretty
> > > > > comparable in terms of what you get and what you pay for them.
> > > > > Obviously providers like Packet and Digital Ocean who offer bare-metal
> > > > > services are cheaper, but then you need to find someone who is going
> > > > > to properly administer the various machines, install decent
> > > > > monitoring, make sure that more storage is provisioned when we need
> > > > > more storage (which is basically all the time), make sure that the
> > > > > hardware is maintained in decent shape (pretty sure one of the fd.o
> > > > > machines has had a drive in imminent-failure state for the last few
> > > > > months), etc.
> > > > >
> > > > > Given the size of our service, that's a much better plan (IMO) than
> > > > > relying on someone who a) isn't an admin by trade, b) has a million
> > > > > other things to do, and c) hasn't wanted to do it for the past several
> > > > > years. But as long as that's the resources we have, then we're paying
> > > > > the cloud tradeoff, where we pay more money in exchange for fewer
> > > > > problems.
> > > >
> > > > Admin for gitlab and CI is a full time role anyways. The system is
> > > > definitely not self sustaining without time being put in by you and
> > > > anholt still. If we have $75k to burn on credits, and it was diverted
> > > > to just pay an admin to admin the real hw + gitlab/CI would that not
> > > > be a better use of the money? I didn't know if we can afford $75k for
> > > > an admin, but suddenly we can afford it for gitlab credits?
> > >
> > > As I think about the time that I've spent at google in less than a
> > > year on trying to keep the lights on for CI and optimize our
> > > infrastructure in the current cloud environment, that's more than the
> > > entire yearly budget you're talking about here. Saying "let's just
> > > pay for people to do more work instead of paying for full-service
> > > cloud" is not a cost optimization.
> > >
> > >
> > > > > Yes, we could federate everything back out so everyone runs their own
> > > > > builds and executes those. Tinderbox did something really similar to
> > > > > that IIRC; not sure if Buildbot does as well. Probably rules out
> > > > > pre-merge testing, mind.
> > > >
> > > > Why? does gitlab not support the model? having builds done in parallel
> > > > on runners closer to the test runners seems like it should be a thing.
> > > > I guess artifact transfer would cost less then as a result.
> > >
> > > Let's do some napkin math. The biggest artifacts cost we have in Mesa
> > > is probably meson-arm64/meson-arm (60MB zipped from meson-arm64,
> > > downloaded by 4 freedreno and 6ish lava, about 100 pipelines/day,
> > > makes ~1.8TB/month ($180 or so)). We could build a local storage next
> > > to the lava dispatcher so that the artifacts didn't have to contain
> > > the rootfs that came from the container (~2/3 of the insides of the
> > > zip file), but that's another service to build and maintain. Building
> > > the drivers once locally and storing it would save downloading the
> > > other ~1/3 of the inside of the zip file, but that requires a big
> > > enough system to do builds in time.
> > >
> > > I'm planning on doing a local filestore for google's lava lab, since I
> > > need to be able to move our xml files off of the lava DUTs to get the
> > > xml results we've become accustomed to, but this would not bubble up
> > > to being a priority for my time if I wasn't doing it anyway. If it
> > > takes me a single day to set all this up (I estimate a couple of
> > > weeks), that costs my employer a lot more than sponsoring the costs of
> > > the inefficiencies of the system that has accumulated.
> >
> > I'm not trying to knock the engineering work the CI contributors have
> > done at all, but I've never seen a real discussion about costs until
> > now. Engineers aren't accountants.
> >
> > The thing we seem to be missing here is fiscal responsibility. I know
> > this email is us being fiscally responsible, but it's kinda after the
> > fact.
> >
> > I cannot commit my employer to spending a large amount of money (> 0
> > actually) without a long and lengthy process with checks and balances.
> > Can you?
> >
> > The X.org board has budgets and procedures as well. I as a developer
> > of Mesa should not be able to commit the X.org foundation to spending
> > large amounts of money without checks and balances.
> >
> > The CI infrastructure lacks any checks and balances. There is no link
> > between editing .gitlab-ci/* and cashflow. There is no link to me
> > adding support for a new feature to llvmpipe that blows out test times
> > (granted it won't affect CI budget but just an example).
>
> We're working to get the logging in place to know which projects
> exactly burn down the money so that we can take specific actions. If
> needed. So pretty soon you won't be able to just burn down endless
> amounts of cash with a few gitlab-ci commits. Or at least not for long
> until we catch you and you either fix things up or CI is gone for your
> project.
>
> > The fact that clouds run on credit means that it's not possible to say
> > budget 30K and say when that runs out it runs out, you end up getting
> > bills for ever increasing amounts that you have to cover, with nobody
> > "responsible" for ever reducing those bills. Higher Faster Further
> > baby comes to mind.
>
> We're working on this, since it's the board's responsibility to be on
> top of stuff. It's simply that we didn't expect a massive growth of
> this scale and this quickly, so we're a bit behind on the controlling
> aspect.
>
> Also I guess it wasn't clear, but the board decision yesterday was the
> stop loss order where we cut the cord (for CI at least). So yeah the
> short term budget is firmly in place now.
>
> > Has X.org actually allocated the remaining cash in its bank account
> > to this task previously? Were there plans for this money that can't be
> > executed now because we have to pay the cloud fees? If we continue to
> > May and the X.org bank account hits 0, can XDC happen?
>
> There's numbers elsewhere in this thread, but if you'd read the
> original announcement it states that the stop loss would still
> guarantee that we can pay for everything for at least one year. We're
> not going to get even close to 0 in the bank account.
>
> So yeah XDC happens, and it'll also still happen next year. Also fd.o
> servers will keep running. The only thing we might need to switch off
> is the CI support.
>
> > Budgeting and cloud is hard, the feedback loops are messy. In the old
> > system the feedback loop was simple, we don't have admin time or money
> > for servers we don't get the features, cloud allows us to get the
> > features and enjoy them and at some point in the future the bill gets
> > paid by someone else. Credit card lifestyles all the way.
>
> Uh ... where exactly do you get the credit card approach from? SPI is
> legally not allowed to extend us a credit (we're not a legal org
> anymore), so if we hit 0 it's out real quick. No credit for us. If SPI
> isn't on top of that it's their loss (but they're getting pretty good
> at tracking stuff with the contractor they now have and all that).
>
> Which is not going to happen btw, if you've read the announcement mail
> and all that.
>
> Cheers, Daniel
>
> > Like maybe we can grow up here and find sponsors to cover all of this,
> > but it still feels a bit backwards from a fiscal pov.
> >
> > Again I'm not knocking the work people have done at all, CI is very
> > valuable to the projects involved, but that doesn't absolve us from
> > costs.
> >
> > Dave.
>
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
--
Nuritzi Sanchez
Senior Open Source Program Manager | GitLab
*Create, Collaborate, and Deploy together*
Free Trial <https://about.gitlab.com/free-trial/> | Upgrade Now
<https://about.gitlab.com/products/> | Contact Support
<https://about.gitlab.com/support/> | Community
<https://about.gitlab.com/community>