[Freedesktop-sdk] ABI stability and release model

Robert McQueen rob at endlessm.com
Wed Mar 27 12:37:10 UTC 2019


Hi Tristan and team SDK,

With my Flatpak hat on, I must say: offering ABI stability (and really
solid, boring, ABI stability) in the runtimes, is absolutely
instrumental to the whole Flatpak ecosystem. If we don't have that, we
may as well not have an SDK, and practically speaking, we won't have a
Flatpak app ecosystem at all.

But why? Supporting stable ABIs is boring, hard and expensive, and
don't we have containers now so we don't need LTS? Why can't I have a
pony and update it constantly and never maintain any software ever?

Of course containers have hugely disrupted the platform space, in a
very good way, and I am a huge fan of this disruption. Endless delivers
its OS as a single immutable ostree, has precisely two supported
release series (our stable branch, and stable branch with some
backported hardware drivers) and we support each release for at most a
month before we roll out an update to the next one. It's fantastic.

What allows us to do this is that the OS is no longer responsible
for the ABI that the apps rely on, so we can run footloose and fancy
free across the hills. Between the OS and the containers, we have a
very small and well-defined interface -
basically Linux syscalls, our display/sound/IPC/etc sockets, and the
portal APIs. Inside of the containers however, the need for stable and
supported platform ABI between layers (in docker) or between runtimes
and apps (in Flatpak) is alive and well.

Similarly, Red Hat has seen the writing on the wall for the concept of
LTS on the bare metal, and is moving hard to secure the "container
host" with CoreOS and its desktop cousin Silverblue. But the main
value in RHEL is in a guaranteed ABI and an ecosystem around it - and
surely they would like to transfer that value to the _inside_ of the
container. They are also working on building runtimes out of Fedora and
I imagine, might also work on RHEL runtimes as well when the time
comes.

By having more numerous packages and an ecosystem grown in public
rather than behind the RHEL paywall, Ubuntu has gained its traction in
the cloud not as the host but as the container OS, and which is the
most popular of these? The LTS ones, of course - because this
alleviates the need for the application developers to handle
churn/breakage in their platforms and get on with their value-add.


All this to say: outside of the container? LTS is dead. Go crazy.
Release daily, etc. Within a container? Unless every app maintainer
maintains their own platform, which I don't see the tooling or desire
to do, LTS is very much alive, and ABI guarantees are king.

In the Flatpak universe, the SDK has this responsibility that an app
built against "18.08" can run reliably against any other release of
"18.08", and without meaning to sound too apocalyptic, if we don't do a
good job of it, we will kill the Flatpak ecosystem's value to incoming
ISVs and poison our reputation with the radioactive fallout of platform
instabilities that have made Linux such a hostile place for these
vendors in the past.
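
One way to picture that guarantee is to treat each release of a branch
as nothing more than the set of symbols it exports, and an app as the
set of symbols it needs. The sketch below is only an illustration under
that assumption: the point-release labels ("18.08.1", "18.08.2") and
the symbol names are invented, and real ABI compatibility covers much
more than symbol presence.

```python
from typing import Dict, FrozenSet, List

# Hypothetical model: a runtime release is reduced to the set of symbols
# it exports. Release labels and symbol names are made up for illustration.
Symbols = FrozenSet[str]


def missing_symbols(required: Symbols, exported: Symbols) -> Symbols:
    """Symbols the app needs that this runtime release does not provide."""
    return required - exported


def releases_that_break(required: Symbols, releases: Dict[str, Symbols]) -> List[str]:
    """Names of releases on which the app would fail to load."""
    return sorted(
        name for name, exported in releases.items() if not required <= exported
    )


releases = {
    "18.08.1": frozenset({"foo_init", "foo_run"}),
    # A "backwards compatible" update that adds a new symbol.
    "18.08.2": frozenset({"foo_init", "foo_run", "foo_run_async"}),
}

old_app = frozenset({"foo_init", "foo_run"})        # built against 18.08.1
new_app = frozenset({"foo_init", "foo_run_async"})  # built against 18.08.2

print(releases_that_break(old_app, releases))
print(releases_that_break(new_app, releases))
```

In this toy model the older app loads on both releases, while the app
built against the newer release fails on the older one: adding symbols
within a branch is enough to break "any other release of 18.08".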


So... on tickets such as
https://gitlab.com/freedesktop-sdk/freedesktop-sdk/merge_requests/1031
I see arguments like:

1) App developers want new stuff.

Yes, some do. Particularly in GNOME and KDE there are fast release
cycles that bring new platform changes on a 6-monthly cycle. However,
this can't be done at the expense of the basic stability/usability of
an app developer who isn't currently writing/incorporating new stuff,
and instead just wants the app they released last week to still be
installable and work next week. This suggests you need "boring"
releases that last longer, and "more exciting releases" which are
updated more often and made available for the more adventurous souls.

2) Users want new stuff.

For a platform/SDK - this is fairly absurd. Users want the new
applications, and they want them to work. They don't need to know or
care about runtimes or SDKs; this is a contract/choice between us and
the application developers. If end users hear about the SDK project, it
will almost certainly mean we've messed up somehow - the binaries are
too large, we update too often, or we've broken something. If we teach
people to be wary of platform updates, we have failed hugely.

3) We need new Mesa for hardware, so we need new LLVM, so we have to
break everything all the time.

Yes, maintaining any form of interface stability is expensive and hard
and if we could avoid it, we all would. However, we _must_ find a way
to handle these "escalating requirements" such as hardware support
without the stability of the platform being impacted. If we don't
actually want to keep bringing in new platform/SDK components to keep
bringing newer Mesas in, maybe we need to rethink this corner
completely.

Eg - we redouble efforts on bringing the platform drivers in (and there
is a lot of technology in this space - the likes of
https://git.collabora.com/cgit/user/vivek/libcapsule.git and even stuff
like virgl), or we could build the Mesa extension as its own Flatpak (as
1.6 used to do) and then it would never touch the SDK.

4) We don't have any CI for our master branch so we're changing
everything in the stable branch instead.

Honestly, this is the worst of all the arguments. :P
https://dilbert.com/strip/1995-06-24
Flathub receives $100ks a year of free compute and
CDN resources because I tweeted and asked for them. We can absolutely
put our heads together between Flathub, GNOME, freedesktop.org and the
SDK and solve CI/etc resources.


In setting up the release cycle for the SDK and managing the cost of
running an LTS, we can only play with two variables: the frequency of
releases, and the lifespan that we support them for. My understanding
is that we would do basically: one year, one year. It seems like
perhaps this needs revisiting if we are seeing new developer
tools/technologies that need to be updated more frequently because of
app/runtime developer needs? Maybe 6 month frequency, 1 year support,
or to avoid too many parallel series, we have a tick/tock with a 1 year
support and a shorter one in between.
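
To make the cost of those two variables concrete, here is a rough
sketch that counts how many release series are under support at the
same time for a given release interval and support lifespan. The
function names are made up for this illustration, and it assumes
support ends exactly when the lifespan runs out, with no grace period.

```python
from typing import List


def supported_series_at(month: int, release_interval: int, support_months: int) -> List[int]:
    """Start months of the series still supported at `month`.

    Assumes a release at month 0 and every `release_interval` months after,
    each supported for the half-open window [start, start + support_months).
    """
    starts = range(0, month + 1, release_interval)
    return [s for s in starts if s <= month < s + support_months]


def max_parallel_series(release_interval: int, support_months: int, horizon: int = 48) -> int:
    """Largest number of series supported at once over `horizon` months."""
    return max(
        len(supported_series_at(m, release_interval, support_months))
        for m in range(horizon)
    )


# "One year, one year": yearly releases, each supported for a year.
print(max_parallel_series(release_interval=12, support_months=12))

# 6 month frequency, 1 year support.
print(max_parallel_series(release_interval=6, support_months=12))
```

Under these assumptions the yearly model never has more than one
supported series, while the 6 month frequency with 1 year support
steadily carries two in parallel.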

But ABI stability is not something we can trade, because it voids the
purpose of the project. I totally agree with Alex:
https://gitlab.com/freedesktop-sdk/freedesktop-sdk/merge_requests/1031#note_154558939

(Of course we have the "safety net" that withdrawing support for a
runtime doesn't mean the app stops working overnight. It means we can
set the frequency and lifespan higher than we otherwise would if we
ever wanted an app developer to stick with us, because the risk of an
app running against older runtime is partly mitigated by sandboxing and
diffused by the plurality of such runtimes/versions/etc across the apps
that are used.)

The imperative thing to me is that we a) decouple Mesa updates from any
risk of breaking/modifying the platform, and b) open up a "next" or
"master" branch where changes can be developed/evaluated/etc and tested
with eg the upcoming GNOME/KDE platform and other apps, without risking
the stability of the stable branch. Set up ways to do test builds of
apps and runtimes against it, etc. (It's much easier to set up a
"shadow" Flathub buildbot - which we've done once or twice this year
around infrastructure upgrades and migrations.)

We need clear lifecycles and guarantees around ABI stability for the
Flatpak ecosystem to rely on this project as the basis of our stable
runtimes. Yes this is boring. Boring is good for a platform.

Cheers,
Rob

On Fri, 2019-03-22 at 16:11 +0900, Tristan Van Berkom wrote:
> Hi all,
> 
> It was raised recently by Cameron on this list that there are some
> problems with ABI stability which effectively break things for end
> users.
> 
> Aside from this, I think it is also worth considering concerns about
> overusing end users' bandwidth (to install new updates) and similarly,
> the churn and compute resources we impose on downstream projects who
> need to rebuild as a result of releasing new things frequently.
> 
> As this is all related to the current release model, I think we can
> more effectively resolve these issues if we take a step back and
> reexamine our release model and consider its side effects.
> 
> Related links can be found at the end of this mail.
> 
> 
> Broken ABI stability
> ~~~~~~~~~~~~~~~~~~~~
> This issue Cameron has highlighted is quite specific and in depth, so
> instead of discussing that directly, I would prefer to boil this down
> to a simpler hypothetical example that is easier to digest.
> 
>   A.) End user installs a Flatpak, and the current 18.08 release
>       of the freedesktop-sdk runtime is also installed.
> 
>   B.) Upstream freedesktop-sdk releases a new 18.08 release, which
>       is fully backwards compatible.
> 
>       In this new release of the runtime, new features/symbols are
>       added in some libraries.
> 
>   C.) A new Flatpak is built against the new 18.08 release,
>       and this application uses some of the new symbols in libraries
>       which were introduced in the latest 18.08 release.
> 
>   D.) The end user installs this new Flatpak.
> 
>       When the user tries to run this flatpak, the application cannot
>       be loaded because the user's runtime does not contain the new
>       symbols which are required by this new application.
> 
> To compound matters further, in Cameron's case we are talking about an
> extension point that is released directly from freedesktop-sdk, which
> needs to function when combined with the KDE runtime, which has not
> been rebuilt and released yet - which means that even if the user had
> wanted to update to a new runtime, a new runtime was not available to
> the user at the time.
> 
> To summarize my understanding of the current setup:
> 
>   o Runtimes, applications, and extension blobs are all distributed
>     separately.
> 
>   o These rely only on the "18.08" version (either directly or
>     indirectly via a derived runtime like GNOME/KDE) to indicate that
>     these components are intended to function properly when combined
>     together on an end users' system.
> 
>   o Our current model considers backwards compatibility only,
>     allowing addition of new symbols in the same 18.08 version of the
>     runtime.
> 
>     This means that it is possible to install incompatible
>     combinations of the Runtime/App/Extension tuple.
> 
> 
> My thoughts are that on our side, we must provide a guarantee that a
> user cannot install an incompatible runtime/app/extension
> combination.
> 
> Am I missing some important details ?
> 
> What does the community think about this, what should we do to ensure
> that we don't break end users ?
> 
> Cheers,
>     -Tristan
> 
> 
> PS: Some links to related discussion follow here...
> 
> 
> Our documented release model:
> https://gitlab.com/freedesktop-sdk/freedesktop-sdk/wikis/release
> 
> Ensure ABI stability of extensions:
> https://gitlab.com/freedesktop-sdk/freedesktop-sdk/issues/669
> 
> A lot of discussion in the above is on the associated merge request:
> https://gitlab.com/freedesktop-sdk/freedesktop-sdk/merge_requests/1031
> 
> Cameron's mailing list report of ABI break:
> https://lists.freedesktop.org/archives/flatpak/2019-February/001482.html
> 
> Make releases less often:
> https://gitlab.com/freedesktop-sdk/freedesktop-sdk/issues/674
> 
> Avoid adding new elements in stable branch which are exposed in
> platform/sdk:
> https://gitlab.com/freedesktop-sdk/freedesktop-sdk/issues/689
> 
> _______________________________________________
> Freedesktop-sdk mailing list
> Freedesktop-sdk at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/freedesktop-sdk

