Flathub repo management proposal

Alexander Larsson alexander.larsson at gmail.com
Thu Sep 6 03:24:40 UTC 2018


There have been some discussions before on how we could make the
flathub "repo" handling better, but there were not a lot of details,
and it was never written down in public, so I spent some time thinking
about this and writing down a proposal. I realize this is the flatpak
list, but we don't really have a good public list for flathub discussions.

This is a proposal for how to handle the "repo" machine at
flathub. The main goal is to improve performance, scalability and
maintenance of the repo handling. We also want to decouple it from
buildbot as a first step in allowing other systems to push to the
repo.

Here are some more technical goals we have:
 * Allow builders to upload build results directly to the repo, avoiding
   a roundtrip via master. The result should be queued until all builds
   succeed.
 * New builds should end up in a testing repo, where they can be tested
   before being deployed to the "live" repo.
 * Allow test builds (i.e. those from a PR) to be deployed somewhere
   (temporarily) so the result can be tested.
 * Try to make the summary file smaller in order to scale better to more refs.
 * Move generation of deltas onto some set of workers, allowing this to
   scale better and avoid loading down the repo machine.
 * Batch the updates of the summary file, to avoid it changing too often, as
   that will cause clients to repeatedly re-download it.

The implementation will be a server running on the main repository
machine, which allows some pre-configured set of clients to connect to
it (authenticated). The server is similar to
endlessm/ostree-upload-server, but a bit more transaction
oriented.

The server will maintain two repositories, which I will call "stable"
and "testing" below. The stable repo is basically what we have now,
but the testing repo is a staging area where builds can be tested
before being made public.

The testing repo uses the same branch names as the stable one
(i.e. app/org.the.app/x86_64/stable) so that e.g. runtimes and
extensions work, but doesn't have any static deltas and is more
heavily pruned. It may also have a different history than the stable
repo (if some build was tested, but never made public). The actual
repos should be on the same filesystem so that imports between them
can be hardlinks.

The public methods on the server are:
(note: most of these are protected and can only be called if authenticated
 as a privileged user such as the buildbot master.)

 OpenTransaction() -> TransactionID
   This opens up a new transaction on the server, allocating an ID to
   later refer to it.
 UploadBundle(TransactionID, bundlefile):
   This adds a ref to the transaction, by uploading a bundle containing
   all the files in the ref. All registered workers are allowed to do this,
   as the idea is that each builder uploads directly to the repo rather
   than via master.
 Abort(TransactionID):
   Deletes all partial results from the transaction, and denies any
   outstanding or new UploadBundle() calls. master calls this if the build
   fails on some arch.
 Commit(TransactionID):
   This seals the transaction, disallowing further updates. It then
   imports (and verifies) all the bundles into the testing repo, and while
   doing so modifies the commits with the proper collection id, a new timestamp,
   transaction id, and the right end-of-life status, and a testing gpg signature
   (not the stable one). We also record with the transaction the exact new
   commit ids that were generated. The summary is updated (but not deltas or
   appdata) so that users with the flathub-testing remote configured can now
   install the build.
 Publish(TransactionID):
   Imports the refs from the (committed) transaction into the stable repo,
   at the exact commit ids that were recorded when committing (i.e.
   ignoring any newer builds). Then it triggers the generation of deltas
   and updates of appstream/summary. The import of the objects and
   modification of the ref files are synchronous, but the rest happens
   asynchronously.
 RevertTo(TransactionID):
   Similar to Publish(), but takes an existing committed transaction in
   the stable repo and creates new commits based on the commit ids in
   that. This can be used if a new build turns out to be fatally broken
   and must be immediately reverted.
 ExportBundles(TransactionID):
   Makes all the bundles from the transaction visible on a webserver
   somewhere so that they can easily be downloaded and tested. This is
   used instead of Commit() for the test builds of pull requests.
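
As a rough sketch, the transaction lifecycle above could be modeled
like this. This is an in-memory illustration only; the class, state
names, and fake commit ids are all made up, not the real server API:

```python
# Illustrative in-memory model of the proposed transaction lifecycle.
# All names here (TransactionStore, the state strings) are hypothetical.
import itertools

OPEN, COMMITTED, PUBLISHED, ABORTED = "open", "committed", "published", "aborted"

class TransactionStore:
    def __init__(self):
        self._ids = itertools.count(1)
        self._txns = {}

    def open_transaction(self):
        # OpenTransaction(): allocate an ID to refer to the transaction later.
        txn_id = next(self._ids)
        self._txns[txn_id] = {"state": OPEN, "bundles": [], "commit_ids": {}}
        return txn_id

    def upload_bundle(self, txn_id, ref, bundle):
        # UploadBundle(): each builder adds one ref's bundle to the transaction.
        txn = self._txns[txn_id]
        if txn["state"] != OPEN:
            raise RuntimeError("transaction is sealed or aborted; no further uploads")
        txn["bundles"].append((ref, bundle))

    def abort(self, txn_id):
        # Abort(): drop partial results and deny outstanding uploads.
        self._txns[txn_id] = {"state": ABORTED, "bundles": [], "commit_ids": {}}

    def commit(self, txn_id):
        # Commit(): seal the transaction. The real server would import the
        # bundles into the testing repo, rewrite the commits (collection id,
        # timestamp, testing signature) and record the resulting commit ids.
        txn = self._txns[txn_id]
        if txn["state"] != OPEN:
            raise RuntimeError("can only commit an open transaction")
        txn["commit_ids"] = {ref: f"fake-commit-for-{ref}" for ref, _ in txn["bundles"]}
        txn["state"] = COMMITTED

    def publish(self, txn_id):
        # Publish(): import exactly the recorded commit ids into stable,
        # then queue delta/appstream/summary work asynchronously.
        txn = self._txns[txn_id]
        if txn["state"] != COMMITTED:
            raise RuntimeError("can only publish a committed transaction")
        txn["state"] = PUBLISHED
        return dict(txn["commit_ids"])
```

The point of recording commit ids at Commit() time is that Publish()
and RevertTo() can then operate on exact commits, unaffected by any
newer builds that landed in between.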

There is also an internal API where workers can connect and subscribe
to requests to do work, which, at least initially, means creating
deltas for a particular set of commits in the repo. The workers should
be able to fetch these commits even if they are not yet published
(i.e. not listed in the summary file) by pulling them by commit id
instead of by ref.
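
A minimal sketch of such a work-subscription channel, assuming a
simple in-process queue (DeltaWorkQueue and its method names are
hypothetical, and a real deployment would use some network transport):

```python
# Hypothetical sketch of the internal work API: the repo server enqueues
# delta-generation jobs; subscribed workers take them one at a time.
import queue

class DeltaWorkQueue:
    def __init__(self):
        self._jobs = queue.Queue()

    def request_deltas(self, commit_ids):
        # Called by the repo server after a publish, one job per commit.
        for commit_id in commit_ids:
            self._jobs.put(commit_id)

    def next_job(self):
        # Called by a subscribed worker. The worker then pulls this commit
        # by id (not by ref, since the summary may not list it yet) and
        # generates static deltas against the previous published commit.
        return self._jobs.get()
```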

The work system for delta creation and summary/appdata updates is
asynchronous. We shouldn't update the summary file too often, and
ideally not before all deltas for the new commits are done.
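
One way to batch this, sketched under the assumption of a single
coordinating process (SummaryBatcher and the max_delay fallback are
made up for illustration, not part of the proposal's wire API):

```python
# Sketch of batching summary updates: regenerate the summary only once
# all delta jobs for the pending commits are done, or after a maximum
# delay so one slow delta cannot hold back the summary forever.
import time

class SummaryBatcher:
    def __init__(self, max_delay=300):
        self.pending = set()       # commit ids still waiting on deltas
        self.first_queued = None   # when the oldest pending commit arrived
        self.max_delay = max_delay

    def queue(self, commit_id):
        if not self.pending:
            self.first_queued = time.monotonic()
        self.pending.add(commit_id)

    def delta_done(self, commit_id):
        self.pending.discard(commit_id)

    def should_update_summary(self):
        if self.first_queued is None:
            return False  # nothing has been queued since the last update
        all_deltas_done = not self.pending
        waited_too_long = time.monotonic() - self.first_queued > self.max_delay
        return all_deltas_done or waited_too_long
```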

In order to make the summary smaller there have been a few ideas, but
none of them are easy or backwards compatible. Instead of doing
something complicated I propose that we split the flathub repo per
architecture. The current flathub summary file is 4 megs (2 megs
compressed), and essentially every entry in it is duplicated for all 4
arches, so splitting it into one summary per arch would divide its
size by four. Note that this should support multiarch properly,
i.e. the x86_64 repo should contain the i386 ref if and only if there
is no x86_64 version of the ref.
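
The per-arch selection rule could look roughly like this, assuming
refs of the form kind/ID/arch/branch; refs_for_arch and the
COMPAT_ARCHES table are illustrative names, not existing code:

```python
# Sketch of per-arch ref selection for the split summaries: a compat-arch
# ref (e.g. i386) is included in the x86_64 summary only when there is no
# native x86_64 build of the same ref.
COMPAT_ARCHES = {"x86_64": ["i386"], "aarch64": ["arm"]}  # illustrative

def refs_for_arch(all_refs, arch):
    def split(ref):
        kind, app_id, ref_arch, branch = ref.split("/")
        return kind, app_id, ref_arch, branch

    # Every (kind, id, branch) that has a native build for this arch.
    native = {(k, a, b) for k, a, ra, b in map(split, all_refs) if ra == arch}

    selected = []
    for ref in all_refs:
        kind, app_id, ref_arch, branch = split(ref)
        if ref_arch == arch:
            selected.append(ref)
        elif (ref_arch in COMPAT_ARCHES.get(arch, [])
              and (kind, app_id, branch) not in native):
            selected.append(ref)
    return selected
```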

To implement this I don't actually propose we keep 4 repos, because
that is a lot of pain to synchronize, and loses any deduplication
between arches.  Instead I propose that we use some custom code to
generate the per-arch summary files, and then create 4 per-arch repo
dirs with the custom summary files but symlinks to the deltas and
objects directories so that it is shared with the complete
repo. Ideally this should be linked up at the HTTP level rather than
via fs symlinks, so that the CDN mirrors these objects once rather
than as 4 copies, but I'm not sure that is doable without clients
having to manually follow redirects for each object.
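
As a sketch of the fs-symlink variant of that layout (the helper name
and exact directory names are illustrative):

```python
# Sketch: a per-arch repo dir carries its own summary file but shares
# the objects/ and deltas/ directories of the full repo via symlinks.
import os

def make_arch_repo(full_repo, arch_repo, arch_summary_bytes):
    os.makedirs(arch_repo, exist_ok=True)
    for shared in ("objects", "deltas"):
        link = os.path.join(arch_repo, shared)
        if not os.path.islink(link):
            os.symlink(os.path.join(os.path.abspath(full_repo), shared), link)
    # Only the summary differs per arch.
    with open(os.path.join(arch_repo, "summary"), "wb") as f:
        f.write(arch_summary_bytes)
```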

Most of the above should be possible to implement by just calling out
to ostree/flatpak, but splitting the summary needs some custom C code.
We also need some features in flatpak:

 * build-commit-from needs to be able to take a ref+commit so that we
   can publish a transaction even though a newer build has been
   committed.
 * .flatpakref files should have an optional set of per-arch repo urls
   which are used in preference to the main one on that arch. We still
   have the URI to the entire repo for backwards compat. The idea is
   that you can do:
     flatpak remote-add --arch=arm flathub-arm flathub.flatpakref
   if you want to get arm binaries for cross builds.
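
For illustration, a .flatpakref with such per-arch urls might look
like this; the Url-arm/Url-aarch64 keys are purely hypothetical, since
no such keys exist in the current format:

```ini
[Flatpak Ref]
Name=org.the.app
Branch=stable
# Main repo url, kept for backwards compat with existing clients.
Url=https://dl.flathub.org/repo/
# Hypothetical per-arch keys (illustrative names, not a real format):
Url-arm=https://dl.flathub.org/repo-arm/
Url-aarch64=https://dl.flathub.org/repo-aarch64/
```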

