Is anyone using p2p ostree support?

Mon Feb 24 20:42:40 UTC 2020

On Fri, Feb 21, 2020 at 1:32 AM Alexander Larsson
<alexander.larsson at gmail.com> wrote:
>
> On Thu, Feb 20, 2020 at 9:11 PM Matthew Leeds
> <matthew.leeds at endlessm.com> wrote:
> >
> > On Mon, Feb 17, 2020 at 11:56 PM Alexander Larsson
> > <alexander.larsson at gmail.com> wrote:
> > >
> > > I'm at a point where I'm starting to reconsider the support for the
> > > p2p ostree setup in flatpak. We keep running into issues with it,
> > > which means we haven't been able to enable it by default on flathub.
> > > And it vastly complicates the code as well as the behaviour of flatpak
> > > on the network, turning a simple natural workflow into something that
> > > is super-hard to reason about because information comes from several,
> > > partial and untrustworthy sources.
> > >
> >
> > I certainly agree that the P2P (LAN/USB) support in Flatpak adds complexity and
> > technical debt, and it would be great to find ways to reduce that. But I'm not
> > aware of any current issues that prevent us from pushing it out to everyone on
> > Flathub; what did you have in mind?
>
> I don't actually know of any outstanding issue with p2p right now, but
> many times when we've been about to switch it on by default, some issue
> has come up to block it. I also know that not many people run with p2p
> enabled and nobody does so with a non-trivial setup (i.e. actually
> using multiple peers), so I'm sure there are issues hiding in there.
>
> However, if it were just for simple bugs I think they would just get
> fixed over time. My problem with p2p is deeper than that. I can
> explain to anyone how a regular ostree repository works and how
> flatpak uses it, and it seems natural and understandable. However once
> p2p gets added to the mix even I don't *really* understand it, I have
> to constantly refer to the sources to see exactly what is going on.
>
> In addition, I think the design of the ostree-metadata branch and how
> it is used on the client side is quite bad. We have N potential peers
> with different versions of this branch, which may or may not
> correspond with the commits of the refs they actually carry. Somehow
> we pick one version of this and pull it to a single local repo, that
> can be updated at any time, even in the middle of an operation (since
> flatpak instances can run in parallel).

Right, this is the problem I suggested a potential solution to in my last
email. But I agree that even if we fixed that a lot of complexity would remain
and it's difficult to reason about.

>
> I just find it impossible to reason about the behaviour of this. In
> comparison, in the regular case we have simple one-file summary updates
> that are atomic on the server, locally on disk and in memory as
> part of a transaction (i.e. you load one summary file and use that for
> an entire operation).
>
> Every time I add a feature in flatpak the regular version takes a
> short time, but then I have to spend a much longer time making the p2p
> code work. Take the authentication callback stuff for instance, I had
> to mangle the API in weird ways because of the complex p2p
> handling. We had to emit the signals multiple times in unnatural ways
> (whereas the regular case would have a natural emission point at the
> beginning of the op) and there are even some cases that the p2p setup
> cannot handle.
>
> I want to keep flatpak extensible and maintainable going forward, and
> I think the current p2p approach is going to make that harder. So,
> while p2p is not yet widely deployed we should take the time to
> simplify it to what is actually used in practice in order to ensure
> the longevity of the flatpak design.
>
> So, the question then becomes "What is actually used in practice?", and
> reading your mail and other responses, what I've understood is that it
> is two things:
>
>  * Sideloading apps/runtimes from a local drive, when the network is
>    unavailable or slow. For example, Endless preloading software or
>    installing from a USB stick, or Fedora wanting to store flatpaks
>    on the install media.
>
>  * Having a local network server that mirrors part of a flatpak
>    repository to avoid issues with slow or unreliable upstream network
>    connections. This would be maintained by some kind of sysadmin,
>    not just a random client machine.

I doubt we want to develop server software to serve this purpose, because
having a sys admin available (especially physically co-located) is a
significant requirement that might be untenable for many deployments. I think
we would instead want an architecture that allows the server to be a normal
Endless OS desktop, perhaps with some special configuration. We already have
`eos-update-server` for the purpose of serving the local ostree repo to LAN
peers (https://github.com/endlessm/eos-updater/tree/master/eos-update-server).

>
> The first case is actually in use, the second is not actually done
> atm, but seems a reasonable usecase (in comparison with the blue sky
> full peer-to-peer with multiple untrusted sources).
>
> In the first case, we have a locally available, partial repo, with
> collection id and ostree-metadata branch, and we can assume that the
> metadata branch is in sync for all the refs in it. Sideloading from
> this seems like a much easier problem than the p2p case. We don't
> have to pull any ostree-metadata branches into the local flatpak
> repo, we can just access it as is.
>
> Here is a proposal for how this could be handled:
>
> We have a remote "flatpak", it does not have a regular collection id
> defined (so we use the normal pull codepaths etc for it). However, it
> has a custom config option "xa.sideload-collection-id" which specifies
> a collection id that all the commits in the remote repo are created
> with. Additionally we have a separate global configuration option
> "sideload-paths" which has a list of pathnames, where each path
> contains an ostree repo in p2p style. These repos are assumed to
> be internally consistent (i.e. each repo has an ostree-metadata and
> summary file where the commits match in time), but not necessarily
> consistent with each other or the upstream repo.

To be clear, you're suggesting we set e.g.
`xa.sideload-collection-id=org.flathub.Stable` instead of
`collection-id=org.flathub.Stable` on the flathub remote?

For what it's worth, if we wanted to make the simplifying assumption that you
only install things from one flash drive at a time, that would likely be a
reasonable requirement.  To make the `sideload-paths` config option tenable, I
think we'd need a system component (ideally a service shipped with Flatpak?)
that would run when a flash drive is inserted which has an ostree repo on it,
and sets that config option (relatedly see
https://gitlab.gnome.org/GNOME/gsettings-desktop-schemas/-/merge_requests/30).
Because of course it's not reasonable to expect a user to manually set that
themselves. I don't know enough about how the Fedora installer sideloads apps
to say if such a scheme would work for that case.

This seems like it would be a lot more efficient and sensible than the status
quo where we scan every mounted filesystem every single time we want to resolve
a ref.
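
As a rough illustration of what such a component might do on drive insertion,
here is a minimal Python sketch. It is not Flatpak code: the function names are
hypothetical, the `.ostree/repo` location is an assumption about where the repo
sits on the drive, and how the result would be written into the proposed
`sideload-paths` option is left open.

```python
import os


def looks_like_ostree_repo(path):
    # An ostree repo directory contains a "config" file and an "objects" dir.
    return (os.path.isfile(os.path.join(path, "config"))
            and os.path.isdir(os.path.join(path, "objects")))


def find_sideload_paths(mount_points, relative_repo=".ostree/repo"):
    """Return the repo paths found under the given mount points.

    relative_repo is an assumed location; adjust it to wherever the
    repo is actually placed on the drive.
    """
    found = []
    for mount in mount_points:
        candidate = os.path.join(mount, relative_repo)
        if looks_like_ostree_repo(candidate):
            found.append(candidate)
    return found
```

The point of the sketch is that only the mount that was just inserted needs to
be checked, once, rather than every mounted filesystem on every resolve.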

>
> All the action now happens during the FlatpakTransaction
> resolve_all_ops() call, where we figure out what exactly to download
> and from where, given a set of ref+remote pairs.
>
> In the non-p2p case what happens (for a particular ref+remote pair) is:
>
> We ensure the local copy of the remote summary is updated, then we load
> it and use it for the entire operation. The ref is looked up in the
> summary to see what the current commit is, and if it is newer than
> what we have locally we "resolve" it by getting the additional
> metadata which is also in the summary file.
>
> For the p2p case what happens is:
>
> Create a temporary local ostree repo, child of the main flatpak
> one. In parallel, for each possible source of information for the
> collection-id that is configured for the remote, download the summary
> file and look up the requested ref. From all the summaries that had
> the ref, pick the one that is most recent, this is the commit we're
> getting, but we also need the metadata. Now we do an ostree pull
> operation with the COMMIT_ONLY flag from that peer, then load the
> resulting commit object from disk and get the metadata from that. Then
> throw away the temporary repo, but we remember which actual source
> we're pulling the ref from when doing the install.
>
> The above is simplified because it only discusses a single ref/remote
> pair. In practice the code is much more complex because it tries to
> resolve all the refs at the same time, and even more so since we added
> the token-callouts for authentication which happens in the middle of
> this. But even with this simplification it's obvious how much more
> fiddly the p2p case is compared to the "read a single file and do some
> metadata lookups" of the non-p2p case.
>
> What I propose instead is:
>
> 1) Ensure the local copy of the summary is updated (if possible, i.e
>    online) and load it into memory
> 2) For each configured local side-load repo, load its summary and
>    ostree-metadata branch into memory
> 3) Look up the ref in all the summaries, disregarding any lookups from
>    side-loaded summaries whose version doesn't match the
>    corresponding ostree-metadata.
> 4) Choose which source to resolve from, based on which commit is
>    latest (preferring sideloads to remote if the same), and taking into
>    consideration whether we're online or not.
> 5) Get the metadata for that commit either from the summary (for upstream)
>    or from the ostree-metadata (for sideloads).
>

I don't think it affects the correctness of the rest of this, but the side-load
repo would have not one ostree-metadata branch, but one per collection it's
serving: (org.flathub.Stable, ostree-metadata), (com.endlessm.Os,
ostree-metadata), etc.
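
The five proposed steps quoted above can be sketched in Python for a single
ref and a single collection. This is only an illustration under stated
assumptions: the dict layouts and the name `resolve_ref` are hypothetical,
"latest" is decided by a commit timestamp, and the remote is simply ignored
when offline because nothing could be pulled from it.

```python
def resolve_ref(ref, remote_summary, sideloads, online):
    """Pick a source for `ref`; return None if nothing usable is found.

    remote_summary: dict ref -> {"commit", "timestamp", "metadata"}
    sideloads: list of {"path", "summary", "ostree_metadata"}, where
      summary maps ref -> {"commit", "timestamp"} and
      ostree_metadata maps ref -> {"commit", "metadata"}.
    """
    candidates = []

    # Steps 1 and 3: look the ref up in the (already loaded) remote summary.
    entry = remote_summary.get(ref)
    if online and entry is not None:
        candidates.append((entry["timestamp"], 0, "remote",
                           entry["commit"], entry["metadata"]))

    # Steps 2 and 3: look the ref up in each side-load repo, skipping repos
    # whose summary and ostree-metadata disagree about the commit.
    for repo in sideloads:
        listed = repo["summary"].get(ref)
        described = repo["ostree_metadata"].get(ref)
        if listed is None or described is None:
            continue
        if listed["commit"] != described["commit"]:
            continue
        candidates.append((listed["timestamp"], 1, repo["path"],
                           listed["commit"], described["metadata"]))

    if not candidates:
        return None

    # Step 4: latest commit wins; on a tie the side-load (rank 1) is preferred.
    _, _, source, commit, metadata = max(candidates, key=lambda c: (c[0], c[1]))

    # Step 5: metadata comes from whichever source was chosen.
    return {"source": source, "commit": commit, "metadata": metadata}
```

Everything here is an in-memory lookup over files that were loaded once, which
is the property the proposal is after.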

These steps sound to me like the implementation we already have for
OstreeRepoFinderMount, with two differences: (1) check only configured paths
rather than all mounted filesystems, and (2) ensure the ostree-metadata used is
in sync with the commit used. (1) could be done in libostree as an option given
to ostree_repo_find_remotes_async() and for (2) there are a couple of options:
(a) if LAN mirroring is handled separately (not using OstreeRepoFinderAvahi)
and you only allow one local side-load repo, you either use the ostree-metadata
from the side-load or from the Internet depending on if you're online and it's
guaranteed to be in sync (and for the case where you're online and only want to
use the side-load repo for some refs, maybe do two transactions)
(b) if we want to support an arbitrary number of side-load repo paths, Flatpak
could pull ostree-metadata from each one that provides the latest commit of one
or more refs, and combine the xa.cache data from each such that if repo X
provides the latest commit for ref Y, for that ref we use the metadata from
xa.cache in the ostree-metadata commit from repo X.
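
Option (b) could look roughly like the following Python sketch. The data
layout and the name `combine_xa_cache` are hypothetical; the only idea taken
from the text is that, for each ref, the metadata is taken from the xa.cache
of the repo that provides the latest commit of that ref.

```python
def combine_xa_cache(repos):
    """Merge per-repo metadata, keeping for each ref the entry from the
    repo that carries the most recent commit of that ref.

    repos: list of {"path", "refs", "xa_cache"}, where
      refs maps ref -> (commit, timestamp) and
      xa_cache maps ref -> metadata from that repo's ostree-metadata commit.
    """
    best = {}
    for repo in repos:
        for ref, (commit, timestamp) in repo["refs"].items():
            if ref not in repo["xa_cache"]:
                # No metadata for this ref in this repo; it cannot be used.
                continue
            current = best.get(ref)
            if current is None or timestamp > current["timestamp"]:
                best[ref] = {
                    "repo": repo["path"],
                    "commit": commit,
                    "timestamp": timestamp,
                    "metadata": repo["xa_cache"][ref],
                }
    return best
```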

Perhaps it's worth noting that the p2p api of libostree is not just used by
Flatpak but also by eos-updater (for LAN/USB OS updates). If Flatpak is going
to stop using it, we should consider whether eos-updater will continue to use
it or if it will be deprecated entirely. It would not make much sense for
eos-updater and Flatpak to each have separate implementations of support for
offline updates.

> This is slightly trickier than the original non-p2p case, but it is
> still just loading a few local files and doing in-memory lookups from
> them. It's nowhere near the complexity of the full p2p case.
>
> Additionally, when using p2p we use the ostree-metadata instead of the
> summary for operations that need a list of refs (like e.g. tab
> completion), because the summary files from p2p peers are typically
> partial. In an offline, initial sideload case we could end up in a
> situation where we have no local copy of the complete summary
> file. To fix this I propose we also let the sideload repos carry a
> copy of the upstream complete (signed) summary file, and we use this
> to update the local copy of the summary file in case we're
> offline. This means we can completely avoid pulling any
> ostree-metadata branches ever in the flatpak ostree repo.
>

Depending on how this is implemented it could be misleading. You would
definitely not want to use the remote summary to find the commits available in
the side-load repo, since it could not be guaranteed to be in sync (it might
specify a later commit for a ref than is in the side-load repo if "flatpak
create-usb" was used to pull into it more than once). And you'd also have to
distinguish the commit specified in the remote summary from the commit actually
available from the remote, since you might be offline and have only a subset of
the refs available via the side-load repo.
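
To make the distinction concrete, here is a small Python sketch that keeps
"the commit the upstream summary lists" separate from "the commit that can
actually be installed right now". The function name and the plain
ref -> commit dicts are hypothetical simplifications.

```python
def describe_ref(ref, upstream_summary, sideload_refs, online):
    """Report what is listed upstream versus what is installable now.

    upstream_summary: dict ref -> commit, from the (possibly copied) summary
    sideload_refs: dict ref -> commit, read from the side-load repo itself
    """
    listed = upstream_summary.get(ref)
    on_disk = sideload_refs.get(ref)

    if on_disk is not None and (not online or listed is None or on_disk == listed):
        # Use the side-load repo when offline, or when it is just as good.
        installable, source = on_disk, "sideload"
    elif online and listed is not None:
        installable, source = listed, "remote"
    else:
        # Listed upstream but not reachable, or not known at all.
        installable, source = None, None

    outdated = (installable is not None and listed is not None
                and installable != listed)
    return {"listed": listed, "installable": installable,
            "source": source, "outdated": outdated}
```

With this split, a ref list for tab completion could come from the summary
while install decisions only ever use commits read from the side-load repo.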

> This solves the side-load usecase. For the local mirror setup I think
> a more natural solution is to avoid the complexities of collection-ids
> and just set up a proxy-like server that mirrors the entire upstream
> repo. It doesn't have to *actually* keep a copy of everything, instead
> it would work like a CDN doing pulls from upstream when needed. In
> case of spotty network this means that sometimes apps look like they
> are available but fail during installation. However, if some
> particular apps are needed (for e.g. a school) the sysadmin could
> choose to manually pre-seed those. Then you just point your remote at
> this (or even pick it up from the http proxy config).
>
> This kind of mirroring setup is *much* easier to do in a stateful,
> managed server like this than in the client library.

That sounds pretty reasonable, though doesn't HTTPS make it not really possible
to proxy things? I think we need to also consider the completely offline case,
but if the UX there is just that installs fail for apps that aren't available
on the proxy, that might be acceptable.
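
The CDN-like behaviour described in the quoted proposal can be modelled in a
few lines of Python. This is a toy in-memory model, not server software: the
class name, the `fetch_upstream` callable and the use of `ConnectionError` to
signal an unreachable upstream are all assumptions made for illustration, and
it says nothing about how HTTPS would be handled.

```python
class PullThroughMirror:
    """Serve objects from a local cache, pulling from upstream on a miss."""

    def __init__(self, fetch_upstream):
        # fetch_upstream(path) returns bytes or raises ConnectionError.
        self._fetch_upstream = fetch_upstream
        self._cache = {}

    def preseed(self, path, data):
        # Lets an admin make selected objects available ahead of time.
        self._cache[path] = data

    def get(self, path):
        if path in self._cache:
            return self._cache[path]
        # Cache miss: this is where an install fails if upstream is down.
        data = self._fetch_upstream(path)
        self._cache[path] = data
        return data
```

In the completely offline case every miss raises, so only pre-seeded objects
can be installed, which matches the UX described above.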

--

Matthew Leeds