Entitlement server and protected repos support

Fri May 24 13:11:23 UTC 2019

On Thu, 23 May 2019, at 18:30, Damián Nohales wrote:
> On Thu, May 23, 2019 at 9:33 AM Philip Withnall <philip at tecnocode.co.uk> wrote:
> >
> > On Thu, 23 May 2019, at 08:19, Alexander Larsson wrote:
> > > On Thu, May 23, 2019 at 12:56 AM Damián Nohales <damian at endlessm.com> wrote:
> > > > Then we have the P2P case:
> > > >
> > > > 1. A ref is about to be pulled (repo_pull is called).
> > > > 2. The P2P code path is used (we have a collection_id)
> > > > 3. ostree_repo_find_remotes_async is called
> > > > 4. For each finder result, we set a new "OstreeRepoFinderResult {
> > > > GVariant *options }" field that will override the common
> > > > ostree_repo_pull_from_remotes_async's options argument. That new field
> > > > is populated with the Bearer token in the http-headers field of the
> > > > finder result option.
> > > > 5. ostree_repo_pull_from_remotes_async will use the options specific
> > > > to the remote to fetch the objects.
> >
> > I don’t think you want to extend the OstreeRepoFinderResult struct, because it’s specific to a remote, whereas you want to use a different bearer token for each ref/commit.
> 
> Why use a different bearer per ref/commit? The token (at least for
> flat-manager) could be suitable for the entire ref or many refs in one
> remote.

I think I may have been getting my terminology confused between the token which is sent to the token/entitlement server to authenticate the user, and the tokens which are returned by the token/entitlement server to pass to the repository server to prove that the user should be allowed to download individual commit objects.

What terminology do you want to use for these two things? For the remainder of this e-mail I’m going to use ‘bearer token’ for the former, and ‘ref token’ for the latter.

Regarding the OstreeRepoFinderResult struct, I was thinking about ref tokens, which (I understand) can vary per ref, so can’t be put into the OstreeRepoFinderResult struct. In any case, making that struct mutable after it’s returned by ostree_repo_find_remotes_async() would make every use of it more complex. I’d prefer to keep it immutable once constructed, since that makes the struct a lot easier to reason about.

> > Instead, add an option to ostree_repo_pull_from_remotes_async() which maps refs (or commit checksums) to an a(ss) of HTTP headers to use when pulling that commit metadata. Each set of HTTP headers would contain the relevant bearer token header.
> 
> OK, I like this idea more. You mean a hash table that maps ref +
> remote to HTTP headers?

Essentially, ostree_repo_pull_from_remotes_async() needs to have enough information to be able to pull any ref from any of the various available remotes (servers, P2P peers, USB sticks). That means it needs enough information to pick the appropriate ref token, from those it’s passed, when downloading commit metadata from any remote.

Thinking about it a bit more, if we model a ref token as a signed binding between the tuple (collection ID, ref name, token server identifier, user identifier) and some validity interval, then the relevant identifiers to key the map passed to ostree_repo_pull_from_remotes_async() by are (collection ID, ref name, token server identifier), and we assume that the user identifier is constant for all the tokens we’re dealing with at any time.

So, on reflection, I think I might actually mean: add an option to ostree_repo_pull_from_remotes_async() which maps (collection ID, ref name, token server identifier) to a ref token.
 • The collection ID is needed because ref names aren’t guaranteed to be globally unique, and one repository server could be mirroring a ref from another repository which has the same name as another ref it’s serving, but different content. A good example of where this happens is the `ostree-metadata` ref, which exists in all OSTree repositories but contains different content in each. It’s a P2P thing, but needs to be supported everywhere.
 • The ref name is needed because ref tokens are bound to refs. The combination of (collection ID, ref name) globally identifies an app.
 • The token server identifier is needed because two repositories might offer the same app, but be using different token servers, and hence the tokens issued by one server won’t be valid for the other repository. ostree_repo_pull_from_remotes_async() can download a ref (identified using its collection ID and ref name) from any server which claims to offer it. For example, if Flathub and the Endless apps repository both started hosting a Skype app, and each had a different token server configured (say, the EOS one linked to someone’s EOS account, and the Flathub one linked to some online payment provider), we’d want to make sure to send the ref token only to the repository associated with the token server which had issued that token.
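
To make the shape of that map concrete, here is a rough sketch in Python rather than C. Everything in it is hypothetical: RefTokenKey, token_for_remote, the collection ID, the ref name and the token server identifiers are made up for illustration and are not libostree API. It only shows the lookup rule: a ref token is sent to a remote only if it was issued by the token server associated with that remote.

```python
from typing import Dict, NamedTuple, Optional


class RefTokenKey(NamedTuple):
    """Hypothetical key for the proposed option (not an existing libostree type)."""
    collection_id: str
    ref_name: str
    token_server_id: str


def token_for_remote(ref_tokens: Dict[RefTokenKey, str],
                     collection_id: str,
                     ref_name: str,
                     remote_token_server_id: str) -> Optional[str]:
    """Return the ref token to send to a remote, or None if we hold no token
    issued by the token server that remote is associated with."""
    key = RefTokenKey(collection_id, ref_name, remote_token_server_id)
    return ref_tokens.get(key)


# Illustrative data only: the same ref offered by two repositories which are
# configured with different token servers.
REF = "app/com.example.Chat/x86_64/stable"
ref_tokens = {
    RefTokenKey("org.example.Apps", REF, "token-server-a"): "token-issued-by-a",
    RefTokenKey("org.example.Apps", REF, "token-server-b"): "token-issued-by-b",
}

# A remote tied to token server A only ever gets the token issued by A.
print(token_for_remote(ref_tokens, "org.example.Apps", REF, "token-server-a"))
```

A remote whose token server we hold no token for gets None, so nothing is sent to it.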

> > > I think this will do the right thing for the actual upstream remote.
> > > However, the p2p case is interesting, because what does it even mean
> > > in terms of a private repo? What ensures that some random p2p node
> > > doesn't give you the commits even without the bearer token? Maybe that
> > > node did do an authenticated download, and now it's just another commit
> > > in the local repo that it is being nice to the peers on the local
> > > network by giving out.
> >
> > We have three options:
> >  1) Allow P2P peers to forward paid-for apps without any checks. Not really an option.
> >  2) Don’t allow P2P peers to forward paid-for apps at all.
> >  3) Allow P2P peers to forward paid-for apps on receipt of a valid bearer token, with some validation of the token.
> >
> > (2) is probably the easiest option for the first iteration of developing this, but we presumably want to allow (3) in the future. So the bearer tokens need to be designed so that either:
> >  (a) a peer Bob can validate that a token from another peer Alice actually comes from that peer, that it was issued by a token server which Bob trusts, and that it hasn’t expired
> >  (b) Bob can forward a token from Alice to the token server (which Bob must trust) and have the token server validate it
> >
> > (a) would allow full decentralisation, and is probably necessary if we want this to work for machines which are completely disconnected from the internet and which receive their apps over USB sticks. (b) probably isn’t significantly simpler.
> 
> The flat-manager repo relies on validating the token since
> flat-manager is the one signing the token and has the secret to
> validate the signature. It's like the peer has to share the secret
> with the original repo somehow? I don't know if flat-manager supports
> signing with public and private keys, maybe that's a way to transfer
> and validate the token between peers.

Sharing the secret with the original repo is not an option — it would mean the secret is public. I suspect the solution here is for the ref tokens to contain some identifier of the user/machine which they were issued for, and to be signed using the token server’s private key. Then Bob, who has a copy of the token server’s public key, can verify the token server’s signature, and check that the signed user identifier matches Alice’s claimed user identifier. (So the user identifier has to be unforgeable and public. An IP address might work, but other things might be better.)
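
As a sketch of the checks Bob would make (again in Python, and again with assumptions): the token format here is a JSON payload plus a detached signature, which is my invention for illustration and not what flat-manager issues. The field names (token_server_id, user_id, expires_at) are hypothetical. Signature verification is deliberately left as a caller-supplied function, since a real implementation would need an asymmetric scheme from a crypto library; a shared-secret MAC would not do, for the reason given above.

```python
import json
from typing import Callable, Mapping

# (public_key, signed_payload, signature) -> True if the signature is valid.
# Placeholder for a real asymmetric verification routine.
Verifier = Callable[[bytes, bytes, bytes], bool]


def peer_accepts_token(payload: bytes,
                       signature: bytes,
                       claimed_user_id: str,
                       trusted_keys: Mapping[str, bytes],
                       verify: Verifier,
                       now: float) -> bool:
    """Decide whether a peer should honour a ref token presented by another peer."""
    try:
        claims = json.loads(payload)
        server_id = claims["token_server_id"]
        user_id = claims["user_id"]
        expires_at = float(claims["expires_at"])
    except (ValueError, KeyError, TypeError):
        return False  # malformed token
    if not isinstance(server_id, str):
        return False

    # Only accept tokens issued by a token server this peer trusts.
    public_key = trusted_keys.get(server_id)
    if public_key is None:
        return False

    # The claims are only meaningful once the signature over them checks out.
    if not verify(public_key, payload, signature):
        return False

    # The token must have been issued for the peer presenting it, and must
    # not have expired.
    return user_id == claimed_user_id and now < expires_at
```

The token server identifier is read from the payload before the signature is checked, but only to select which public key to verify against; nothing else in the payload is trusted until verification succeeds.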

> I'm not familiar with how this P2P app sharing works, so I don't know how
> the peer could "know" that it needs to validate a token before sending
> the app. I should probably investigate more how P2P works.

Basically, each peer exposes its local OSTree/flatpak repositories over the network, just like a normal OSTree repository is hosted. The peer uses DNS-SD (Avahi) to advertise which refs it’s hosting, then other peers download them using HTTP over the LAN.

Please don’t think of P2P in OSTree as a special case. While its deployment may be low at the moment, architecturally the ‘normal’ way of downloading refs from a server on the internet is a special case of P2P. It all uses the same code and concepts. P2P support cannot be bolted on to a feature after it’s written.

> >
> > So, I think (and please check this carefully — it’s only initial musing), this means that the token needs to:
> >  • Contain a public identifier for the user it’s generated for, which can be checked by peers of that user’s machine. Perhaps the machine’s IP address, for example.
> >  • Contain a public identifier for the token server which generated the token (so that peers can check whether they trust it).
> >  • Contain an expiry timestamp on some global clock.
> >  • The entire payload is signed by the key associated with that token server.
> 
> In the current implementation, the token is not generated by the
> token-server; it's requested from flat-manager through the
> /token_subset API, which tells flat-manager to create a
> permission-limited token that is a subset of the permissions of the
> token that the token-server uses to communicate with flat-manager. So
> the only one able to validate the token is the remote, through
> flat-manager.

I think that might have to change, then.

Philip