[Telepathy] Designing Telepathy/XMPP end-to-end security

Tue Jun 12 06:42:35 PDT 2012

Now that I've stated requirements, here's how I think the design options
look.

There are two broad use-cases: Text, and the rest. In this mail I'm only
going to consider the non-Text case, because Text is more complicated.

Right now, the non-Text case includes VoIP, because the Jingle XEPs
specify DTLS and SRTP as an optional security layer.

I would like it to include 1-1 Tubes and file transfers, which I think
should be done by converting 1-1 Tubes into a Jingle application, and
supporting XEP-0234 (Jingle) file transfers. We could optionally also
support XTLS for "Google Share" (HTTP over pseudo-TCP over UDP over
Google-P2P, where Google-P2P resembles an early version of Jingle), but
I don't see much advantage in that, since the point of supporting Google
Share is to interoperate with Google Talk, which seems unlikely to add
end-to-end encryption.

I do not propose to try to secure Stream Initiation: I think we should
just switch to Jingle and secure that, instead.

OTR is not directly applicable, because OTR acts on message body text
rather than on arbitrary XML stanzas. In principle OTR could be extended
to support tunnelling an XML stream through message body text, but that
seems like a layering violation to me, and it's not clear to me that it
has advantages over XTLS/DTLS in any case ("strong deniability" via
leaking MAC keys is the only unique feature, and that seems rather
tenuous as explained in a previous mail).

I think we can simplify by assuming that only one security technology is
supported for each use case (XTLS for future Jingle-enabled Tubes and
file transfers; DTLS + SRTP for VoIP).

A Telepathy Channel corresponds to one Jingle session[1], so if we wish
to negotiate end-to-end security using XTLS or DTLS, we must either do
so as soon as the Channel opens, or have a specific "channel A replaces
channel B" migration step: we may not start an insecure session and then
seamlessly "migrate" to make it secure.

However, we might be able to start with a channel that is secure against
passive attackers (but vulnerable to a man-in-the-middle attack), and
verify that a man-in-the-middle attack has not taken place (i.e.
authenticate the peer) while application data is already flowing. This
makes little sense for file transfers (the unauthenticated peer already
has part of your secret file), but is relevant to voice calls
(authenticate the peer before you start talking about secrets).
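
To make the distinction concrete, here is a minimal sketch in Python of
the rule "data may flow before the peer is authenticated only for
payloads where late verification still helps". None of these names
exist in Telepathy; Payload, PeerState, LATE_VERIFICATION_OK and
may_send_data are all made up for illustration, and treating Tubes like
VoIP is my assumption.

```python
from enum import Enum


class Payload(Enum):
    VOIP = "voip"
    TUBE = "tube"
    FILE_TRANSFER = "file-transfer"


class PeerState(Enum):
    # Secure against passive attackers only; a MITM is still possible.
    ENCRYPTED_UNVERIFIED = "encrypted, peer not authenticated"
    # The peer has been authenticated, so a MITM has been ruled out.
    VERIFIED = "encrypted, peer authenticated"


# Hypothetical policy: payloads for which it still makes sense to verify
# the peer after application data has started flowing.
LATE_VERIFICATION_OK = {Payload.VOIP, Payload.TUBE}


def may_send_data(payload: Payload, state: PeerState) -> bool:
    """Return True if application data may flow in the given state."""
    if state is PeerState.VERIFIED:
        return True
    # Unverified: only allowed where the user can hold back secrets
    # until verification completes (not the case for a file transfer).
    return payload in LATE_VERIFICATION_OK
```

With this policy a voice call can start immediately and be verified
later, whereas a file transfer has to wait for verification.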

Here's a design sketch (assume that Alice and Bob are both Telepathy users):

* Alice requests a Channel of the appropriate type, to Bob.
  Alice is the InitiatorHandle in Telepathy, Alice's Channel
  has Requested=TRUE, and Alice will be the initiator in Jingle,
  and the "client" in TLS.

* Alice may include properties in the channel request to indicate
  that she requires, or does not require, particular security
  properties. If she does not, the connection manager implementation
  must choose whether to switch on end-to-end security or not.

* If end-to-end security is not enabled after that decision
  (not requested and not on by default, or explicitly disabled),
  the channel proceeds as it does in current Telepathy. We assume
  here that end-to-end security was selected.

* Optionally (open design question: do we need this?), Alice's
  connection manager could pop up a Requested=FALSE channel
  representing the request "give me the certificate you want
  me to use" and containing a pointer to the "payload" channel,
  to be answered by Alice's system certificate store
  (e.g. gnome-keyring/seahorse) or interactively. If the channel
  is closed without the question being answered (probably by
  Mission Control because there is no Handler), or if
  we decide that this question is unnecessary, the CM must
  choose a cert automatically, perhaps by generating its own
  self-signed cert, or by having the cert to use be one of
  the properties in the channel request. The session-initiate
  cannot be sent until Alice answers or this channel is closed.

* Alice's connection manager sends a Jingle session-initiate. Bob's CM
  receives that session-initiate and parses it. If Bob's CM does not
  support the end-to-end security protocol, the channel proceeds as it
  does in current Telepathy (and might just terminate before it is
  established, if Alice's CM is configured to insist on end-to-end
  security). We assume here that Bob's CM does support a compatible
  protocol.

* Optionally (open design question: do we need this?), Bob's
  CM could pop up a channel representing the request "give me
  the cert you want me to use". This is the same as when Alice
  was asked. The Jingle session-accept cannot proceed until Bob
  answers; neither can the TLS handshake over the out-of-band
  (ICE) data connection.

* Bob's CM sends a Jingle session-accept to Alice. In parallel,
  the two CMs perform the TLS handshake.

* If the CMs were using certificates for authentication,
  at this point they can compare the claimed fingerprint
  from the Jingle XML with the actual certificate in the TLS
  flow. This means the TLS flow is at least "as secure as IM":
  if Alice and Bob know that they have hop-by-hop security
  from client to server (common), and between their servers
  (rare), then the out-of-band TLS flow is also secure.

* If Alice has requested protection from active (man-in-the-middle)
  attacks, i.e. Conn.I.Securable.Verified=TRUE, then her CM pops
  up a channel analogous to the current Chan.T.ServerTLSConnection,
  representing the question "do you trust that this is actually Bob?".
  Verification could use either certificates or (where applicable) TLS-SRP.
  Optionally (design question: should it?), this channel could
  pop up regardless.

* Because Alice might be planning to authenticate Bob informally
  by recognising his voice, the verification channel probably
  needs a method with the semantics "not verified, but let
  application data flow anyway, and I'll make up my mind
  later".

* If Bob has configured his CM to always request protection from
  active attacks (design question: should this option exist?),
  he gets a similar verification channel.

* Assume that Alice and Bob are happy with an unverified
  "leap of faith" for the moment. Application data flows
  and they communicate (e.g. VoIP).

* If the application payload is suitable (e.g. VoIP or
  Tubes, but not file transfer), Alice may wish to "upgrade"
  to a verified channel later. If the verification channel
  is still open, she can inspect its properties and call
  its methods. If not, she can call some method on the
  payload channel (probably on the Securable interface)
  to open a new verification channel. Bob can do the same.

* If Alice and Bob used a "leap of faith" at the beginning of the
  conversation, there should be some way in which their clients
  can store the certificate fingerprints for future use, avoiding
  the need for "leap of faith" in future, in favour of SSH-style
  key continuity. I think this should be in the UI layer (e.g.
  cert pinning in gnome-keyring/seahorse), not the CM.
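
Two of the steps above, comparing the fingerprint claimed in the Jingle
XML with the certificate actually seen in the TLS flow, and SSH-style
key continuity, can be sketched roughly as follows. This is only an
illustration under assumptions: SHA-256 over the DER-encoded
certificate is my choice of fingerprint, the in-memory dict stands in
for whatever store the UI layer would really use, and fingerprint,
matches_claim and check_continuity are hypothetical names, not
Telepathy API.

```python
import hashlib


def fingerprint(cert_der: bytes) -> str:
    """Hex SHA-256 digest of a DER-encoded certificate (assumed format)."""
    return hashlib.sha256(cert_der).hexdigest()


def matches_claim(claimed_hex: str, cert_der: bytes) -> bool:
    """Check the fingerprint claimed in signalling against the TLS cert.

    A match only makes the TLS flow "as secure as IM": an attacker who
    can rewrite the XMPP stanzas can also rewrite the claim.
    """
    return claimed_hex.lower() == fingerprint(cert_der)


def check_continuity(pins: dict, peer_jid: str, cert_der: bytes) -> str:
    """SSH-style key continuity against a store of pinned fingerprints.

    Returns "leap-of-faith" the first time a peer is seen (and pins the
    fingerprint), "known" if the cert matches the pin, and "changed" if
    it does not. A changed cert is not re-pinned automatically; the
    user would have to decide.
    """
    seen = fingerprint(cert_der)
    known = pins.get(peer_jid)
    if known is None:
        pins[peer_jid] = seen
        return "leap-of-faith"
    if known == seen:
        return "known"
    return "changed"
```

In the design above the pin store would live in the UI layer rather
than in the CM; the dict is only a placeholder for that.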

[1] When we do file transfer, one Jingle session can be shared between
multiple Channels (for the "offering multiple files" use-cases of
XEP-0234 or Google Share), but each Channel still maps to a single
Jingle session.