[Telepathy] Folks status, the addressbook problem

Xavier Claessens xclaesse at gmail.com
Tue Oct 30 07:40:22 PDT 2012


I've recently looked again at Folks performance and it's not good. I've
been working on the N900 and N9 addressbooks and they are not really good
either, even if in both cases we achieved good enough performance for a
real commercial product. But can we make Folks the perfect solution if
we take the time to do it properly?

First, some definitions (using Folks' terms):
 - a "Store" is a source of contacts (e.g. Telepathy, EDS)
 - a "Persona" is a wrapper to expose all info we have about a contact from a
   given source (e.g. TpContact, EContact wrappers)
 - an "Individual" is a wrapper to expose all info we have about a set of one or
   more personas.
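
To make the three terms concrete, here is a minimal sketch of the data model.
It is not the Folks API: the class fields and the "first non-empty alias wins"
policy are assumptions made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Persona:
    """One contact as seen by a single store."""
    store: str        # e.g. "telepathy" or "eds"
    contact_id: str   # identifier of the contact inside that store
    alias: str = ""
    emails: tuple = ()


@dataclass
class Individual:
    """A set of one or more personas describing the same person."""
    personas: list = field(default_factory=list)

    @property
    def alias(self):
        # Hypothetical policy: the first non-empty alias wins.
        for persona in self.personas:
            if persona.alias:
                return persona.alias
        return ""

    @property
    def emails(self):
        # Union of the emails of all personas, in first-seen order.
        seen = []
        for persona in self.personas:
            for email in persona.emails:
                if email not in seen:
                    seen.append(email)
        return seen
```

An Individual built from an EDS persona without alias and a Telepathy persona
with alias "Bob" would expose "Bob" and the union of both email lists.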

Addressbook use cases
=====================
1) The Contact list app: it needs to load all contacts and let the user
   (un)merge some of them.
2) The Chat/Call/FileTransfer/etc. apps: they need to load only one individual,
   or a small subset of individuals.
3) It needs to scale to 5-10k personas in some stores. Blocking for 2s to
   display an incoming call window is not an option. Pre-loading the full
   addressbook in multiple apps is suboptimal as well.
4) We want to be smart and implicitly merge contacts, but the user always has
   the final word and can unmerge contacts we decided to merge.

State of Folks
==============
Please correct me if I got something wrong, I don't understand everything in
folks...

The good:
1) It has a nice API to represent a Persona and an Individual, with common
   interfaces to expose each piece of info (alias, presence, caps, name, etc).
2) Each store has Persona subclasses to give direct access to the underlying
   service object. So from a Persona you can actually get the TpContact.
3) It has an aggregator to create individuals based on all the personas from all
   stores. So a contact list app can easily get a list of individuals.

The bad:
1) It does implicit merging based on some persona fields. So if 2 personas
   (possibly from different stores) have the same email address they will be in
   the same individual. This is bad because to figure out all the personas of an
   individual it must parse the full vcard of ALL personas to match the email.
2) If the user decides to merge 2 individuals, the information that their
   respective personas belong together is arbitrarily stored in EDS as a vcard
   field. Again this means that to figure out all the personas that belong
   together you need to parse ALL vcards.
3) The Telepathy backend does some obscure offline caching. I'm not sure any app
   actually uses that, and I'm not even confident that it works correctly.
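
To illustrate point 1, here is a rough sketch of merging by shared email
address. It is not the Folks aggregator, just a union-find over
(persona uid, emails) pairs, and the function name and input shape are made up
for the example. The point is that no group is known to be complete until
every persona has been read.

```python
from collections import defaultdict


def aggregate_by_email(personas):
    """Group personas that share at least one email address.

    personas: iterable of (persona_uid, [email, ...]) pairs.
    Returns a list of sets of persona uids.
    """
    parent = {}

    def find(x):
        # Follow parent links to the group root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_owner = {}  # normalized email -> first persona seen with it
    for uid, emails in personas:
        parent.setdefault(uid, uid)
        for email in emails:
            key = email.strip().lower()
            if key in first_owner:
                union(first_owner[key], uid)
            else:
                first_owner[key] = uid

    groups = defaultdict(set)
    for uid in parent:
        groups[find(uid)].add(uid)
    return list(groups.values())
```

Here the last persona read can still join two groups that looked unrelated
until then, which is why an app that wants a single individual cannot stop
early.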

The XMPP VCard problem
======================
With XMPP, we need to explicitly request the vcard for each contact. This is not
something we want to do for all contacts each time a jabber account connects.

I think this is an XMPP-specific issue that should be fixed in gabble. We are
moving the avatar cache into the CM (TpAvatarsMixin), and we are also
considering doing a nickname cache in the CM (TpNamesMixin). I think we should
do a full vcard cache in gabble.

Note that I'm not speaking about a cache for when we are offline here (even if
the implementation could be shared, see next topic). I'm speaking about a way to
have meaningful (even if outdated) info from tp_contact_dup_contact_info(). IIRC
Skype is already good at this since it pushes the full vcard, right?

Offline Telepathy account problem
=================================
To have the same roster when we are online and offline, we need to cache the
telepathy roster. If the email address of an individual comes from a TpContact,
it makes no sense to lose it when we go offline.

A somewhat related problem is incremental roster download in the CM. In both
cases we actually need a representation of the telepathy roster on disk. So I
suggest the CM should store the full roster on disk in a format/location defined
in the spec. That way folks' telepathy backend can iterate over all accounts: if
online, create Personas from TpContacts; if offline, create Personas from the
CM's disk cache. Of course when folks reads the disk DB, since the account is
offline, the DB is immutable, so we don't need any fancy change notification.
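
The backend loop described above could look roughly like this. Everything here
is an assumption for the sake of the sketch: the one-JSON-file-per-account
layout, the helper names, and plain dicts standing in for TpContacts. The real
format and location would have to come from the spec.

```python
import json
import os
import tempfile


def cache_path(cache_dir, account_path):
    # Hypothetical layout: one JSON file per account.
    return os.path.join(cache_dir, account_path.replace("/", "_") + ".json")


def write_roster_cache(cache_dir, account_path, contacts):
    """Stand-in for what a CM would do while the account is online."""
    final = cache_path(cache_dir, account_path)
    tmp = final + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(contacts, f)
    os.replace(tmp, final)  # readers never see a half-written file


def load_personas(accounts, cache_dir):
    """Return {account_path: [contact dict, ...]} for all accounts."""
    result = {}
    for account in accounts:
        if account["online"]:
            # Online: use the live contacts (TpContacts in real life).
            contacts = account["contacts"]
        else:
            # Offline: the cache cannot change, so a single read is enough.
            try:
                path = cache_path(cache_dir, account["path"])
                with open(path, encoding="utf-8") as f:
                    contacts = json.load(f)
            except FileNotFoundError:
                contacts = []
        result[account["path"]] = contacts
    return result


def demo():
    with tempfile.TemporaryDirectory() as cache_dir:
        write_roster_cache(cache_dir, "gabble/jabber/acc0",
                           [{"id": "bob@example.org"}])
        accounts = [
            {"path": "gabble/jabber/acc0", "online": False, "contacts": []},
            {"path": "gabble/jabber/acc1", "online": True,
             "contacts": [{"id": "eve@example.org"}]},
        ]
        return load_personas(accounts, cache_dir)
```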

Folks DB
========
Folks is about merging Personas, so I suggest having a separate DB that does
just that. An individual is in the DB if and only if it has more than one
Persona. The DB must be optimized for 2 types of queries: 1) given an individual
uid, give the uids of all its linked personas. 2) given a persona uid, give its
individual uid.
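
As a rough sketch of such a DB, here is one possible shape using SQLite. The
choice of SQLite, the table layout and the function names are all assumptions,
not a proposal for the actual storage. Both queries are a single indexed
lookup, and unlinked personas never appear in the table.

```python
import sqlite3
import uuid


def open_link_db(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS links (
                      persona_uid TEXT PRIMARY KEY,
                      individual_uid TEXT NOT NULL)""")
    db.execute("""CREATE INDEX IF NOT EXISTS links_by_individual
                      ON links (individual_uid)""")
    return db


def personas_of(db, individual_uid):
    """Query 1: individual uid -> uids of its personas."""
    rows = db.execute(
        "SELECT persona_uid FROM links WHERE individual_uid = ? "
        "ORDER BY persona_uid", (individual_uid,)).fetchall()
    # Not in the DB: a single-persona individual, which is its own persona.
    return [row[0] for row in rows] or [individual_uid]


def individual_of(db, persona_uid):
    """Query 2: persona uid -> uid of its individual."""
    row = db.execute(
        "SELECT individual_uid FROM links WHERE persona_uid = ?",
        (persona_uid,)).fetchone()
    return row[0] if row else persona_uid


def link(db, persona_uids):
    """Merge the given personas (and whatever they were already linked to)."""
    members = set(persona_uids)
    for uid in list(members):
        members.update(personas_of(db, individual_of(db, uid)))
    if len(members) < 2:
        raise ValueError("need at least two personas to link")
    new_uid = str(uuid.uuid4())
    with db:  # one transaction
        db.executemany("INSERT OR REPLACE INTO links VALUES (?, ?)",
                       [(p, new_uid) for p in sorted(members)])
    return new_uid


def unlink(db, persona_uid):
    """Remove one persona from its individual."""
    individual_uid = individual_of(db, persona_uid)
    if individual_uid == persona_uid:
        return  # was not linked
    with db:
        db.execute("DELETE FROM links WHERE persona_uid = ?", (persona_uid,))
        left = db.execute(
            "SELECT COUNT(*) FROM links WHERE individual_uid = ?",
            (individual_uid,)).fetchone()[0]
        if left < 2:
            # Keep the invariant: only multi-persona individuals are stored.
            db.execute("DELETE FROM links WHERE individual_uid = ?",
                       (individual_uid,))
```

With this shape an incoming call window only needs individual_of() followed by
personas_of(), without loading any other contact.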

This means we have to define persona and individual uid:

 * Persona uid: I suggest using the N9 format:
   "telepathy://<account path>/<contact-id>",
   "eds://<ESource id>/<EContact id>".
   Or anything from which we can fetch a TpContact or EContact.

 * Individual uid: there are 2 cases, does it have just one Persona or more?
   - just one: use its persona uid.
   - more than one: use a uuid and keep it in the DB.
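
A small sketch of building and parsing persona uids in that shape. The
escaping rule is my assumption (the format above does not say how to handle a
"/" inside a contact id): the contact id is percent-encoded so that the last
"/" always separates it from the account path or ESource id.

```python
from urllib.parse import quote, unquote


def make_persona_uid(store, container, contact_id):
    """Build "<store>://<container>/<contact-id>".

    container is the account path (telepathy) or the ESource id (eds).
    """
    return "%s://%s/%s" % (store,
                           quote(container, safe="/"),
                           quote(contact_id, safe=""))


def parse_persona_uid(uid):
    """Split a persona uid back into (store, container, contact_id)."""
    store, sep, rest = uid.partition("://")
    container, slash, contact_id = rest.rpartition("/")
    if not (sep and slash and store and container and contact_id):
        raise ValueError("not a persona uid: %r" % (uid,))
    return store, unquote(container), unquote(contact_id)
```

For example, make_persona_uid("telepathy", "gabble/jabber/acc0",
"bob@example.org") gives "telepathy://gabble/jabber/acc0/bob%40example.org",
and parse_persona_uid() returns the three original parts.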

We need change notification when that DB is changed. Probably something like
dconf, where a daemon does the writing and clients do direct reading. Maybe it's
a crazy idea, but could we actually use dconf for this? We could have
/org/folkd/<individual uid> keys of type "as", each holding a list of persona
uids. I'm not sure how dconf will scale if we have thousands of such keys?

I think merge/unmerge operations are not something we do daily, so it's not the
most important operation to optimize.

Questions?
==========
I hope my suggestions make sense. Of course I could be terribly wrong about
some stuff, so please tell me your opinion :-)

Regards,
Xavier Claessens.


