[Telepathy] thinking about a new log format for telepathy-logger

Senko Rasic senko at senko.net
Tue Mar 16 06:33:50 PDT 2010


Hi,

On 03/16/2010 11:56 AM, Cosimo Alfarano wrote:
> My mention to transactions was about it, I remember that savepoints
> (sqlite sobsitute for them) have limitations, which he explained but I
> cannot find anymore. I'll contact him again.

Transactions in SQLite can't be nested (inside a single connection to 
the database). This means that if you try to use both writing and 
reading in the same connection (interleaved), and you want to use 
transactions (and you probably want for modifying the database, if 
nothing else, then to ensure a complex series of operations leaves the 
db in a consistent state), you're going to hit problems.

This can be avoided by ensuring that you're using different connection 
for writing to the db (this might include SELECTs etc., needed for the 
update) and reading from the db (when you actually want to pull data 
from it), so you don't need to worry about interleaving. Also then you 
don't need to worry about accessing from different threads (if SQLite is 
properly compiled to support multithreaded use, which IIRC it is by 
default on nowadays systems), or different processes.

The other thing to worry about is writing to the db from multiple 
processes. You can do it (SQLite takes care of concurrency issues), but 
this means all the components must deal with db management (what to do 
if the db needs to be created for the first time, or repaired, or 
upgraded to new schema (eventually)? what to do if there's a lock on the 
db while you're trying to do that? etc. ... gazillion nasty little 
surprises you thought using RDBMS would free you off).

So, IMHO the best way to use SQLite is for the low-level storage, and
having only one service (the Observer) log to it. Prefferably we could 
have the other components use the service for querying too, instead of 
reading the DB directly.

> A logger doesn't need to push loads of data at the same time. But it
> might need instead to rollback. Currently we don't need it, though.

Hm, in which cases (besides error-while-logging?) would a rollback be 
needed?

 From my experience, performance wise, SQLite's bottleneck is twofold:
   1. it wants to fdatasync() the file to ensure the contents are
actually written to disk.
   2. the transaction journal is on-disk by default (and fsynced, see #1)

Now, these two defaults are the best you can do from data-integrity 
standpoint, but they cost you the performance (see the Firefox drama a 
year or two ago as an extreme example [the whole disk was fsynced 
instead of a single file, though]). Depending on how you want to trade 
security for performance, there are several options:

  * don't fdatasync() the file (& hope the system won't crash)
  * use in-memory journal with transactions (& accept some data loss in
    event of Observer or system crash)

Time/message count based transactioning might make sense. E.g. don't log
more than once a second, or more than once in 5 seconds unless more than 
100 messages are queued. But in this case, there's an issue of lag 
(between when the message was seen and when it was securely logged), 
which means components using the logger can't rely on the information 
being immediately present there - instead, these components should be 
Observers themselves, too (so I don't think it's a real-world issue).

On modern desktop systems I believe the performance of the default setup 
is more than adequate to deal with high number of incoming messages 
(e.g. #ubuntu chatter on release day :) The performance problem is more 
of an issue on embedded devices with slower disks.

Regards,
Senko


More information about the telepathy mailing list