[Telepathy] thinking about a new log format for telepathy-logger
Senko Rasic
senko at senko.net
Tue Mar 16 06:33:50 PDT 2010
Hi,
On 03/16/2010 11:56 AM, Cosimo Alfarano wrote:
> My mention to transactions was about it, I remember that savepoints
> (sqlite substitute for them) have limitations, which he explained but I
> cannot find anymore. I'll contact him again.
Transactions in SQLite can't be nested (inside a single connection to
the database). This means that if you interleave writes and reads on
the same connection and you want to use transactions, you're going to
hit problems. And you probably do want transactions when modifying the
database, if only to ensure that a complex series of operations leaves
the db in a consistent state.
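To make the nesting limitation concrete, here is a minimal sketch using
Python's standard sqlite3 module (the language and the table name are my
choice for illustration, not something the logger uses). A second BEGIN
on the same connection is rejected, while SAVEPOINTs can be nested
inside the open transaction.

```python
import sqlite3

# isolation_level=None stops the Python module from issuing BEGIN on its
# own, so every transaction statement below is explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE messages (body TEXT)")  # hypothetical table

conn.execute("BEGIN")
conn.execute("INSERT INTO messages VALUES ('outer')")

try:
    conn.execute("BEGIN")  # a second BEGIN on the same connection
    nested_error = None
except sqlite3.OperationalError as exc:
    nested_error = str(exc)

# Savepoints, by contrast, may be nested inside the open transaction.
conn.execute("SAVEPOINT inner_step")
conn.execute("INSERT INTO messages VALUES ('inner')")
conn.execute("ROLLBACK TO inner_step")  # undoes only the 'inner' insert
conn.execute("RELEASE inner_step")
conn.execute("COMMIT")

rows = [r[0] for r in conn.execute("SELECT body FROM messages")]
```

The outer transaction stays open after the rejected BEGIN, so the first
insert is still committed at the end.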
This can be avoided by using one connection for writing to the db
(this might include SELECTs etc. needed for the update) and a different
one for reading from it (when you actually want to pull data out), so
you don't need to worry about interleaving. You then also don't need to
worry about access from different threads (if SQLite is compiled with
multithreading support, which IIRC it is by default on current
systems) or from different processes.
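A minimal sketch of the two-connection setup, again in Python and with a
made-up file and table name: one connection does all the writing inside
explicit transactions, the other only runs SELECTs and sees the last
committed state.

```python
import os
import sqlite3
import tempfile

# Hypothetical file name; a real file is needed so two connections share it.
path = os.path.join(tempfile.mkdtemp(), "log.db")

writer = sqlite3.connect(path, isolation_level=None)  # explicit transactions
reader = sqlite3.connect(path)                        # only ever runs SELECTs

writer.execute("CREATE TABLE messages (body TEXT)")

def count_messages():
    # fetchall() finishes the statement, so the reader holds no lock afterwards
    return reader.execute("SELECT COUNT(*) FROM messages").fetchall()[0][0]

writer.execute("BEGIN")
writer.execute("INSERT INTO messages VALUES ('hello')")
seen_before_commit = count_messages()  # uncommitted row is not visible
writer.execute("COMMIT")
seen_after_commit = count_messages()   # now it is

writer.close()
reader.close()
```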
The other thing to worry about is writing to the db from multiple
processes. You can do it (SQLite takes care of concurrency issues), but
this means all the components must deal with db management: what to do
if the db needs to be created for the first time, or repaired, or
(eventually) upgraded to a new schema? What to do if there's a lock on
the db while you're trying to do that? Etc. ... a gazillion nasty little
surprises you thought using an RDBMS would free you from.
So, IMHO the best way to use SQLite is as the low-level storage, with
only one service (the Observer) logging to it. Preferably the other
components would use that service for querying too, instead of
reading the DB directly.
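As a rough sketch of what keeping db management in one place could look
like, the single service might open the database through one function
that creates or upgrades the schema. Everything here is an assumption
for illustration: the function name, the schema, and the use of SQLite's
user_version pragma as the version marker.

```python
import sqlite3

SCHEMA_VERSION = 2  # hypothetical current schema version

def open_log_db(path):
    """Open the log database, creating or upgrading the schema if needed."""
    # timeout: how long to wait on a lock held by another connection
    conn = sqlite3.connect(path, timeout=5.0, isolation_level=None)
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before inspecting
    try:
        version = conn.execute("PRAGMA user_version").fetchone()[0]
        if version < 1:
            # first run: create the (made-up) initial schema
            conn.execute("CREATE TABLE messages (sender TEXT, body TEXT)")
        if version < 2:
            # upgrade step: add a timestamp column
            conn.execute("ALTER TABLE messages ADD COLUMN sent_at INTEGER")
        conn.execute(f"PRAGMA user_version = {SCHEMA_VERSION}")
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return conn
```

With only one service calling this, the other components never have to
care whether the file exists yet or which schema version it carries.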
> A logger doesn't need to push loads of data at the same time. But it
> might need instead to rollback. Currently we don't need it, though.
Hm, in which cases (besides an error while logging) would a rollback be
needed?
From my experience, performance-wise, SQLite's bottleneck is twofold:
1. it wants to fdatasync() the file to ensure the contents are
actually written to disk.
2. the transaction journal is on-disk by default (and fsynced, see #1)
Now, these two defaults are the best you can do from a data-integrity
standpoint, but they cost you performance (see the Firefox drama a
year or two ago as an extreme example [there the whole disk was synced
instead of a single file, though]). Depending on how you want to trade
data safety for performance, there are several options:
* don't fdatasync() the file (& hope the system won't crash)
* use an in-memory journal with transactions (& accept some data loss
in the event of an Observer or system crash)
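Both options map onto SQLite pragmas. A minimal sketch in Python, with a
made-up file location; whether either trade-off is acceptable for the
logger is a separate question. Note that the SQLite documentation warns
that with an in-memory journal, a crash in the middle of a transaction
can leave the database file corrupt, not just missing the last messages.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "log.db")  # hypothetical location
conn = sqlite3.connect(path)

# Option 1: skip the sync to disk; a system crash or power loss may
# corrupt the db.
conn.execute("PRAGMA synchronous = OFF")

# Option 2: keep the rollback journal in memory instead of in a file on
# disk. The pragma returns the journal mode now in effect.
journal_mode = conn.execute("PRAGMA journal_mode = MEMORY").fetchone()[0]

synchronous = conn.execute("PRAGMA synchronous").fetchone()[0]  # 0 means OFF
conn.close()
```

As far as I know both settings apply to the connection rather than to
the file, so they would have to be set each time the db is opened.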
Time- or message-count-based batching of transactions might make sense.
E.g. don't commit to the db more than once a second, or more than once
in 5 seconds unless more than 100 messages are queued. But in this case
there's an issue of lag (between when the message was seen and when it
was securely logged), which means components using the logger can't
rely on the information being immediately present there. Instead, these
components should be Observers themselves too (so I don't think it's a
real-world issue).
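One possible reading of the batching idea, sketched in Python. The class
name, the table, and the exact policy (commit when 5 seconds have passed
since the last commit, or earlier once more than 100 messages are
queued) are assumptions for illustration only.

```python
import sqlite3
import time

class BatchingLogger:
    """Queue messages and commit them in batches (hypothetical helper)."""

    def __init__(self, conn, max_delay=5.0, max_queued=100,
                 clock=time.monotonic):
        self.conn = conn
        self.max_delay = max_delay    # seconds allowed between commits
        self.max_queued = max_queued  # flush early once the queue exceeds this
        self.clock = clock            # injectable so the policy can be tested
        self.queue = []
        self.last_flush = clock()

    def log(self, body):
        self.queue.append((body,))
        overdue = self.clock() - self.last_flush >= self.max_delay
        if overdue or len(self.queue) > self.max_queued:
            self.flush()

    def flush(self):
        if self.queue:
            with self.conn:  # one transaction (and one sync) per batch
                self.conn.executemany(
                    "INSERT INTO messages VALUES (?)", self.queue)
            self.queue.clear()
        self.last_flush = self.clock()
```

This only checks the clock when a new message arrives. A real service
would also need a timer (e.g. from its main loop) to call flush() when
the conversation goes quiet, otherwise the last few messages could sit
in the queue indefinitely.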
On modern desktop systems I believe the performance of the default setup
is more than adequate to deal with a high number of incoming messages
(e.g. #ubuntu chatter on release day :) ). The performance problem is
more of an issue on embedded devices with slower disks.
Regards,
Senko