[systemd-devel] Changing database formats for systemd-journald
Wol
antlists at youngman.org.uk
Sat Jun 28 14:02:34 UTC 2025
On 28/06/2025 10:48, Mantas Mikulėnas wrote:
> Maybe it could, but journal entries do not have fixed columns (except
> for timestamps and seqnum), so you wouldn't have one neat SQL table of
> entries – you'd probably end up with a large <id, fieldname, value>
> "pile of attributes" table and I'm not sure if that would perform any
> better. (The current format *is* indexed – each field has its own index,
> not just specific ones.) I think the journal is closer to a "wide-column
> store" or "NoSQL database" than a traditional SQL database? I'm not even
> close to being knowledgeable in this topic, though.
Look at a NOSQL database, not a NoSQL database ...
There are three generations of NoSQL databases, according to Wikipedia:
the original database called NoSQL; the second-generation "Not Only SQL"
databases, primarily Pick (aka MultiValue) and friends; and lastly the
"No SQL" databases.
MultiValue is a hashed key-value string database. WITH A SCHEMA. The
schema is descriptive, not prescriptive like relational, so it's much
more flexible, but because the schema exists you can still query the
data. And you can create indices and so on.
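To make "hashed key-value string database with a descriptive schema"
a bit more concrete, here is a rough Python sketch. It is only an
illustration under stated assumptions, not how Pick or ScarletDME is
actually implemented: the class MVFile, its methods, the delimiter
characters and the sample field names are all hypothetical. Each record
is a single delimited string stored in a hashed group, and the
dictionary merely describes where a named field lives, so records are
free to omit fields or hold several values in one.

```python
import zlib

# Illustrative delimiters (an assumption for this sketch).
AM = "\xfe"  # separates fields (attributes) within a record
VM = "\xfd"  # separates multiple values within one field


class MVFile:
    """Hypothetical MultiValue-style file: hashed groups of string records."""

    def __init__(self, modulo=7):
        # The key hashes straight to one group; there is no query planner.
        self.groups = [{} for _ in range(modulo)]
        # Descriptive schema: field name -> position inside the record.
        self.dictionary = {}

    def _group(self, key):
        index = zlib.crc32(key.encode("utf-8")) % len(self.groups)
        return self.groups[index]

    def define(self, name, position):
        # Adding a dictionary entry never rewrites existing records.
        self.dictionary[name] = position

    def write(self, key, fields):
        # fields is a list of lists: one list of values per field position.
        record = AM.join(VM.join(values) for values in fields)
        self._group(key)[key] = record

    def read(self, key, name):
        attrs = self._group(key)[key].split(AM)
        position = self.dictionary[name]
        if position >= len(attrs) or attrs[position] == "":
            return []  # field absent from this record: allowed
        return attrs[position].split(VM)


journal = MVFile()
journal.define("MESSAGE", 0)
journal.define("_SYSTEMD_UNIT", 1)
journal.write("1001", [["Started foo"], ["foo.service"]])
journal.write("1002", [["disk full"]])  # no unit field at all
journal.define("TAGS", 2)               # schema extended after the fact
journal.write("1003", [["multi"], [], ["a", "b"]])

print(journal.read("1001", "_SYSTEMD_UNIT"))
print(journal.read("1002", "_SYSTEMD_UNIT"))
print(journal.read("1003", "TAGS"))
```

The point of the sketch is that a read by key touches exactly one
group, whatever the total number of records, and that the dictionary
can grow without touching the stored data.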
It's easy to prove that MultiValue data retrieval is "blindingly fast",
provided the analyst designs the database correctly. (Here is the major
difference between MultiValue and relational: relational throws brute
force at finding what it's looking for, while MultiValue expects the
programmer/analyst to use their brain cells.) MultiValue does not have
or need an optimiser, because the cost of the optimiser exceeds the
possible gains.
> and it too sometimes takes minutes to perform the first query from
> cold cache
This is the problem. It's all very well proving "there's no way you can
retrieve the data any faster", if the fastest possible is still horribly
slow.
MultiValue, though, optimises the *common* case of data retrieval; if
things go wrong, the worst-case retrieval time can go through the roof.
Relational tries to guarantee all queries take similar time, at the cost
of all queries getting slower with database size. It's not long before
MultiValue can provide a five nines guarantee of being faster than
relational.
Message me off-line if you're interested, or search Wikipedia for things
like "Pick Operating System", "MultiValue Database", or "NoSQL
databases". You'll probably find a bunch of interesting links. There's
also a GPL version called ScarletDME; you'll find me on that project.
(If you decide it would work for you, I think there's a whole bunch of
people who would dive in and help to get more exposure for the project.)
Cheers,
Wol