[systemd-devel] Changing database formats for systemd-journald

Sat Jun 28 14:02:34 UTC 2025

On 28/06/2025 10:48, Mantas Mikulėnas wrote:
> Maybe it could, but journal entries do not have fixed columns (except 
> for timestamps and seqnum), so you wouldn't have one neat SQL table of 
> entries – you'd probably end up with a large <id, fieldname, value> 
> "pile of attributes" table and I'm not sure if that would perform any 
> better. (The current format *is* indexed – each field has its own index, 
> not just specific ones.) I think the journal is closer to a "wide-column 
> store" or "NoSQL database" than a traditional SQL database? I'm not even 
> close to being knowledgeable in this topic, though.
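
For concreteness, the "pile of attributes" layout described above could
be sketched roughly like this, using Python's sqlite3 module. The table,
column and index names are my own invention for illustration, and the
sample entries are made up; this is not a proposal for an actual journal
schema.

```python
import sqlite3

# Hypothetical schema: one row per (entry, field) pair, i.e. the
# <id, fieldname, value> layout. All names here are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attrs (entry_id INTEGER, field TEXT, value TEXT)")
conn.execute("CREATE INDEX attrs_by_field ON attrs (field, value)")

# Made-up sample entries; note that they do not share the same fields.
entries = [
    {"MESSAGE": "Started foo", "_SYSTEMD_UNIT": "foo.service", "PRIORITY": "6"},
    {"MESSAGE": "bar failed", "_SYSTEMD_UNIT": "bar.service", "PRIORITY": "3",
     "CODE_FILE": "bar.c"},
]
for entry_id, fields in enumerate(entries, start=1):
    conn.executemany(
        "INSERT INTO attrs VALUES (?, ?, ?)",
        [(entry_id, name, value) for name, value in fields.items()],
    )

def find_entries(field, value):
    """Return the ids of entries where the given field equals value."""
    rows = conn.execute(
        "SELECT entry_id FROM attrs WHERE field = ? AND value = ? "
        "ORDER BY entry_id",
        (field, value),
    )
    return [row[0] for row in rows]

def load_entry(entry_id):
    """Reassemble one entry from its attribute rows."""
    rows = conn.execute(
        "SELECT field, value FROM attrs WHERE entry_id = ?", (entry_id,)
    )
    return dict(rows)
```

Finding matching ids is one indexed lookup, but every entry then has to
be stitched back together from several rows, which is the part I'd
expect to hurt.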

Look at a NOSQL database, not a NoSQL database ...

According to Wikipedia, there are three generations of NoSQL databases: 
the original database called NoSQL; the second-generation "Not Only SQL" 
databases, primarily Pick (aka MultiValue) and friends; and lastly the 
"No SQL" databases.

MultiValue is a hashed key-value string database. WITH A SCHEMA. The 
schema is descriptive, not prescriptive as in relational, so it's much 
more flexible, but because the schema is there you can still query the 
data. And you can create indices and so on.
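
A toy sketch of the idea in Python, in case it helps. This is only my
simplified model of "hashed file plus descriptive dictionary", not how
Pick or ScarletDME actually store anything, and the record keys and
field names are made up.

```python
# Toy model only: a "file" is a hash table from record key to record, and
# a record is a list of attributes, each of which may hold several values.
log_file = {
    "1001": [["foo.service"], ["6"], ["Started foo"]],
    "1002": [["bar.service"], ["3"], ["bar failed", "exit code 1"]],
}

# The dictionary (schema) describes where each field lives in a record.
# It does not constrain what a record may contain. Names are hypothetical.
dictionary = {"UNIT": 0, "PRIORITY": 1, "MESSAGE": 2}

def read(key, field):
    """Fetch one field of one record: a hash lookup plus a list index."""
    record = log_file[key]
    pos = dictionary[field]
    return record[pos] if pos < len(record) else []

def select(field, value):
    """Schema-driven query: keys of records whose field contains value.

    This version scans every record; a real index would avoid the scan.
    """
    pos = dictionary[field]
    return sorted(
        key for key, record in log_file.items()
        if pos < len(record) and value in record[pos]
    )
```

Here read("1002", "MESSAGE") is the common case, a direct lookup by key,
while select() shows that the dictionary is what makes the otherwise
opaque strings queryable by name.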

Provided the analyst designs the database correctly, it's easy to prove 
that MultiValue data retrieval is "blindingly fast". (Here is the major 
difference between MultiValue and relational: relational throws brute 
force at finding what it's looking for, while MultiValue expects the 
programmer/analyst to use their brain cells.) MultiValue does not have 
or need an optimiser, because the cost of the optimiser exceeds the 
possible gains.

> and it too sometimes takes minutes to perform the first query from 
> cold cache

This is the problem. It's all very well proving "there's no way you can 
retrieve the data any faster", if the fastest possible is still horribly 
slow.

MultiValue, though, optimises the *common* case of data retrieval; if 
things go wrong, the worst-case retrieval time can go through the roof. 
Relational tries to guarantee that all queries take similar time, at the 
cost of all queries getting slower as the database grows. It's not long 
before MultiValue can provide a five-nines guarantee of being faster 
than relational.

Message me off-line if you're interested, or search Wikipedia for things 
like "Pick Operating System", "MultiValue Database", or "NoSQL 
databases". You'll probably find a bunch of interesting links. And 
there's a GPL version called ScarletDME; you'll find me on that.

(If you decide it would work for you, I think there's a whole bunch of 
people who would dive in and help to get more exposure for the project.)

Cheers,
Wol