[systemd-devel] Empty journal files consume space
Andrei Borzenkov
arvidjaar at gmail.com
Fri Feb 2 10:16:38 UTC 2024
On Fri, Feb 2, 2024 at 12:18 AM Steve Traylen <steve.traylen at cern.ch> wrote:
>
>
> On 01/02/2024 14:48, Steve Traylen wrote:
> > On 01/02/2024 13:45, Andrei Borzenkov wrote:
> >
> >> On Thu, Feb 1, 2024 at 3:25 PM Steve Traylen <steve.traylen at cern.ch>
> >> wrote:
> >>> Hi,
> >>>
> >>> I'm trying to understand why I am retaining only a couple of days
> >>> of logs when I would like to have more.
> >>>
> >>> The head of the system journal, as shown by journalctl, is only from today:
> >>> Feb 01 10:47:14 nodeX.example.ch systemd-journald[722]: Data hash table
> >>> of /var/log/journal/c33ef6d0ada04ec4abc79c567a7d94b0/system.journal has
> >>> a fill level at 75.0 (174765 of 233016 items, 58720256 file size, 335
> >>> bytes per hash table item), suggesting rotation.
> >>> Feb 01 10:47:14 nodeX.example.ch systemd-journald[722]:
> >>> /var/log/journal/c33ef6d0ada04ec4abc79c567a7d94b0/system.journal:
> >>> Journal header limits reached or header out-of-date, rotating.
> >>>
> >>>
> >>> # journalctl --disk-usage
> >>> Archived and active journals take up 8.1G in the file system.
> >>>
> >>> Reality is system journal is tiny:
> >>>
> >>> # du -sh system.journal
> >>> 17M system.journal
> >>>
> >>> However we do have many
> >>>
> >>> # ls -l user-*journal | wc -l
> >>> 1044
> >>>
> >>> and indeed
> >>>
> >>> # du -sh /var/log/journal/c33ef6d0ada04ec4abc79c567a7d94b0
> >>> 8.2G /var/log/journal/c33ef6d0ada04ec4abc79c567a7d94b0
> >>>
> >>> The vast majority of these user journals are empty and offline
> >>>
> >>> # file user-*journal | awk '{print $4, $5}' | sort | uniq -c
> >>> 940 empty, offline
> >>> 102 offline
> >>> 2 online
> >>>
> >>>
> >>> These user journals are all 8.0M in size
> >>>
> >>> So I think I have two questions:
> >>>
> >>> 1) Why am I losing old logs sooner than I would like - what limit is "
> >>> fill level at 75.0 (174765 of 233016 items"
> >> You did not provide any evidence that logs are lost. Archived
> >> (offline) logs are processed and searched by journalctl so the oldest
> >> available log is the oldest archive file, not the current online file.
> >>
> >> The limit is the fill grade of the hash table in the individual log
> >> file. It is hard coded and unrelated to the limits configured in the
> >> journald.conf. It may affect how long logs are kept if you configured
> >> retention by the number of log files.
> > Thanks for reply.
> >
> > There are no archive files I believe:
> >
> > # ls /var/log/journal/514fed82c54d4a89b9f7f8f33eca1c8e/*system*
> > /var/log/journal/514fed82c54d4a89b9f7f8f33eca1c8e/system.journal
> >
> > The archive files would be alongside the live file I believe.
> >
> > Just tried an explicit "journalctl --rotate", which logs:
> >
> > Feb 01 14:36:33 nodeX.example.ch systemd-journald[658]: System Journal
> > (/var/log/journal/514fed82c54d4a89b9f7f8f33eca1c8e) is 8.0G, max 3.0G,
> > 0B free.
> > Feb 01 14:36:40 nodeX.example.ch systemd-journald[658]: Received
> > client request to rotate journal, rotating.
> > Feb 01 14:36:40 nodeX.example.ch systemd-journald[658]: Deleted empty
> > archived journal
> > /var/log/journal/514fed82c54d4a89b9f7f8f33eca1c8e/user-1234@537a18390e124dd6b4cf41a69ef5780d-0000000000000000-0000000000000000.journal
> > (3.5M).
> > Feb 01 14:36:40 lxplus978.cern.ch systemd-journald[658]: Deleted empty
> > archived journal
> > /var/log/journal/514fed82c54d4a89b9f7f8f33eca1c8e/user-1235@d7d23966c1454001a714ee5aef039c60-0000000000000000-0000000000000000.journal
> > (3.5M).
> >
> > So now maybe I understand: at rotation I am over the configured max of
> > 3GB, so perhaps no archive is generated. Looking at another node where
> > fewer users have ever logged in, I have the archive
> > of the system log and a longer history. Those 940 "empty, offline"
> > user journals consume the space while providing no particular value.
> >
> > No other indication that rotation may not have worked.
> >
> >
> >>> 2) Is there a safe mechanism to delete those empty offline user
> >>> journals?
> >>>
> >> Just delete them.
>
> Wrote a tiny script to delete them:
>
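> # The +( ) pattern below needs bash's extended globbing.
> shopt -s extglob
> for FILE in /var/log/journal/$(cat /etc/machine-id)/user-+([0-9]*).journal ; do
>     # Only remove files that file(1) reports as empty and offline.
>     if [ "$(file --brief "$FILE")" = 'Journal file empty, offline' ] ; then
>         rm -f "$FILE"
>         echo "$(basename "$FILE") was empty and offline so removed"
>     fi
> done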
>
> works perfectly - unfortunately about 20 seconds later journald (I
> presume) re-creates them all despite the vast majority
> of users having no current processes on the nodes.
>
>
Try enabling debug logs for journald. Empty files should be removed by
journald anyway, so maybe they are not considered really empty?
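
One way to raise journald's own log level is a service drop-in that sets
SYSTEMD_LOG_LEVEL=debug. A minimal sketch follows; the helper name
write_journald_debug_dropin and the file name 10-debug.conf are made up for
illustration, and it assumes the usual /etc/systemd/system override location.

```shell
# Hypothetical helper (name and drop-in file name are illustrative only):
# writes a drop-in that sets SYSTEMD_LOG_LEVEL=debug for systemd-journald.
write_journald_debug_dropin() {
    # $1: unit directory root; defaults to the usual admin override location.
    local root="${1:-/etc/systemd/system}"
    local dir="$root/systemd-journald.service.d"
    mkdir -p "$dir" || return 1
    # A drop-in only needs the section header and the setting to override.
    printf '%s\n' '[Service]' 'Environment=SYSTEMD_LOG_LEVEL=debug' \
        > "$dir/10-debug.conf" || return 1
    # Print the path so the caller can inspect or remove the file later.
    echo "$dir/10-debug.conf"
}
```

Run as root without arguments, then "systemctl daemon-reload" and
"systemctl restart systemd-journald" should make it take effect. The extra
messages will most likely end up in the kernel log buffer (dmesg or
"journalctl -k"). Remove the drop-in afterwards, debug output is noisy.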
> >>
> >>> Thanks.
> >>>
> >>> Steve.
> >>>
> >>> Version and configuration:
> >>>
> >>> systemd-252-18.el9 - RHEL9 with a configuration of:
> >>>
> >>> [Journal]
> >>> Storage = persistent
> >>> SplitMode = uid
> >>> SystemMaxUse = 3G
> >>> SystemKeepFree = 10G
> >>> MaxRetentionSec = 1year
> >>>
> >>> # df -h /
> >>> Filesystem Size Used Avail Use% Mounted on
> >>> /dev/vda1 80G 65G 16G 81% /
> >>>
> >>>