[systemd-devel] journalctl v202, loop endlessly

Frederic Crozat fcrozat at suse.com
Fri May 31 06:03:48 PDT 2013


Le mercredi 08 mai 2013 à 03:53 +0200, Sébastien Luttringer a écrit :
> On Tue, May 7, 2013 at 4:01 AM, Cristian Rodríguez
> <crrodriguez at opensuse.org> wrote:
> > El 05/05/13 13:17, Sébastien Luttringer escribió:
> >
> >> Hello,
> >>
> >> "journcalctl --no-pager" or" journalctl | cat" produce enless content
> >> by looping accross journal entries. The date in lines restart from the
> >> beginning when the end is reached.
> >
> >
> > We have reports about this behaviour in openSUSE as well, the problem is
> > that I cannot reproduce it, even with your sample journal files..
> Bizarre.
> 
> I'm able to reproduce the bug with the attached C code (from man 2
> sd_journal_next) and the provided tarball.
> 
> I also built systemd from git tree (~ v203) and bug still occur.
> 
> With some profiling + tracing, I guess it comes from sd_journal_next
> in src/journal/journalctl.c which never return 0.
> So the following break condition is never true and the loop is infinite.
> 
> Some breakpoints in gdb shows a kind of switch between system.journal
> and user-18136.journal files before printing each log lines before the
> bug occur.
> When we have looped, only one journal file is looked between entries.
> I don't know if it's usefull.
> 
> (gdb) info breakpoints
> 16      breakpoint     keep y   0x000000000042b4d7 in output_short at
> src/shared/logs-show.c:292
> stop only if strcmp(message, "New session 6 of user seblu.") == 0
> breakpoint already hit 5 times
> 17      breakpoint     keep y   0x00000000004136fa in
> real_journal_next at src/journal/sd-journal.c:913
> stop only if found
> breakpoint already hit 24295 times
>         silent
>         print f->path
>         c
> 
> 
> mai 05 17:08:49 achille.seblu.net named[274]: dumpstats complete
> $32223 = 0x6556b0
> "/tmp/toto/3d69b1d747744a53ac1c25b78eee2764/user-18136.journal"
> $32224 = 0x655ba0 "/tmp/toto/3d69b1d747744a53ac1c25b78eee2764/system.journal"
> mai 05 17:11:26 achille.seblu.net sshd[7372]: Accepted publickey for
> seblu from 82.234.72.62 port 6239 ssh2
> $32225 = 0x6556b0
> "/tmp/toto/3d69b1d747744a53ac1c25b78eee2764/user-18136.journal"
> $32226 = 0x655ba0 "/tmp/toto/3d69b1d747744a53ac1c25b78eee2764/system.journal"
> mai 05 17:11:26 achille.seblu.net sshd[7372]: pam_unix(sshd:session):
> session opened for user seblu by (uid=0)
> $32227 = 0x6556b0
> "/tmp/toto/3d69b1d747744a53ac1c25b78eee2764/user-18136.journal"
> $32228 = 0x655ba0 "/tmp/toto/3d69b1d747744a53ac1c25b78eee2764/system.journal"
> mai 05 17:11:26 achille.seblu.net systemd-logind[251]: New session 6
> of user seblu.
> ---Type <return> to continue, or q <return> to quit---
> 
> Breakpoint 16, output_short (f=0x7ffff73172a0 <_IO_2_1_stdout_>,
> j=0x63df40, mode=OUTPUT_SHORT, n_columns=118,
>     flags=OUTPUT_COLOR) at src/shared/logs-show.c:292
> 292        if (flags & OUTPUT_CATALOG)
> (gdb)
> Continuing.
> $32229 = 0x6556b0
> "/tmp/toto/3d69b1d747744a53ac1c25b78eee2764/user-18136.journal"
> mai 05 17:11:26 achille.seblu.net sshd[7374]: subsystem request for
> sftp by user seblu
> $32230 = 0x655950
> "/tmp/toto/3d69b1d747744a53ac1c25b78eee2764/system at 0004db40759a1f99-767afdbe88695384.journal~"
> -- Reboot --
> avril 22 03:00:42 zeus.seblu.net systemd-logind[248]: Removed session 7.
> $32231 = 0x655950
> "/tmp/toto/3d69b1d747744a53ac1c25b78eee2764/system at 0004db40759a1f99-767afdbe88695384.journal~"
> avril 22 03:01:00 zeus.seblu.net fcron[10414]: Job /usr/sbin/run-cron
> /etc/cron.hourly started for user systab...0415)
> $32232 = 0x655950
> "/tmp/toto/3d69b1d747744a53ac1c25b78eee2764/system at 0004db40759a1f99-767afdbe88695384.journal~"
> avril 22 03:01:02 zeus.seblu.net fcron[10414]: Job /usr/sbin/run-cron
> /etc/cron.hourly completed
> $32233 = 0x655950
> "/tmp/toto/3d69b1d747744a53ac1c25b78eee2764/system at 0004db40759a1f99-767afdbe88695384.journal~"
> avril 22 03:07:09 zeus.seblu.net named[8102]: received control channel
> command 'reload'
> $32234 = 0x655950
> "/tmp/toto/3d69b1d747744a53ac1c25b78eee2764/system at 0004db40759a1f99-767afdbe88695384.journal~"
> avril 22 03:07:09 zeus.seblu.net named[8102]: loading configuration
> from '/etc/named.conf'
> $32235 = 0x655950
> "/tmp/toto/3d69b1d747744a53ac1c25b78eee2764/system at 0004db40759a1f99-767afdbe88695384.journal~"
> avril 22 03:07:09 zeus.seblu.net named[8102]: reading built-in trusted
> keys from file '/etc/bind.keys'

So, I've tested your journal, as well as journals from other reporters
( https://bugzilla.novell.com/show_bug.cgi?id=817778 ) and this infinite
loop is a regression caused by commit
a3e6f050de81a9830e52af09d5d38dad9a356e3b (journal: when iterating
through a file we might lose messages when changing
https://bugs.freedesktop.org/show_bug.cgi?id=63672 )


-- 
Frederic Crozat <fcrozat at suse.com>
SUSE



More information about the systemd-devel mailing list