[systemd-devel] [EXT] Re: Q: non-ASCII in syslog

Mantas Mikulėnas grawity at gmail.com
Thu Apr 28 07:39:07 UTC 2022


On Thu, Apr 28, 2022 at 10:32 AM Ulrich Windl <
Ulrich.Windl at rz.uni-regensburg.de> wrote:

> >>> Mantas Mikulenas <grawity at gmail.com> wrote on 27.04.2022 at 12:03 in
> message
> <CAPWNY8XO0tu6EdpJO538qyGBJ0kOmZo5iCaoJpPc8kt4QZ+vXg at mail.gmail.com>:
> > On Wed, Apr 27, 2022 at 10:09 AM Ulrich Windl <
> > Ulrich.Windl at rz.uni-regensburg.de> wrote:
> >
> >> Hi!
> >>
> >> Having written an RFC 3164 compatible syslog daemon, I noticed that systemd
> >> created syslog messages with non-ASCII characters.
> >> The problem is that a remote syslogd can hardly guess the correct character
> >> set (I'm using rsyslog to forward local messages to a remote server).
> >>
> >> Example of such message:
> >> systemd-tmpfiles[3311]: [/usr/lib/tmpfiles.d/svnserve.conf:1] Line references
> >> path below legacy directory /var/run/, updating /var/run/svnserve →
> >> /run/svnserve; please update the tmpfiles.d/ drop-in file accordingly.
> >>
> >> (The arrow is encoded as three bytes (\xe2\x86\x92))
> >>
> >> RFC 5424 syslog messages require the use of a BOM (%xEF.BB.BF) at the
> >> beginning of a message if the message uses UTF-8:
> >>
> >>       MSG             = MSG-ANY / MSG-UTF8
> >>       MSG-ANY         = *OCTET ; not starting with BOM
> >>       MSG-UTF8        = BOM UTF-8-STRING
> >>       BOM             = %xEF.BB.BF
> >>
> >> Wouldn't it make sense to add such a BOM for RFC 3164 syslog messages also
> >> if non-ASCII (i.e.: UTF-8) encoded characters are used?
> >>
> >
> > RFC 3164 over a local socket from journald to local rsyslogd? The local
>
> Actually I wasn't quite sure about the default config in SLES12.
> It seems the flow is journald -> local rsyslogd -> remote syslogd
>
> > rsyslogd already knows if messages are UTF-8 because the system's $LANG
> > (well, nl_langinfo) says so. And if rsyslog can't trust that for some
> > reason (e.g. because a user might have a different locale), then
> > systemd-journald won't be able to trust it either, so it won't know whether
> > it could add the BOM.
>
> How could a remote syslog server know what the locale on the sending system
> is?
>

It's not remote, it's local. I'm talking about the one that's receiving
messages from journald on the same machine.
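
For illustration only, the locale check mentioned in the quoted text
(nl_langinfo) might look roughly like this from the point of view of a local
daemon. This is a sketch, not how rsyslogd or journald actually do it; the
function names codeset_is_utf8 and local_codeset are made up for the example.

```python
import locale


def codeset_is_utf8(codeset: str) -> bool:
    """Return True if a codeset name such as 'UTF-8' or 'utf8' denotes UTF-8."""
    return codeset.strip().upper().replace("-", "") == "UTF8"


def local_codeset() -> str:
    """Ask the C library which character set the process locale uses (Unix only)."""
    try:
        # Adopt the locale from the environment ($LANG, $LC_ALL, ...).
        locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        # Unsupported locale in the environment: stay in the current locale.
        pass
    return locale.nl_langinfo(locale.CODESET)


if __name__ == "__main__":
    cs = local_codeset()
    print(cs, "-> UTF-8" if codeset_is_utf8(cs) else "-> not UTF-8")
```

Note that this only tells the daemon about its own locale, which is exactly
the limitation discussed above: a user logging messages under a different
locale is not covered.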


>
> >
> > RFC 3164 over the network to a remote server? Outside the scope for
> > systemd, since it doesn't generate the network packets; your local rsyslogd
> > forwarder does. (Also, why RFC 3164 and not 5425?)
>
> If you look outside the world of systemd, about 99% of systems create the RFC
> 3164 type of messages.
> Some may send non-ASCII too, however.
>

Still outside the scope of systemd. Systemd doesn't send RFC 3164 messages
over the network, either.


>
> >
> > Generally, if a message successfully decodes as UTF-8 then it's most likely
> > actual UTF-8 (and if UTF-8 decode fails then you fall back to ISO8859-1).
> > Various old systems get away with this without needing a UTF-8 BOM.
>
> Yes, you can just output what you received, hoping the messages will be
> presented correctly.
> It's just like sending 8-bit e-mail without a coding system or charset in the
> past.
>

Which is not what I was saying, but sure, whatever.
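
To make the heuristic quoted above concrete (try UTF-8 first, fall back to
ISO8859-1 if decoding fails), here is a minimal sketch of what a receiving
syslog daemon could do with the raw message bytes. The function name
decode_syslog_payload is hypothetical, and a real receiver may want a
different fallback charset.

```python
from typing import Tuple


def decode_syslog_payload(raw: bytes) -> Tuple[str, str]:
    """Decode message bytes, returning (text, charset actually used)."""
    try:
        # Strict decoding: any invalid byte sequence raises UnicodeDecodeError.
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # ISO8859-1 maps every byte to a code point, so this cannot fail.
        return raw.decode("iso-8859-1"), "iso-8859-1"


if __name__ == "__main__":
    # The arrow from the tmpfiles message is the three bytes e2 86 92.
    print(decode_syslog_payload(b"/var/run/svnserve \xe2\x86\x92 /run/svnserve"))
    # A lone 0xe9 byte is not valid UTF-8, so the fallback is used.
    print(decode_syslog_payload(b"caf\xe9"))
```

Pure ASCII messages decode identically either way, so the guess only matters
for messages that actually contain 8-bit bytes.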

-- 
Mantas Mikulėnas