[systemd-devel] Antw: Re: [EXT] Re: Q: non-ASCII in syslog

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Thu Apr 28 08:51:39 UTC 2022


>>> Mantas Mikulenas <grawity at gmail.com> wrote on 28.04.2022 at 09:39 in
message
<CAPWNY8WBtw5kJ80f4uEffYyR_CcY6=zigb8JUM7CYtkP0oWanQ at mail.gmail.com>:
> On Thu, Apr 28, 2022 at 10:32 AM Ulrich Windl <
> Ulrich.Windl at rz.uni-regensburg.de> wrote:
> 
>> >>> Mantas Mikulenas <grawity at gmail.com> wrote on 27.04.2022 at 12:03 in
>> message
>> <CAPWNY8XO0tu6EdpJO538qyGBJ0kOmZo5iCaoJpPc8kt4QZ+vXg at mail.gmail.com>:
>> > On Wed, Apr 27, 2022 at 10:09 AM Ulrich Windl <
>> > Ulrich.Windl at rz.uni-regensburg.de> wrote:
>> >
>> >> Hi!
>> >>
>> >> Having written an RFC 3164 compatible syslog daemon, I noticed that
>> >> systemd created syslog messages with non-ASCII characters.
>> >> The problem is that a remote syslogd can hardly guess the correct
>> >> character set (I'm using rsyslog to forward local messages to a remote
>> >> server).
>> >>
>> >> Example of such message:
>> >> systemd-tmpfiles[3311]: [/usr/lib/tmpfiles.d/svnserve.conf:1] Line
>> >> references path below legacy directory /var/run/, updating
>> >> /var/run/svnserve → /run/svnserve; please update the tmpfiles.d/
>> >> drop-in file accordingly.
>> >>
>> >> (The arrow is encoded as three bytes (\xe2\x86\x92))
>> >>
>> >> RFC 5424 syslog messages require the use of a BOM (%xEF.BB.BF) at the
>> >> beginning of a message if the message uses UTF-8:
>> >>
>> >>       MSG             = MSG-ANY / MSG-UTF8
>> >>       MSG-ANY         = *OCTET ; not starting with BOM
>> >>       MSG-UTF8        = BOM UTF-8-STRING
>> >>       BOM             = %xEF.BB.BF
>> >>
>> >> Wouldn't it make sense to add such a BOM to RFC 3164 syslog messages
>> >> as well if non-ASCII (i.e., UTF-8 encoded) characters are used?
>> >>
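
The BOM rule in the ABNF quoted above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the helper
names add_bom_if_needed and split_bom are hypothetical, and nothing here is
taken from systemd, journald, or rsyslog.

```python
from typing import Tuple

BOM = b"\xef\xbb\xbf"  # %xEF.BB.BF, as in the quoted ABNF


def add_bom_if_needed(msg: str) -> bytes:
    """Encode msg as UTF-8; prefix the BOM only if non-ASCII is present.

    Hypothetical sender-side helper: pure ASCII is sent unchanged.
    """
    data = msg.encode("utf-8")
    if msg.isascii():
        return data
    return BOM + data


def split_bom(payload: bytes) -> Tuple[bool, bytes]:
    """Hypothetical receiver-side helper.

    Returns (declared_utf8, body): declared_utf8 is True only when the
    payload starts with the BOM, and body has the BOM stripped.
    """
    if payload.startswith(BOM):
        return True, payload[len(BOM):]
    return False, payload
```

With such a marker the receiver does not need to know the sender's locale:
a payload starting with the BOM is declared UTF-8, and anything else is
treated as opaque octets.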
>> >
>> > RFC 3164 over a local socket from journald to local rsyslogd? The local
>>
>> Actually I wasn't quite sure about the default config in SLES12.
>> It seems the flow is journald -> local rsyslogd -> remote syslogd
>>
>> > rsyslogd already knows if messages are UTF-8 because the system's $LANG
>> > (well, nl_langinfo) says so. And if rsyslog can't trust that for some
>> > reason (e.g. because a user might have a different locale), then
>> > systemd-journald won't be able to trust it either, so it won't know
>> > whether it could add the BOM.
>>
>> How could a remote syslog server know what the locale on the sending
>> system is?
>>
> 
> It's not remote, it's local. I'm talking about the one that's receiving
> messages from journald on the same machine.
> 
> 
>>
>> >
>> > RFC 3164 over the network to a remote server? Outside the scope for
>> > systemd, since it doesn't generate the network packets; your local
>> > rsyslogd forwarder does. (Also, why RFC 3164 and not 5425?)
>>
>> If you look outside the world of systemd, about 99% of systems create the
>> RFC 3164 type of messages.
>> Some may send non-ASCII too, however.
>>
> 
> Still outside the scope of systemd. Systemd doesn't send RFC 3164 messages
> over the network, either.

Correct: It does not send, because it's unable to do so. That's why I used
rsyslogd.

> 
> 
>>
>> >
>> > Generally, if a message successfully decodes as UTF-8 then it's most
>> > likely actual UTF-8 (and if UTF-8 decode fails then you fall back to
>> > ISO8859-1).
>> > Various old systems get away with this without needing a UTF-8 BOM.
>>
>> Yes, you can just output what you received, hoping the messages will be
>> presented correctly.
>> It's just like sending 8-bit e-mail without a coding system or charset in
>> the past.

What I meant to say was: Guessing the encoding is a bad concept.
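
To make the point concrete, here is a minimal Python sketch of the heuristic
described above (try UTF-8 first, fall back to ISO8859-1). The function name
guess_decode is hypothetical and is not part of rsyslog or systemd.

```python
from typing import Tuple


def guess_decode(payload: bytes) -> Tuple[str, str]:
    """Return (text, assumed_charset) using the UTF-8-first heuristic."""
    try:
        return payload.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # ISO8859-1 maps every byte to a character, so this never fails,
        # which also means it can never report that the guess was wrong.
        return payload.decode("iso-8859-1"), "iso-8859-1"
```

The bytes \xc3\xa9 are valid UTF-8, so they are reported as UTF-8 even if
the sender actually meant the two ISO8859-1 characters they also encode.
The receiver cannot tell the two cases apart without an explicit marker.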

>>
> 
> Which is not what I was saying, but sure, whatever.
> 
> -- 
> Mantas Mikulėnas




