[systemd-devel] Changed ordering of systemd-resolved.service

Paul Menzel pmenzel+systemd-devel at molgen.mpg.de
Mon Apr 16 17:20:24 UTC 2018


Dear Dimitri,


On 04/16/18 18:51, Dimitri John Ledkov wrote:
> On 16 April 2018 at 14:25, Paul Menzel wrote:

>> On 04/16/18 12:47, Dimitri John Ledkov wrote:
>>>
>>> On 13 April 2018 at 16:40, Paul Menzel wrote:
>>
>>>> In commit 1f158013 (resolved.service: set DefaultDependencies=no) the
>>>> ordering of systemd-resolved.service was changed. (How do I find the
>>>> merge request to find possible discussion? Also the commit message
>>>> description is too specific in my opinion, as it doesn’t give a clue
>>>> that more is changed.)
>>>
>>>
>>> https://github.com/systemd/systemd/pull/7609
>>
>> Thank you, no idea, why I didn’t find it with `git log --oneline --graph`.
>> Hmm, looks like, Lennart directly put that commit in master without merging
>> the pull request.
>>
>>>> I like starting systemd-resolved earlier, but unfortunately ordering it
>>>> before `network.target` adds a delay on systems wanting to start as fast
>>>> as possible. But why did you change it from `network-online.target` to
>>>> `network.target`? I’d say `network-online.target` is more correct.
>>>>
>>>> For my use case of a fast system start-up, this change delays it by at
>>>> least 100 ms, as now it takes longer to reach the end of the network
>>>> target.
>>>
>>>
>>> cloud-init initializes networking configuration by fetching,
>>> potentially, remote sources to customize an instance on first boot.
>>> Specifically it may dhcp any interface, to reach a metadata source,
>>> download the real networking configuration, reconfigure networking to
>>> match the final networking details (all interfaces / public ip
>>> addresses / etc), and proceed to complete networking.target and
>>> network-online.target.
>>>
>>> This means that resolved is required earlier in the boot cycle. Before
>>> networking.target.
>>
>>
>> Just to be sure, you mean *network.target*, right?
>>
>> Thank you for specifying the requirement. I agree, that it should be started
>> as early as possible, but I disagree with the rest.
>>
>>> There are things that expect network to be up in
>>> "network-online.target", which by some is implied to mean DNS
>>> resolution too, unfortunately.
>>
>>
>> Sorry for being ignorant, but could you please be specific, what these
>> things are. If these units have that requirement order them after
>> `network-online.target`.
>>
>>>> If your systems have problems with it, they have wrong dependencies,
>>>> don’t they? Also, they should probably be able to deal with the
>>>> situation, that DNS does not work, as that can happen during operation.
>>>>
>>>> So, I’d really like to rework that ordering change.
>>>
>>>
>>> Reworking that change will break certain public cloud providers
>>> unfortunately because of public clouds metadata providers being odd.
>>>
>>> Note, we cannot use dbus activation in this case as dbus-daemon is not
>>> up yet, and systemd-resolve command line client also does not work at
>>> this point.
>>>
>>> If you want to make it an optional dependency that early, maybe it
>>> will be possible to convert systemd-resolved to be socket activated on
>>> tcp/udp?
>>>
>>> Alternatively, as a system integrator, you may want to change these
>>> dependencies in your distro, especially if you do not configure
>>> resolved _stub resolver_ as the default provider of /etc/resolv.conf
>>> or for example to do not use the recommended default stub-provider
>>> over 127.0.0.53 and instead use the nss module over dbus.
>>>
>>> The above dependencies are correct and recommend, for the default
>>> setup of /etc/resolv.conf pointing at the stub-resolv.conf as
>>> generated by resolved at runtime.
>>>
>>> Specifically, the dependencies as is are "too-early" if one uses the
>>> last two modes of the /etc/resolv.conf handling as described in the
>>> man page -
>>> https://www.freedesktop.org/software/systemd/man/systemd-resolved.service.html#/etc/resolv.conf
>>
>> First, I think, the terminology of *early* leads to misunderstandings. For
>> you it includes ordering with `Before=`, for me it’s just about `After=`
>> statements.
> 
> It's actually both. Cloud-init is a cross-distribution tool, and it
> injects itself at multiple points during boot. It pre-empts networking
> target, is between networking & network-online, and after
> network-online target.
> 
> Without this upstream change, cloud-init was not able to pre-empt
> network.target, was resulting in a dependency cycle, and systems
> resulted booting degraded (due to dependency cycle resolved by
> shooting arbitrary unit in the head), in a default upstream systemd
> configuration.
> 
>> Anyway, regressing the user experience for everyone only because it’s
> 
> Can you please explain what has degraded? starting systemd-resolved
> before or after network*.target shouldn't make any difference in wall
> clock time to reach multi-user.target. And in my boot testing, I did
> not see any boot regressions.

Just look at what is ordered after the network target:

1. units/rc-local.service.in:After=network.target
2. units/systemd-user-sessions.service.in:After=remote-fs.target nss-user-lookup.target network.target

Both are needed for the login screen.
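
A list like the one above can be produced with a plain grep over the unit
templates in a systemd source checkout. The following is only a sketch: the
function name `units_ordered_after` is made up for illustration, and it
assumes each `After=` setting sits on a single line.

```shell
# Hypothetical helper (not part of systemd): print every line in <unit-dir>
# whose After= setting names <unit-name>.
# Usage: units_ordered_after <unit-dir> <unit-name>
units_ordered_after() {
    dir=$1
    unit=$2
    # -r recurses into the directory, -E enables extended regular expressions.
    # The unit must appear as a whole word, so network.target does not match
    # network-online.target. The "." in the name is not escaped, which is
    # slightly too permissive but harmless for unit names.
    grep -rE "^After=(.*[[:space:]])?${unit}([[:space:]]|\$)" "$dir"
}

# Example, run from the top level of a systemd source tree (path assumed):
# units_ordered_after units network.target
```

On a running system, `systemctl list-dependencies --before network.target`
should give a comparable view of the units ordered after that target.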

> Or are you explicitly measuring time to network.target, separate from
> time to network-online.target, and separate from reaching the default
> target?
> 
> Have you been previously booting with network-online.target &
> systemd-resolved pulled into the default boot target? And if you were
> booting without them, was that expected?

No, `systemd-networkd-wait-online.service` is disabled.

> I am also getting multiple support requests for networking and DNS
> resolution to be available during emergency and maintenance shell
> consoles, thus pulling resolved earlier made a lot of sense to give
> root shell at least some ability to talk to the outside world to
> download fixes to the system.
> 
>> required for cloud-init is not right in my opinion. As you pointed out, the
>> system integrator can adapt certain things, and in my opinion, I throw the
>> ball back to you, and kindly ask you, to adapt systemd locally so it works
>> with your use-case or let’s come up with a better solution.
> 
> Hm... cloud-init is distribution agnostic, packaged and shipped in
> most distributions. And in stock configuration, one would expect any
> Linux distro to work nicely with an upstream releases of cloud-init &
> systemd.

Do I understand correctly that systemd was adapted so that *one* tool,
cloud-init, could work? Before systemd 237, it worked for a long time
without DNS resolution necessarily having to be available before the
network target was reached.

> Please explain the regression you have identified, to design a
> solution fit for all purposes.
>
>> Maybe a new target is needed, where you can order your services after, as
>> ordering them after systemd-resolved.service is too specific.
> 
> Possibly, but what are your requirements which you have noticed to
> have regressed that we need to fix?

It takes longer to reach the login screen.

>> I submitted a merge/pull request to change the ordering [1].
> 
> -1 from me.
> 
> Please explain, in detail, the regression/bug observed before jumping
> onto reverting things. It's not like things are changed without reason
> / without fixing actual production discovered bugs affecting a wide
> array of users (due to public cloud nature).

I reach the console login over 100 ms earlier when I remove the
ordering. Unfortunately, systemd-resolved takes that long to start.
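
A delay of this kind can be inspected with `systemd-analyze`. This is a
sketch of one possible way to look at it, not necessarily how the number
above was obtained. The wrapper name `show_network_target_cost` is made up
for illustration.

```shell
# Hypothetical wrapper around systemd-analyze; run it on the booted system
# once with and once without the ordering in place, then compare the output.
show_network_target_cost() {
    # Overall time spent in kernel, initrd and userspace.
    systemd-analyze time
    # Chain of units that network.target had to wait for, with timings.
    systemd-analyze critical-chain network.target
    # The ten units that took longest to initialize.
    systemd-analyze blame | head -n 10
}
```

If systemd-resolved.service shows up in the critical chain of
`network.target`, its start-up time directly delays everything ordered after
that target.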


Kind regards,

Paul


>> [1] https://github.com/systemd/systemd/pull/8731
