[systemd-devel] Jobs dropped too readily (prefdm/start dropped as a dep while deleting plymouth-quit/stop)

Colin Guthrie gmane at colin.guthr.ie
Sun Apr 8 16:29:29 PDT 2012


Hi,

I've been banging my head off a brick wall trying to work out why a
given unit never starts. It doesn't fail, it is simply never started.

The unit in question is prefdm, so users are noticing this quite a lot!!
That said, I've only had two reported cases of this occurring, so it's
hard to see how big a problem it is.

Here is a "systemctl dump" of an affected system:
 https://bugs.mageia.org/attachment.cgi?id=1947

And full debug output here:
 https://bugs.mageia.org/attachment.cgi?id=1950

(Full bug is here: https://bugs.mageia.org/show_bug.cgi?id=5262)



Tracing this as best I can:

It seems that prefdm has:
	Before: graphical.target
	Before: atieventsd.service


Now, atieventsd has:
	Before: rinetd.service
	Before: graphical.target


So far, to reach graphical.target, we'll need to start prefdm and then
atieventsd (prefdm is ordered before it). That's OK as it stands.


Looking further, rinetd has:
	Before: multi-user.target
	Before: graphical.target



Now this could be where the problem lies.

graphical.target has:
	Requires: multi-user.target
	After: multi-user.target

So multi-user.target has to happen before graphical.target, but even
so, we can still satisfy all the deps.
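
As a quick sanity check of that claim, here is a minimal sketch (not
systemd code) that feeds only the ordering deps quoted above into a
topological sort. The dict layout and the start_order() helper are my own
illustration; it needs Python 3.9+ for graphlib.

```python
from graphlib import TopologicalSorter

# "Before:" edges exactly as quoted from the dump: unit -> units it precedes.
# The last entry restates graphical.target's "After: multi-user.target".
before = {
    "prefdm.service": ["graphical.target", "atieventsd.service"],
    "atieventsd.service": ["rinetd.service", "graphical.target"],
    "rinetd.service": ["multi-user.target", "graphical.target"],
    "multi-user.target": ["graphical.target"],
}

def start_order(before_edges):
    """Return one valid start order; raises graphlib.CycleError on a cycle."""
    ts = TopologicalSorter()
    for unit, later_units in before_edges.items():
        ts.add(unit)
        for later in later_units:
            # 'unit' must come before 'later', so it is a predecessor of 'later'.
            ts.add(later, unit)
    return list(ts.static_order())

order = start_order(before)
print(order)
```

No CycleError is raised, so these particular edges on their own are
satisfiable; the cycle only appears once the other jobs in the debug
output are taken into account.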


But looking at the debug output from this:

[    7.203540] systemd[1]: Found ordering cycle on prefdm.service/start
[    7.209944] systemd[1]: Walked on cycle path to plymouth-quit.service/stop
[    7.216330] systemd[1]: Walked on cycle path to rc-local.service/start
[    7.222546] systemd[1]: Walked on cycle path to rinetd.service/start
[    7.228624] systemd[1]: Walked on cycle path to atieventsd.service/start
[    7.234617] systemd[1]: Walked on cycle path to prefdm.service/start
[    7.240501] systemd[1]: Breaking ordering cycle by deleting job plymouth-quit.service/stop
[    7.246492] systemd[1]: Deleting job prefdm.service/start as dependency of job plymouth-quit.service/stop
[    7.246501] systemd[1]: Found ordering cycle on prefdm.service/stop
[    7.252433] systemd[1]: Walked on cycle path to getty@tty1.service/start
[    7.258347] systemd[1]: Walked on cycle path to plymouth-quit-wait.service/start
[    7.264249] systemd[1]: Walked on cycle path to rc-local.service/start
[    7.270058] systemd[1]: Walked on cycle path to rinetd.service/start
[    7.275704] systemd[1]: Walked on cycle path to atieventsd.service/start
[    7.281236] systemd[1]: Walked on cycle path to prefdm.service/stop
[    7.286608] systemd[1]: Breaking ordering cycle by deleting job getty@tty1.service/start
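
To make the first cycle easier to see, here is a rough sketch that
rebuilds it with a plain depth-first search. The walk_next dict simply
records the jobs in the order the log walks them; what each edge means
internally is not something I'm asserting, and find_cycle() is a
hypothetical helper, not a systemd function.

```python
def find_cycle(edges, start):
    """Return the jobs on a cycle reachable from start, or None if there is none."""
    path = []        # jobs on the current DFS path, in visit order
    on_path = set()  # same jobs, for O(1) membership checks
    done = set()     # jobs fully explored without finding a cycle

    def visit(job):
        if job in on_path:
            # Closed a loop: report it from its first occurrence onwards.
            return path[path.index(job):] + [job]
        if job in done:
            return None
        on_path.add(job)
        path.append(job)
        for nxt in edges.get(job, ()):
            cycle = visit(nxt)
            if cycle:
                return cycle
        on_path.discard(job)
        path.pop()
        done.add(job)
        return None

    return visit(start)

# Job-level edges in the order the first "Walked on cycle path" block lists them.
walk_next = {
    "prefdm.service/start": ["plymouth-quit.service/stop"],
    "plymouth-quit.service/stop": ["rc-local.service/start"],
    "rc-local.service/start": ["rinetd.service/start"],
    "rinetd.service/start": ["atieventsd.service/start"],
    "atieventsd.service/start": ["prefdm.service/start"],
}

cycle = find_cycle(walk_next, "prefdm.service/start")
print(" -> ".join(cycle))
```

The point is that prefdm.service/start is itself a node on the cycle, not
just a bystander pulled in by the job that was deleted.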




Here we can see why prefdm doesn't get started. It was dropped as a dep
to break an ordering cycle. However, it's actually part of the cycle
itself, and thus it likely should be excluded from the dependent jobs
when they are deleted.

i.e. a job may be a dependency of the job being dropped, but it might
also exist in its own right as a dep elsewhere. In such circumstances,
shouldn't it be allowed to continue?
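
A toy model of what I mean, purely as illustration: this is not how
systemd's transaction code is structured, and the names (delete_job,
tied_to, wanted_by) are made up. In particular, the assumption that
graphical.target/start independently wants prefdm.service/start is mine
and is not taken from the dump.

```python
def delete_job(jobs, tied_to, wanted_by, victim):
    """Drop victim, cascading only to tied jobs that nothing else still wants."""
    if victim not in jobs:
        return
    jobs.discard(victim)
    for dep in tied_to.get(victim, ()):
        # Keep the tied job if some surviving job wants it in its own right.
        still_wanted = any(w in jobs for w in wanted_by.get(dep, ()))
        if not still_wanted:
            delete_job(jobs, tied_to, wanted_by, dep)

jobs = {
    "graphical.target/start",
    "prefdm.service/start",
    "plymouth-quit.service/stop",
}
# Jobs that would currently be deleted along with the key job.
tied_to = {"plymouth-quit.service/stop": ["prefdm.service/start"]}
# ASSUMPTION for illustration: graphical.target/start also wants prefdm.
wanted_by = {"prefdm.service/start": ["graphical.target/start"]}

delete_job(jobs, tied_to, wanted_by, "plymouth-quit.service/stop")
print(sorted(jobs))
```

With this rule the stop job is still removed to break the cycle, but
prefdm.service/start survives because something else in the transaction
still wants it.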

Or perhaps dependent jobs should not be cleared in the first loop. i.e.
try continuing without deleting dependent jobs, but keep a list of those
that would be deleted. If the first loop did not solve the problem, then
delete the deps.

Or perhaps when deleting a stop job, we should not delete any dependent
start jobs? Or even somehow process conflicts first before verifying the
order? To explain, some of the relevant rules here are:

Conflicts=getty@tty1.service plymouth-quit.service
After=getty@tty1.service plymouth-quit.service

It seems that the code calls transaction_verify_order() first and then
transaction_merge_jobs(). The latter seems to take into consideration
the conflicts_with while the former does not.


A few suggestions above but input from people more familiar with the
code would be very much appreciated.



This is something that is actively causing a problem, so it would be
good to get an idea as to what potential solutions could be found.


Col




-- 

Colin Guthrie
gmane(at)colin.guthr.ie
http://colin.guthr.ie/

Day Job:
  Tribalogic Limited http://www.tribalogic.net/
Open Source:
  Mageia Contributor http://www.mageia.org/
  PulseAudio Hacker http://www.pulseaudio.org/
  Trac Hacker http://trac.edgewall.org/


