[systemd-devel] fsckd needs to go

Lennart Poettering lennart at poettering.net
Tue Apr 7 07:14:52 PDT 2015


On Mon, 06.04.15 15:12, Martin Pitt (martin.pitt at ubuntu.com) wrote:

> Hello Lennart, all,
> 
> Lennart Poettering [2015-04-03 14:58 +0200]:
> > To start with, the code is really wrong, it should never have been
> > merged in its current state, the read/write logic for the sockets is
> > completely borked (I cannot even boot my own machine reliably with
> > it!).
> 
> This is surprising indeed. If that's not just the journald/logind/D-Bus
> corruption (which we still haven't tracked down properly), do you have
> a journal of a hung boot? We never saw a boot failure due to fsck so
> far, so I'm naturally very interested in seeing what's wrong.

Sorry, no logs here; I already removed the thing from my machine.

> > And to my knowledge there has been no attempt to fix all of that,
> > even though I asked for it.
> 
> As far as I see, every point that came up during reviews, including
> your recent one about "don't route fsck output through systemd-fsck"
> got addressed (that latter patch hasn't been committed though, I
> thought you wanted to review it yourself).

Well, the async IO socket handling thing was not dealt with. The newest
patches still use fgets(). Using stdio for processing sockets is
generally not a good idea, since it's blocking. And since you want to
process multiple connections at the same time, you don't want
blocking. This is really broken.

Currently, if one fsck sends half a line, then this causes your daemon
to hang forever... This is not acceptable in our sources, sorry.
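
To illustrate the point about partial lines, here is a minimal sketch in
Python of the non-blocking alternative: each connection gets its own
buffer, and only complete lines are handed on, so a client that has sent
half a line cannot stall the others. This is not the fsckd code or its
wire protocol. The names LineBuffer, poll_once and demo, the client
labels, and the "pass N, P%" message format are all hypothetical and
chosen only for the example.

```python
import selectors
import socket


class LineBuffer:
    """Accumulates bytes for one connection and returns only complete lines."""

    def __init__(self):
        self._pending = b""

    def feed(self, data):
        # Keep whatever follows the last newline for the next call.
        self._pending += data
        *lines, self._pending = self._pending.split(b"\n")
        return [line.decode("utf-8", "replace") for line in lines]


def poll_once(selector, timeout=0.1):
    """Service every readable connection once; return (name, line) pairs."""
    completed = []
    for key, _events in selector.select(timeout):
        conn = key.fileobj
        name, buf = key.data
        try:
            data = conn.recv(4096)
        except BlockingIOError:
            continue  # nothing to read right now, try again later
        if not data:
            # Peer closed the connection: stop watching it.
            selector.unregister(conn)
            conn.close()
            continue
        completed.extend((name, line) for line in buf.feed(data))
    return completed


def demo():
    selector = selectors.DefaultSelector()
    clients = {}
    for name in ("fsck-a", "fsck-b"):  # hypothetical client labels
        client, server_side = socket.socketpair()
        server_side.setblocking(False)
        selector.register(server_side, selectors.EVENT_READ, (name, LineBuffer()))
        clients[name] = client

    clients["fsck-a"].sendall(b"pass 1, 40")     # half a line, no newline yet
    clients["fsck-b"].sendall(b"pass 2, 75%\n")  # a complete line
    first = poll_once(selector)

    clients["fsck-a"].sendall(b"%\n")            # the rest of the line arrives
    second = poll_once(selector)

    for client in clients.values():
        client.close()
    poll_once(selector)  # observe EOF and close the server-side sockets
    selector.close()
    return first, second
```

In demo(), the first poll returns the complete line from the second
client even though the first client has only sent a fragment, and the
fragment is completed on the second poll. With a blocking fgets() on the
first connection, the daemon would instead wait there until the newline
arrived.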

> > It also doesn't do at all what I suggested initially, as the flow of
> > data is now fsck → systemd-fsck → systemd-fsckd → plymouth, and
> > that's just crazy, that's two steps too many.
> 
> With the above patch it's fsck → systemd-fsckd → plymouth, and I
> don't see how to eliminate yet another step?

For example, by making plymouth listen directly on the socket, instead
of making this indirect via fsckd...

> > Then, there's my general reservation with fsckd at all: file systems
> > that still require offline fsck are certainly not the future, but we
> > develop stuff for the future
> 
> I do agree with the sentiment; let me assure you that we don't easily
> spend days on such stuff in vain, but it's because there are millions
> of existing installations out there which still do have ext4 and fsck.
> If systemd upstreams say "we don't care about existing products, only
> about a future with just btrfs" that's your prerogative of course, but
> distros need to have a more product-oriented focus :-/

This is only one reason of many. The killer issue really is the
safety issue. We shouldn't include code in systemd that makes
dangerous things like killing a running fsck an easily accessible
operation that has a graphical UI and requires no authentication.

> > I hope such a solution is acceptable?
> 
> The data flow is very similar to what we have now, so this mostly
> amounts to maintaining fsckd in the systemd sources vs. maintaining it
> separately in Debian/Ubuntu. I'd be interested in what
> RHEL/SUSE/Arch/etc. want to do.

We never had code for this in Fedora/RHEL, and that's not going to
change. The ability to have a graphical UI for killing fscks without
authentication was an Ubuntu thing, and I figure it's going to stay
one.

Lennart

-- 
Lennart Poettering, Red Hat

