[systemd-devel] Swap gets activated twice (through fstab and gpt generators)

Tue Jan 27 14:48:49 PST 2015

On Tue, 27.01.15 23:40, Lennart Poettering (lennart at poettering.net) wrote:

> On Tue, 27.01.15 23:31, Lennart Poettering (lennart at poettering.net) wrote:
> 
> > On Fri, 23.01.15 10:18, Martin Pitt (martin.pitt at ubuntu.com) wrote:
> > 
> > > So perhaps the more robust fix would be to make the gpt generator not
> > > generate swap units if fstab already configures any swap device? I. e.
> > > auto-discovery and swaps in fstab are mutually exclusive then.
> > 
> > Hmm, so there's something fishy here. systemd should already handle
> > this nicely, and I thought I tested this successfully.
> > 
> > The logic here is that when we enumerate through /proc/swaps we
> > already check udev, so that we set not only the device listed in
> > there to "active", but also all .swap units that are defined by any
> > of its symlink names. This means that activating a swap partition
> > should result in a number of .swap units going "active", not just
> > one. This is visible if you type "systemctl -a -t swap", which should
> > show a number of .swap units for the same actual swap device...
> > 
> > Now, if two jobs are queued to bring up a swap device, using
> > different names for it, as an entry in fstab and a GPT
> > auto-discovered partition might do, then one of the jobs should be
> > removed by the effect of the other, i.e. the later job should
> > immediately succeed, since the other job already caused the swap
> > device to go "active".
> > 
> > There must be a bug somewhere with this... Any chance you can boot in
> > debug mode and check how the .swap units change states during the
> > boot, and when the jobs for it are enqueued?
> 
> Hmm, thinking a bit more about this. The problem is probably this one:
> when the jobs are queued we cannot know that the devices they are
> queued for are actually the same, hence both are queued. Now, if the
> .device unit backing the two .swap units shows up, both .swap units
> are suddenly runnable, and hence systemd forks off swapon for both of
> them immediately. It will then eventually see from /proc/swaps that
> both .swap units are now active, but at that time it has already
> forked off both swapons, and one of them will then fail...
> 
> I wonder what we can do about this.
> 
> One approach could be to say that automatically discovered mounts and
> swaps are always dispatched before the ones from /etc/fstab. By
> serializing things it would be guaranteed that one of the swapons runs
> first, thus bringing up both .swap units, so that the second swapon
> would not be run, since the .swap unit would already be up...
> 
> That said, it would of course be nicer if we didn't have to
> serialize here...

Another idea might be to simply accept that activating the swap by two
names can happen concurrently, and teach swapon in some way to handle
this gracefully.

For example, swapon could learn a new switch --idempotent or so, which
we could always pass from systemd. If set, and if activating the swap
fails with EBUSY because the swap is already activated, it would eat
that up and return success.
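
The EBUSY handling described above could look roughly like the
following in C. This is only a sketch under assumptions, not actual
util-linux or systemd code: activate_swap_idempotent(), swapon_fn and
the two fake_* stand-ins are hypothetical names made up for
illustration. The one fact relied on is that swapon(2) fails with EBUSY
when the given path is already in use as a swap area. The activation
call is passed in as a function pointer so the logic can be tried
without root.

```c
#include <errno.h>
#include <stdio.h>

/* Same shape as swapon(2), so the real libc wrapper can be passed in. */
typedef int (*swapon_fn)(const char *path, int flags);

/*
 * Hypothetical helper: activate a swap area and treat "already active"
 * as success. Returns 0 on success, -1 on failure with errno set.
 */
static int activate_swap_idempotent(const char *path, int flags,
                                    swapon_fn do_swapon)
{
    if (do_swapon(path, flags) == 0)
        return 0;   /* we activated it ourselves */

    if (errno == EBUSY) {
        /* Somebody else got there first, possibly under another name. */
        fprintf(stderr, "%s: already active, ignoring\n", path);
        return 0;
    }

    return -1;      /* genuine failure, errno left untouched */
}

/* Stand-ins for swapon(2), for illustration only. */
static int fake_already_active(const char *path, int flags)
{
    (void)path; (void)flags;
    errno = EBUSY;  /* what swapon(2) reports for an active swap area */
    return -1;
}

static int fake_no_permission(const char *path, int flags)
{
    (void)path; (void)flags;
    errno = EPERM;  /* any other error must still be reported */
    return -1;
}
```

With <sys/swap.h> included, the real call could be passed in as
activate_swap_idempotent(path, 0, swapon). Only EBUSY is swallowed, so
any other failure (missing device, no permission) still surfaces as an
error.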

Lennart

-- 
Lennart Poettering, Red Hat