[systemd-devel] how to let systemd hibernate start/stop the swap area?

Sat Apr 1 05:42:03 UTC 2023

On Sat, 1 Apr 2023, Uoti Urpala wrote:
> On Sat, 2023-04-01 at 06:16 +1100, Michael Chapman wrote:
> > On Fri, 31 Mar 2023, Lennart Poettering wrote:
> > [...]
> > > Presumably your system mmaps ELF binaries, VM images, and similar
> > > stuff into memory. If you don't allow anonymous memory to be backed out
> > > onto swap, then you're basically telling the kernel "please page
> > > my program code out instead". Which is typically a lot worse.
> > 
> > Yes, but my point is that it _doesn't matter_ if SSH or journald or 
> > whatever is in memory or needs to be paged back in again. It's such a tiny 
> > fraction of the system's overall workload.
> 
> That contradicts what you said earlier about the system actually
> writing a significant amount of data to swap. If, when swap was
> enabled, the system wrote a large amount of data to the swap, that
> implies there must be a large amount of some other data that it was
> able to keep in memory instead.

Buffer cache. Often stuff that the guests never ended up needing again, or 
at least could survive the penalty of having it read back off disk again.

Of course the host kernel didn't know that, since it cannot predict the 
future. All it knows is that IO is happening, and there are idle pages in 
the guest. Of course it's going to steadily push those idle pages out to 
swap. And the graphs I had at the time showed a very nice linearly
increasing swap usage -- until the swap was full.
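
For anyone wanting to reproduce that kind of graph, here is a minimal sketch
of how swap usage could be sampled on the host. It is not the tooling that
was actually used at the time; the helper names (parse_meminfo,
swap_used_kib, read_swap_used_kib) and the SAMPLE text are made up for
illustration. It relies only on the SwapTotal and SwapFree fields of
/proc/meminfo.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def swap_used_kib(info):
    """Swap currently in use, computed as SwapTotal - SwapFree."""
    return info["SwapTotal"] - info["SwapFree"]


def read_swap_used_kib(path="/proc/meminfo"):
    """Read the live value on a Linux host (hypothetical convenience wrapper)."""
    with open(path) as f:
        return swap_used_kib(parse_meminfo(f.read()))


# Illustrative sample only, not data from the system discussed here.
SAMPLE = """MemTotal:       16384000 kB
SwapTotal:       8388604 kB
SwapFree:        6291452 kB
HugePages_Total:       0
"""
```

Calling read_swap_used_kib() at a fixed interval and plotting the results
against time would show whether swap usage climbs steadily in the way
described above.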

> Linux should not write all information
> from memory to swap just to leave the memory empty and without any
> useful content - everything written to swap should correspond to
> something else kept in memory.
> 
> So if you say that the swap use was overall harmful for behavior,
> claiming that the *size* of other data kept in memory was too small to
> matter doesn't really make any sense. If the swap use was significant,
> then it should have kept a significant amount of some other data in
> memory, either for the main OS or for the guests.

The "harmful behaviour" was the fact that _when_ those guests needed to be 
swapped in, that was unpleasantly slow.

The existence of swap had little to no effect on the running behaviour of 
the guests themselves -- as I keep saying, when you have enough buffer 
cache on the host, having "a bit more" because you've got swap as well 
does very little. You're already in the long tail of your performance 
graphs.

Can I make this any simpler? How about this:

* Whether swap was there or not had _no_ measurable effect on the guests' 
  performance.
* Having swap meant there was a large swap-in penalty in certain 
  circumstances. (Migration was one of them. "Rebooting a Windows VM" was
  another, since Windows apparently likes to zero all of its RAM.)

Does it make sense now?