Cleaning of $XDG_CACHE_HOME and $XDG_CACHE_HOME/thumbnails
Bollinger, John C
John.Bollinger at STJUDE.ORG
Wed Feb 26 19:19:29 UTC 2020
On Wednesday, February 26, 2020 11:19 AM, Benjamin Berg wrote:
> So, let's try to clarify a few points.
> I do think that it is a good idea to use tmpfiles.d and for applications to ship appropriate configurations. There seems to be an overall agreement that tmpfiles.d is an appropriate way of doing this and that we could standardise on it.
Let's do get some clarity on this point, as various comments have seemed to say different things. Are you suggesting that the *configuration format* consumed by systemd-tmpfiles be reused by some desktop component, or are you suggesting that cleanup tasks be delegated to systemd-tmpfiles itself? Because the latter is a complete non-starter. Even on those systems that have systemd-tmpfiles and have it enabled, it cannot be relied upon to clean up anything in user directories, because such directories may not be accessible to it. On the systems I manage, for example, regular users' home directories reside on a network filesystem, and local-machine administrative accounts do not have any privileges there. Moreover, regular users do not have authority to modify the configuration of systemd-tmpfiles, yet they must be afforded ultimate control over any cleanup policy applying to the space allotted to them.
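To illustrate the first reading, in which cleanup runs with the user's own privileges rather than as a system service, here is a minimal sketch. It is not an existing tool and does not parse tmpfiles.d files; the function names `cache_home`, `stale_files`, and `clean`, and the age threshold passed in, are assumptions made for illustration only.

```python
import os
import stat
import time
from pathlib import Path
from typing import List, Optional


def cache_home() -> Path:
    """Resolve $XDG_CACHE_HOME, falling back to ~/.cache when it is
    unset, empty, or not an absolute path (per the XDG Base Directory spec)."""
    value = os.environ.get("XDG_CACHE_HOME", "")
    return Path(value) if os.path.isabs(value) else Path.home() / ".cache"


def stale_files(root: Path, max_age_days: float,
                now: Optional[float] = None) -> List[Path]:
    """List regular files under root whose access AND modification times
    are both older than max_age_days. Symlinks are never followed."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                st = path.lstat()
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and max(st.st_atime, st.st_mtime) < cutoff:
                stale.append(path)
    return sorted(stale)


def clean(root: Path, max_age_days: float, dry_run: bool = True) -> List[Path]:
    """Return the stale files; delete them only when dry_run is False."""
    targets = stale_files(root, max_age_days)
    if not dry_run:
        for path in targets:
            path.unlink(missing_ok=True)
    return targets
```

Because this runs inside the user's own session, it sees a network-mounted home directory with the user's privileges, and the user decides whether to run it at all. Calling `clean(cache_home(), 30)` only reports candidates; nothing is deleted unless `dry_run=False` is passed explicitly.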
> The strong argument in favour is that we have an external service that guarantees cleanup even if the application is not running regularly.
> For example, I have multiple .cache/* directories with relevant amounts of junk data for applications that I have not used in months to years.
Why is that a strong argument? Perhaps this is part of the issue. I don't see any particular imperative for cleaning up cache files. As far as I'm concerned, it is a "nice-to-have" feature, not a "must-have", and nice only if it does not interfere with my work. I agree that it would be particularly nice to have a common component that is responsible for such cleanup so that one does not need to rely on individual applications to manage their own cache usage directly. That's a feature that could even be sold to application developers, but ultimately, and perhaps this is my key point, users should be responsible for their own usage. If they are using too much, then that's what storage quotas are for. If they are not using too much, then what's the big deal about how they are using it?
> But, I would like to go one step further and make this an opt-out.
> Which is what appears to be triggering the opposition here. The argument in favour is a bit more complicated, but I think it boils down to:
> * The behaviour becomes explicit rather than an implicit "keep
> forever" policy. I do feel that this is really good in principle.
What do you think is good about that, exactly? What actual problem does it solve, that cannot be addressed as well or better via some other mechanism? "Alice is using a lot of disk space" is not a problem in itself, nor is "Bob has a lot of files that he hasn't accessed in months".
> * In principle, I don't think it is sane to keep caches forever.
Perhaps the problem is with your conception of what caches, in the sense of $XDG_CACHE_HOME, are or are for. I get the impression that you have a pretty narrow view of that, which may be coloring your opinion here.
But even if all uses of $XDG_CACHE_HOME were analogous to browser caches and thumbnail caches (which is not the case), what is insane about leaving the cache contents lying around indefinitely, if you don't need the space for anything else?
> So I
> do believe it makes sense to define the expectation that
> $XDG_CACHE_HOME is cleaned eventually even if the application is not
> run regularly.
I am doubtful that you'll attract a majority to that position, at least in its full generality, and I am confident that you will not develop a consensus around it.
> * If a user just stops using an application and removes the package,
> then we should clean the cache. This works very nicely with an opt-
> out solution, as the tmpfiles.d config is removed and the default
> configuration kicks in and cleans things up.
That the package has been removed by a sysadmin is not a clear indication that it is reasonable to remove users' associated cache files. Even if we interpret it that way, though, there's an important distinction between "can be removed" and "should be removed". As a matter of general policy, the system should be exceedingly careful about mucking with user files.
As an additional practical matter, if users can install custom cache-management configuration -- as indeed they should be allowed to do -- then an opt-out solution does not ensure that removing the package would cause cleanup to kick in for all users.
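A minimal sketch of that interaction, assuming a hypothetical precedence of user configuration over package-shipped configuration over a system default (the names `effective_max_age` and `DEFAULT_MAX_AGE_DAYS`, and the 30-day figure, are invented for illustration and are not taken from any proposal):

```python
from typing import Dict, Optional

# Hypothetical system-wide default: clean entries older than 30 days.
DEFAULT_MAX_AGE_DAYS: Optional[int] = 30

# A policy maps an application name to a maximum age in days;
# None means "keep forever" (that is, the application has opted out).
Policy = Dict[str, Optional[int]]


def effective_max_age(app: str, package_policy: Policy,
                      user_policy: Policy) -> Optional[int]:
    """Resolve the cleanup age for app: user config wins, then the
    package-shipped config, then the system default."""
    if app in user_policy:
        return user_policy[app]
    if app in package_policy:
        return package_policy[app]
    return DEFAULT_MAX_AGE_DAYS


# While the package is installed, its shipped opt-out applies.
installed = {"someapp": None}
print(effective_max_age("someapp", installed, {}))

# After the package is removed, the default kicks in for users
# who have no configuration of their own ...
removed: Policy = {}
print(effective_max_age("someapp", removed, {}))

# ... but not for a user who installed a personal opt-out.
print(effective_max_age("someapp", removed, {"someapp": None}))
```

Under these assumptions, removing the package changes the outcome only for users who never set a policy of their own, so package removal cannot be relied on to trigger cleanup everywhere.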
> And yes, I do agree that it may well be a bit painful at the point when the switch is flipped. I wouldn't expect that to happen for another 1-2 years after the specification changed though; and I expect that some distributions would play it safe and would wait even longer.
I do not expect any distribution ever to "flip the switch", and this is only partly about the immediate pain that would be involved. It is more about the general principle that only in truly exceptional circumstances should anyone other than the user and/or agents acting at their direction be empowered to manage the contents of the storage afforded to the user.
> So yeah, I don't feel that sticking to the status-quo of never deleting application caches is sane.
I do not accept that as an accurate characterization of the status quo. There is presently little _automatic_ deletion of cache content, especially if you consider only centrally-managed deletion, but that's not the same thing as never deleting. Even if it were, however, it's not clear what leads you to characterize such a situation as insane. I do not accept "because the disk will eventually fill up" as a supporting argument because (1) that's not a certainty in practice, and (2) if the disk (or at least the user's quota) fills, then that gives the user plenty of motivation to clear cached data, and when they do, we are no longer in "never deleting" territory.
> I am happy to reconsider my position. But I really don't find it a very convincing counterargument that the transition may be painful in a few cases.
Some specific painful events may be a lot more consequential than you estimate, but that's not the main basis for my objections. It is a matter of principle for me. I take it as axiomatic that the system should not make decisions on behalf of users about the uses to which they put the storage allotted to them. I accept some narrow categories of exceptions, but the proposed opt-out-only cache cleanup is nowhere near any of them.