Trash spec: caching the size of the trash directory

Alexander Larsson alexl at redhat.com
Fri Aug 21 01:35:21 PDT 2009


On Wed, 2009-08-19 at 16:30 +0200, David Faure wrote:
> On Wednesday 19 August 2009, Alexander Larsson wrote:
> > On Wed, 2009-08-19 at 10:28 +0200, David Faure wrote:
> > > On Wednesday 19 August 2009, Alexander Larsson wrote:
> > > > On Thu, 2009-08-13 at 22:17 +0200, David Faure wrote:
> > > > > And then it's up to implementations to update that value, when
> > > > > adding a file to the trash, restoring a trashed file, deleting a
> > > > > trashed file, and when emptying the trash.
> > > > >
> > > > > Is this OK with other implementors of the trash spec? Can it be added
> > > > > to the spec?
> > > >
> > > > I'm a bit concerned about the performance costs of this. It means that
> > > > you need to do a deep recursion into a folder counting its size when it
> > > > is trashed. Whereas otherwise trashing a large folder structure is O(1)
> > > > it now becomes O(number of files) and does a lot more I/O.
> > >
> > > Correct. However the alternative (in order to implement a max size of the
> > > trash) is that every operation that puts a new file into the trash is
> > > O(number of files in the trash), which is clearly a lot worse!
> > >
> > > Do you agree that the feature "maximum size of the trash" is useful?
> > > I know many users who do ;-). Yes, even at the expense of a bit of
> > > performance (especially if the size-counting can be done without blocking
> > > the user interface, as is the case for us since the actual trash
> > > operations are handled in the kioslave which is a separate process; I
> > > think it's similar in gio?).
> >
> > What if instead we store the recursive size of each directory that is
> > trashed in the trashinfo file? Then this could be used to do a pretty
> > fast size count. If some app did not do this we could then fall back to
> > recursing into it (and perhaps even update the trashinfo file for next
> > time).
> 
> That would be better than nothing. But if you trash 10000 single files
> at the toplevel of the trash (no directories), then the performance will be
> as slow as it is today. And for those, reading the size from the .trashinfo
> file is probably slower than just doing a stat() on the file itself...

Yeah, the size attribute would only make sense for directories.
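
A rough sketch of what that size count could look like, assuming the usual
files/ and info/ layout of a trash directory. The key name X-RecursiveSize
and the function names are made up for illustration; nothing like this is
in the spec. Plain files are stat()ed directly, directories use the stored
value when present and fall back to recursing otherwise.

```python
import os

SIZE_KEY = "X-RecursiveSize"  # hypothetical key name, not part of the spec


def recursive_size(path):
    """Slow fallback: sum lstat() sizes of a directory and everything below it."""
    total = os.lstat(path).st_size
    for root, dirs, files in os.walk(path):
        for name in dirs + files:
            total += os.lstat(os.path.join(root, name)).st_size
    return total


def cached_dir_size(info_path):
    """Return the size stored in a .trashinfo file, or None if it is absent."""
    try:
        with open(info_path, encoding="utf-8") as f:
            for line in f:
                key, sep, value = line.strip().partition("=")
                if sep and key == SIZE_KEY and value.isdigit():
                    return int(value)
    except FileNotFoundError:
        pass
    return None


def trash_size(trash_dir):
    """Total size of the trash, using cached directory sizes where available."""
    total = 0
    files_dir = os.path.join(trash_dir, "files")
    for name in os.listdir(files_dir):
        path = os.path.join(files_dir, name)
        if os.path.isdir(path) and not os.path.islink(path):
            info = os.path.join(trash_dir, "info", name + ".trashinfo")
            size = cached_dir_size(info)
            if size is None:
                # Some app did not store the size: recurse instead.
                size = recursive_size(path)
            total += size
        else:
            # For plain files a single lstat() is cheap enough.
            total += os.lstat(path).st_size
    return total
```

An implementation could also write the computed value back into the
trashinfo file in the fallback case, so the recursion is only paid once.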

I'm a bit scared about doing incremental updates of the cached total
value. Say you trash a one meg file, you read the cache file and take
the total there, add one meg and then write it back. This is rife with
race conditions for concurrent trashes, and unless every app that
trashes or deletes trashed files gets things exactly right, the
cached total could go out of sync with reality.
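
To make the race concrete, here is a small sketch of the lost update. It
replays one bad interleaving by hand rather than using real concurrent
processes; the cache file name and the helper names are assumptions for
illustration only.

```python
import os
import tempfile


def read_total(cache_path):
    """Read the cached total, treating a missing or empty file as 0."""
    try:
        with open(cache_path) as f:
            return int(f.read().strip() or 0)
    except FileNotFoundError:
        return 0


def write_total(cache_path, total):
    """Overwrite the cache file with a new total."""
    with open(cache_path, "w") as f:
        f.write(str(total))


def demo_lost_update():
    """Two apps trash a file each; both read the cache before either writes."""
    with tempfile.TemporaryDirectory() as d:
        cache = os.path.join(d, "size-cache")  # hypothetical cache file name
        write_total(cache, 0)
        seen_by_a = read_total(cache)
        seen_by_b = read_total(cache)
        write_total(cache, seen_by_a + 1_000_000)  # app A adds its one meg file
        write_total(cache, seen_by_b + 500_000)    # app B overwrites A's update
        return read_total(cache)
```

Here the trash would hold 1,500,000 bytes, but demo_lost_update() returns
500000 because app A's addition is lost. Avoiding this would need some form
of locking around the read and the write.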

For instance, start from a trash with a single file of 100 bytes. Some
app trashes a file of 50 bytes and forgets to update the size cache,
then the user removes the initial file from the trash. Now the cache
file says 0 bytes although the trash holds 50, and the mtime of the
cache file is later than that of any other file in the trash, so this
cannot be detected.
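
The same scenario as a toy model, just to check the arithmetic. The Trash
class below is purely illustrative and keeps everything in memory; it is
not how any real implementation stores things.

```python
class Trash:
    """In-memory stand-in for a trash directory plus its size cache."""

    def __init__(self):
        self.files = {}        # name -> size in bytes
        self.cached_total = 0  # what the cache file would contain

    def trash(self, name, size, update_cache=True):
        self.files[name] = size
        if update_cache:
            self.cached_total += size

    def delete(self, name):
        # A well-behaved app subtracts the size of the file it removes.
        self.cached_total -= self.files.pop(name)

    def real_total(self):
        return sum(self.files.values())


def scenario():
    t = Trash()
    t.trash("initial", 100)
    t.trash("forgotten", 50, update_cache=False)  # app forgets the cache
    t.delete("initial")
    return t.cached_total, t.real_total()
```

scenario() returns (0, 50): the cache claims an empty trash while 50 bytes
are still in it, and every later incremental update carries that error
along.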



