Trash spec: directory size cache

David Faure faure at kde.org
Sun Apr 14 14:48:40 PDT 2013


To implement a maximum size for the trash directory, one needs to check the 
size every time a new item is being trashed. With the current spec, the only 
solution is to do a recursive traversal, which is pretty expensive.
To make this efficient, we need a cache.
My initial idea of a global "total size" cache doesn't work well with older 
implementations which don't update that value, so it gets out of date quickly.

Instead, Ryan Lortie and I came up with the following idea, which we would 
like to standardize into the trash spec:

For files, we get the size from stat. For directories, we use a cache:
in every trash directory, a metadata file is created, with one entry per 
directory that was trashed by the user (i.e. per top-level directory inside 
"files", not per nested sub-directory).
That entry contains the total size in bytes of the directory, and the 
modification time of the corresponding trashinfo file [*].

The metadata file uses desktop file syntax, where the key is the directory 
name, and the value is a pair: size, and mtime.

However, the desktop file standard restricts the characters allowed in keys, 
so instead of writing out the directory name itself, we write the sha1 of the 
directory name (a bit like the thumbnail spec, which uses an MD5 hash of the 
file URI).
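
As a rough illustration of how such a key could be computed, here is a small 
Python sketch. The function name cache_key is made up for this example, and 
hashing the UTF-8 bytes of the bare directory name (rather than, say, a 
percent-encoded form) is an assumption; the proposal does not pin down the 
encoding.

```python
import hashlib

def cache_key(dir_name: str) -> str:
    """Return the hex SHA-1 of a trashed directory's name.

    Assumption: the name is hashed as UTF-8 bytes, without any escaping.
    """
    return hashlib.sha1(dir_name.encode("utf-8")).hexdigest()

# A name with spaces and brackets would not be a valid desktop-file key,
# but its digest is always 40 lowercase hex characters.
key = cache_key("My Photos [2012]")
print(key)
```

The digest only contains [0-9a-f], so it fits the key syntax regardless of 
what the directory is called.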

In summary, it would look like this:

[Directories]
# One entry per sub-directory of the "files" directory
# key = sha1 of the directory name
# value = size in bytes, timestamp of the trashinfo file, in UTC
cb58e5c11a6802db43fd82ca8d3c7393353c0eab=25383,2009-07-11T20:18:30
f1d2d2f924e986ac86fdf7b36c94bcdf32beec15=2315,2012-04-12T10:05:20
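
To show that this format is easy to consume, here is a sketch that reads the 
example above with Python's configparser. The helper name parse_directories 
is hypothetical, and treating the timestamp as second-resolution UTC without 
a zone suffix is my reading of the example, not something the proposal 
spells out.

```python
import configparser
from datetime import datetime, timezone

SAMPLE = """\
[Directories]
# key = sha1 of the directory name
cb58e5c11a6802db43fd82ca8d3c7393353c0eab=25383,2009-07-11T20:18:30
f1d2d2f924e986ac86fdf7b36c94bcdf32beec15=2315,2012-04-12T10:05:20
"""

def parse_directories(text):
    """Map each sha1 key to (size_in_bytes, trashinfo_mtime_as_UTC_datetime)."""
    # Only "=" is a delimiter, so the ":" inside the timestamp is left alone.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(text)
    entries = {}
    for key, value in parser["Directories"].items():
        size, stamp = value.split(",", 1)
        mtime = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S")
        entries[key] = (int(size), mtime.replace(tzinfo=timezone.utc))
    return entries

entries = parse_directories(SAMPLE)
for key, (size, mtime) in entries.items():
    print(key, size, mtime.isoformat())
```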


To determine the size of the trash directory, this leads to the following 
algorithm:

totalsize = 0
prepare empty set of sha1s
list "files" directory, and for each entry:
     stat the entry
     if a file, totalsize += file size
     if a directory,
         stat the trashinfo file to get its mtime
         calculate sha1 of the directory name
         read entry from metadata file
         if entry found
             extract cached_size and cached_mtime
             if mtime != cached_mtime
                  re-calculate directory size
                  update entry (size of directory, mtime of trashinfo file)
             else
                  directory size = cached_size
         else
             calculate directory size
             write entry (size of directory, mtime of trashinfo file)
         totalsize += directory size
         add sha1 to set of seen sha1s
done

for each entry in the metadata file,
  if entry key is not in the set of seen sha1s
    remove entry
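
For concreteness, the algorithm above could be sketched in Python roughly as 
follows. This is only an illustration under several assumptions: the 
metadata file name ("metadata") is hypothetical since the proposal does not 
name it, a directory's size is taken to be the sum of the regular files 
below it, the key is the sha1 of the file name bytes, and error handling 
(e.g. a missing trashinfo file) is left out.

```python
import configparser
import hashlib
import os
from datetime import datetime, timezone

METADATA_NAME = "metadata"  # hypothetical; the proposal does not name the file

def tree_size(path):
    """Recursively sum the sizes of the files under path (symlinks not followed)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.lstat(os.path.join(root, name)).st_size
    return total

def trash_size(trash_dir):
    """Return the total size of trash_dir, refreshing the directory cache."""
    meta_path = os.path.join(trash_dir, METADATA_NAME)
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read(meta_path)  # a missing metadata file is silently ignored
    if not parser.has_section("Directories"):
        parser.add_section("Directories")
    cache = parser["Directories"]

    total, seen = 0, set()
    for entry in os.scandir(os.path.join(trash_dir, "files")):
        if not entry.is_dir(follow_symlinks=False):
            total += entry.stat(follow_symlinks=False).st_size
            continue
        # mtime of the trashinfo file, formatted like the example (UTC, seconds)
        info = os.path.join(trash_dir, "info", entry.name + ".trashinfo")
        stamp = datetime.fromtimestamp(
            int(os.stat(info).st_mtime), timezone.utc
        ).strftime("%Y-%m-%dT%H:%M:%S")
        key = hashlib.sha1(os.fsencode(entry.name)).hexdigest()

        size = None
        cached = cache.get(key)
        if cached is not None:
            cached_size, cached_stamp = cached.split(",", 1)
            if cached_stamp == stamp:
                size = int(cached_size)      # cache hit
        if size is None:                     # missing or stale entry
            size = tree_size(entry.path)
            cache[key] = f"{size},{stamp}"
        total += size
        seen.add(key)

    # Drop entries for directories that are no longer in the trash.
    for key in list(cache):
        if key not in seen:
            del cache[key]
    with open(meta_path, "w") as fh:
        parser.write(fh, space_around_delimiters=False)
    return total
```

Calling trash_size() on a trash directory (the one containing "files" and 
"info") returns the total and rewrites the metadata file as a side effect; a 
real implementation would probably only rewrite it when something changed.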


[*] We use the mtime of the trashinfo file rather than the mtime of the 
directory itself. If an older trash implementation restores the directory and 
trashes it again, the directory's own mtime does not change, but the trashinfo 
file is deleted and recreated, so its mtime changes and we can detect that the 
cache entry is stale.

If there is no objection, I will make a patch for the trash spec.

-- 
David Faure, faure at kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5


