Trash spec: directory size cache

Mon Apr 15 01:19:55 PDT 2013

On Mon, 2013-04-15 at 10:07 +0200, Alexander Larsson wrote:
> On Sun, 2013-04-14 at 23:48 +0200, David Faure wrote:
> > To implement a maximum size for the trash directory, one needs to check the 
> > size every time a new item is being trashed. With the current spec, the only 
> > solution is to do a recursive traversal, which is pretty expensive.
> > To make this efficient, we need a cache.
> > My initial idea of a global "total size" cache doesn't work well with older 
> > implementations which don't update that value, so it gets out of date quickly.
> > 
> > Instead, Ryan Lortie and I came up with the following idea, which we would 
> > like to standardize into the trash spec:
> > 
> > For files, we get the size from stat. For dirs, we use a cache:
> > in every trash directory, a metadata file is created, with one entry per 
> > directory (that was trashed by the user).
> > That entry contains the total size in bytes of the directory, and the 
> > modification time of the trashinfo file [*].
> > 
> > The metadata file uses desktop file syntax, where the key is the directory 
> > name, and the value is a pair: size, and mtime.
> > 
> > However the desktop file standard restricts the available characters for keys, 
> > so instead of just writing out the directory name, we write the sha1 of the 
> > directory name (a bit like the thumbnail spec uses sha1s too).
> > 
> > In summary, it would look like this:
> > 
> > [Directories]
> > # One entry per sub-directory of the "files" directory
> > # key = sha1 of the directory name
> > #  value = size in bytes, timestamp of the trashinfo file, in UTC
> > cb58e5c11a6802db43fd82ca8d3c7393353c0eab=25383,2009-07-11T20:18:30
> > f1d2d2f924e986ac86fdf7b36c94bcdf32beec15=2315,2012-04-12T10:05:20
> 
> In general this sounds good to me. I have two minor objections:
> 
> 1: Using sha1 seems wrong to me. There is no need to get an even
> distribution of the keys (like for thumbnail subdirectories), and a sha1
> is slow to calculate. Also, if you ever look at the file manually it
> says very little. I would much prefer a simple character escape model: say
> you allow A-Za-z0-9 and escape everything else as "-" + the hex
> digits (like "-2d" for "-"). Such keys are valid desktop file keys, are
> cheap to calculate, and keep most files readable by humans.

Or base-64 encode the directory name. Even if that fails the "readable
by humans" test, at least it'll avoid escape implementations that are
each buggy in slightly different ways.
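
For comparison, here is a rough Python sketch of the two key encodings
discussed above. The function names are made up for illustration and
nothing here is part of the spec; it also assumes directory names are
encoded as UTF-8 before being escaped byte by byte.

```python
import base64
import string

# Characters kept verbatim by the escape scheme (A-Za-z0-9).
SAFE = set(string.ascii_letters + string.digits)


def escape_key(name):
    """Escape every byte outside A-Za-z0-9 as "-" plus two hex digits."""
    out = []
    for byte in name.encode("utf-8"):
        ch = chr(byte)
        out.append(ch if ch in SAFE else "-%02x" % byte)
    return "".join(out)


def unescape_key(key):
    """Reverse escape_key(); raises ValueError on a malformed escape."""
    raw = bytearray()
    i = 0
    while i < len(key):
        if key[i] == "-":
            raw.append(int(key[i + 1:i + 3], 16))
            i += 3
        else:
            raw.append(ord(key[i]))
            i += 1
    return raw.decode("utf-8")


def base64_key(name):
    """Plain base64 of the UTF-8 directory name."""
    return base64.b64encode(name.encode("utf-8")).decode("ascii")


print(escape_key("my-dir"))   # my-2ddir
print(base64_key("my-dir"))
```

One caveat with the base64 variant: the standard base64 alphabet includes
"+", "/" and "=", which are outside the A-Za-z0-9- set that desktop file
keys allow, so a variant alphabet or an extra escaping step would probably
be needed.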

> 2: Don't store the mtime in a format that needs parsing. Time and date
> parsing is a very complicated area that is easy to get wrong. And the
> source is always a stat, which is in epoch format, so why not just save it
> in the same format to avoid any day/month order issues, timezone
> weirdness, or whatnot?

ISO 8601 time format requires very little work to parse, and uses
GMT/UTC. It also passes the human-readable test ;)
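
To make the trade-off concrete, here is a small Python sketch that
converts between the two timestamp forms and reads a "size,mtime" value
in either one. The helper names are hypothetical, and it assumes the ISO
8601 form is always written in UTC without an offset suffix, as in the
example entries quoted above.

```python
from datetime import datetime, timezone

# Assumed layout of the ISO 8601 timestamps, taken from the example entries.
ISO_FORMAT = "%Y-%m-%dT%H:%M:%S"


def epoch_to_iso(epoch):
    """Format seconds since the epoch as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc).strftime(ISO_FORMAT)


def iso_to_epoch(stamp):
    """Parse an ISO 8601 UTC string back into seconds since the epoch."""
    parsed = datetime.strptime(stamp, ISO_FORMAT).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


def parse_value(value):
    """Split a "size,mtime" value; accepts an epoch or an ISO 8601 mtime."""
    size_text, mtime_text = value.split(",", 1)
    if mtime_text.isdigit():
        mtime = int(mtime_text)
    else:
        mtime = iso_to_epoch(mtime_text)
    return int(size_text), mtime


size, mtime = parse_value("25383,2009-07-11T20:18:30")
print(size, mtime, epoch_to_iso(mtime))
```

With the epoch form the stored value can be compared directly against
st_mtime from stat; with the ISO form one conversion step is needed
first, but the file stays readable.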