Trash spec: directory size cache

Mon Apr 15 02:11:42 PDT 2013

On Monday 15 April 2013 10:19:55 Bastien Nocera wrote:
> On Mon, 2013-04-15 at 10:07 +0200, Alexander Larsson wrote:
> > On Sun, 2013-04-14 at 23:48 +0200, David Faure wrote:
> > > To implement a maximum size for the trash directory, one needs to check
> > > the size every time a new item is being trashed. With the current spec,
> > > the only solution is to do a recursive traversal, which is pretty
> > > expensive. To make this efficient, we need a cache.
> > > My initial idea of a global "total size" cache doesn't work well with
> > > older implementations which don't update that value, so it gets out of
> > > date quickly.
> > > 
> > > Instead, Ryan Lortie and I came up with the following idea, which we
> > > would like to standardize into the trash spec:
> > > 
> > > For files, we get the size from stat. For dirs, we use a cache:
> > > in every trash directory, a metadata file is created, with one entry per
> > > directory (that was trashed by the user).
> > > That entry contains the total size in bytes of the directory, and the
> > > modification time of the trashinfo file [*].
> > > 
> > > The metadata file uses desktop file syntax, where the key is the
> > > directory name, and the value is a pair: size, and mtime.
> > > 
> > > However the desktop file standard restricts the available characters for
> > > keys, so instead of just writing out the directory name, we write the
> > > sha1 of the directory name (a bit like the thumbnail spec uses sha1s
> > > too).
> > > 
> > > In summary, it would look like this:
> > > 
> > > [Directories]
> > > # One entry per sub-directory of the "files" directory
> > > # key = sha1 of the directory name
> > > # value = size in bytes, timestamp of the trashinfo file, in UTC
> > > cb58e5c11a6802db43fd82ca8d3c7393353c0eab=25383,2009-07-11T20:18:30
> > > f1d2d2f924e986ac86fdf7b36c94bcdf32beec15=2315,2012-04-12T10:05:20
> > 
> > In general this sounds good to me. I have two minor objections:
> > 
> > 1: Using sha1 seems wrong to me. There is no need to get an even
> > distribution of the keys (like for thumbnail subdirectories), and a sha1
> > is slow to calculate. Also, if you ever look at the file manually it
> > says very little. I would much prefer a simple character escape model: say
> > you allow A-Za-z0-9 and everything else you escape as "-" + the hex
> > digits (like "-2d" for "-"). These are valid desktop file keys, are cheap
> > to calculate and make most files readable by humans.
> 
> Or base-64 encode the directory name. Even if that fails the "readable
> by humans" test, at least it'll avoid slightly differently buggy escape
> sequences.

Base-64 leads to huge output. I think escaping with '-'+hex is fine (possibly 
faster, more readable, and not as big). I don't see the "buggy" argument.
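
To make the '-'+hex proposal concrete, here is a rough Python sketch of how
such an escaping could work. The function names escape_key and unescape_key
are made up for illustration, and the choice to escape the UTF-8 bytes of the
name with two lowercase hex digits each is my assumption; none of that has
been agreed on in this thread.

```python
import base64
import string

# Bytes that are written out unchanged: A-Za-z0-9, as suggested in the thread.
SAFE_BYTES = frozenset((string.ascii_letters + string.digits).encode("ascii"))


def escape_key(name):
    """Escape a directory name into a desktop-file key.

    Assumption: the name is encoded as UTF-8 and every byte outside
    A-Za-z0-9 becomes '-' followed by two lowercase hex digits.
    """
    return "".join(
        chr(b) if b in SAFE_BYTES else "-%02x" % b
        for b in name.encode("utf-8")
    )


def unescape_key(key):
    """Reverse escape_key(): turn '-xx' sequences back into bytes."""
    raw = bytearray()
    i = 0
    while i < len(key):
        if key[i] == "-":
            raw.append(int(key[i + 1:i + 3], 16))
            i += 3
        else:
            raw.append(ord(key[i]))
            i += 1
    return raw.decode("utf-8")


if __name__ == "__main__":
    for name in ["Documents", "my-photos 2012"]:
        escaped = escape_key(name)
        b64 = base64.b64encode(name.encode("utf-8")).decode("ascii")
        print(name, "->", escaped, "| base64:", b64)
```

Mostly alphanumeric names stay readable and come out shorter than their
base-64 form. A name made up largely of other bytes grows to as much as three
times its length, so the size argument depends on what typical directory
names look like.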

-- 
David Faure, faure at kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5