[systemd-devel] Topics for the Linux Storage, Filesystem & MM Summit

David Timothy Strauss david at davidstrauss.net
Tue Mar 25 17:01:57 PDT 2014


Other than for the zswap discussions, I spent the whole time on the
storage and file system side. Below, I've filled in answers for the
topics that came up (or that I could ask about while staying on topic),
and I now have some good contacts if answering any of the others
proves essential.

Outside the file system/MM space, at least one Ganesha developer is
working on a comprehensive credential-setting/caching syscall that
might be useful for systemd's privilege dropping for starting
services. It would set credentials and return an fd that is usable to
rapidly set the same credentials again. He's planning on stuffing the
file descriptors into an LRU cache to both reduce syscalls and
accelerate the application of credentials in the kernel. His work is
focused on FS permissions, but it would make sense for it to be
available for general ones, too.
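
To make the caching idea concrete, here is a minimal sketch of the LRU
part only. The syscall itself does not exist yet, so make_cred_fd and
fake_cred_fd are hypothetical stand-ins of mine, not his interface; the
class name and the default capacity are likewise just assumptions.

```python
import os
from collections import OrderedDict


class CredentialFdCache:
    """LRU cache mapping (uid, gid, groups) to a reusable credential fd."""

    def __init__(self, make_cred_fd, capacity=64):
        # make_cred_fd stands in for the proposed syscall (hypothetical):
        # it takes (uid, gid, groups) and returns an open file descriptor.
        self._make_cred_fd = make_cred_fd
        self._capacity = capacity
        self._fds = OrderedDict()

    def get(self, uid, gid, groups=()):
        key = (uid, gid, tuple(groups))
        fd = self._fds.get(key)
        if fd is not None:
            self._fds.move_to_end(key)  # mark as most recently used
            return fd
        # Cache miss: pay for the full credential setup once.
        fd = self._make_cred_fd(uid, gid, key[2])
        self._fds[key] = fd
        if len(self._fds) > self._capacity:
            # Evict the least recently used entry and release its fd.
            _, old_fd = self._fds.popitem(last=False)
            os.close(old_fd)
        return fd


def fake_cred_fd(uid, gid, groups):
    # Placeholder only: opens /dev/null so the sketch runs without the
    # (not yet existing) syscall.
    return os.open(os.devnull, os.O_RDONLY)
```

A caller would do something like cache = CredentialFdCache(fake_cred_fd)
and then cache.get(1000, 1000) before each credential switch; only the
first call for a given identity would hit the slow path.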

It may be useful to see if we can use Trinity (the system call fuzzer)
for kdbus and systemd testing.

On Thu, Feb 27, 2014 at 3:09 PM, Lennart Poettering
<lennart at poettering.net> wrote:
> - quota for tmpfs
>
> - saner autofs elapse logic (no blocking ioctls...)
>
> - uuids for btrfs subvols
>
> - a umount (or maybe last-change) timestamp in the btrfs superblock,
>   which we can use to initialize the system clock from if a machine
>   lacks RTC or has a dead battery. Even better: a unified ioctl() to
>   query this from all file systems the same way.
>
> - fanotify: accessible to unprivileged users

I asked about the state of this given Ganesha's userspace operation. I
don't think there are any changes. Many services using fanotify seem
to require root privileges for other reasons, too.

> - fanotify: events for renames
>
> - fanotify: pass info about open() flags to monitoring processes
>
> - fanotify: when getting a notification for close, actually
>   get information whether the file was changed or not
>
> - an ioctl-based way to change FAT file system labels
>
> - cheaper xattrs. currently querying xattrs on most file systems is
>   prohibitively slow, since it results in seeks and whatnot. Which has
>   the result that pretty much nobody uses them. One way to make things better
>   would be to maybe expose in some fstat2() call a flag whether there
>   even are xattrs, so that apps could check for that flag before
>   actually trying to read them

This is being suggested as a readdir+ enhancement for Ganesha's sake.
It would only be a flag that xattrs exist.
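
For comparison, this is roughly what userspace has to do today to learn
whether a file has any xattrs at all: one extra call per file, which is
the cost such a flag would avoid. A minimal sketch, assuming Linux
(os.listxattr wraps listxattr(2)); the helper name has_xattrs is mine.

```python
import errno
import os


def has_xattrs(path):
    """Return True if path carries at least one extended attribute."""
    try:
        # One listxattr(2) call per file, without following symlinks.
        return len(os.listxattr(path, follow_symlinks=False)) > 0
    except OSError as e:
        if e.errno in (errno.ENOTSUP, errno.ENOSYS):
            # The file system (or kernel) has no xattr support at all.
            return False
        raise
```

An application walking a directory would call has_xattrs() on every
entry before deciding whether to read the attributes, whereas a flag
returned alongside the directory entry would make that check free.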

> - An API to query the birthtime of files. ext234 actually stores that
>   and keeps it up-to-date, but there's no API to get to this data
>
> - An fsetxattrat() call, so that race-free selinux relabelling can be
>   done
>
> - a way to mark an entire tree of mounts read-only with one call. i.e. a
>   working combination of MS_REC|MS_RDONLY
>
> - Allow creating read-only bind mounts in a single mount() invocation,
>   instead of requiring two. Similarly, a way to set the propagation
>   settings for a mount when one creates it, rather than requiring two
>   mount() invocations for that.
>
> - Swappiness control for individual pages via madvise()
>
> - volatile ranges

There was a session on this, but I went to Direct I/O instead.
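
On the read-only bind mount item quoted above: for anyone who has not
run into it, this is the two-call sequence in question. A minimal
sketch, assuming Linux with glibc and CAP_SYS_ADMIN; the flag values
are the ones from <sys/mount.h>, and the helper names are mine.

```python
import ctypes
import os

# Flag values from <sys/mount.h> on Linux.
MS_RDONLY = 1
MS_REMOUNT = 32
MS_BIND = 4096

# CDLL(None) resolves symbols from the C library already loaded into
# the interpreter.
_libc = ctypes.CDLL(None, use_errno=True)
_libc.mount.argtypes = (ctypes.c_char_p, ctypes.c_char_p, ctypes.c_char_p,
                        ctypes.c_ulong, ctypes.c_void_p)
_libc.mount.restype = ctypes.c_int


def _mount(source, target, flags):
    src = os.fsencode(source) if source is not None else None
    if _libc.mount(src, os.fsencode(target), None, flags, None) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), target)


def bind_mount_readonly(source, target):
    """Bind-mount source on target read-only; needs CAP_SYS_ADMIN."""
    # Step 1: plain bind mount. Flags such as MS_RDONLY are ignored here.
    _mount(source, target, MS_BIND)
    # Step 2: remount the bind mount so the read-only flag takes effect.
    _mount(None, target, MS_BIND | MS_REMOUNT | MS_RDONLY)
```

Between the two calls the new mount is briefly writable, which is part
of why a single-invocation variant would be nicer.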

> - A better SIGBUS/SIGSEGV API (for accessing invalidated memory maps),
>   that actually works for libraries. i.e. a sane way how libraries can
>   register handlers for specific memory regions they manage. Currently
>   there can only be one handler for the entire process which makes this
>   totally unavailable for libraries, since they'd always step on each
>   other's toes. Probably a hard one to get into the brains of kernel guys,
>   since for them that is a userspace problem.

