[PATCH] procfs/dmabuf: Add /proc/<pid>/task/<tid>/dmabuf_fds
Jann Horn
jannh at google.com
Wed Jan 27 10:47:29 UTC 2021
+jeffv from Android
On Tue, Jan 26, 2021 at 11:51 PM Kalesh Singh <kaleshsingh at google.com> wrote:
> In order to measure how much memory a process actually consumes, it is
> necessary to include the DMA buffer sizes for that process in the memory
> accounting. Since the handle to DMA buffers are raw FDs, it is important
> to be able to identify which processes have FD references to a DMA buffer.
Or you could try to let the DMA buffer take a reference on the
mm_struct and account its size into the mm_struct? That would probably
be nicer to work with than having to poke around in procfs separately
for DMA buffers.
> Currently, DMA buffer FDs can be accounted using /proc/<pid>/fd/* and
> /proc/<pid>/fdinfo -- both of which are only root readable, as follows:
That's not quite right. They can both also be accessed by the user
owning the process. Also, fdinfo is a standard interface for
inspecting process state that doesn't permit reading process memory or
manipulating process state - so I think it would be fine to permit
access to fdinfo under a PTRACE_MODE_READ_FSCRED check, just like the
interface you're suggesting.
> 1. Do a readlink on each FD.
> 2. If the target path begins with "/dmabuf", then the FD is a dmabuf FD.
> 3. stat the file to get the dmabuf inode number.
> 4. Read/ proc/<pid>/fdinfo/<fd>, to get the DMA buffer size.
>
> Android captures per-process system memory state when certain low memory
> events (e.g a foreground app kill) occur, to identify potential memory
> hoggers. To include a process’s dmabuf usage as part of its memory state,
> the data collection needs to be fast enough to reflect the memory state at
> the time of such events.
>
> Since reading /proc/<pid>/fd/ and /proc/<pid>/fdinfo/ requires root
> privileges, this approach is not suitable for production builds.
It should be easy to add enough information to /proc/<pid>/fdinfo/ so
that you don't need to look at /proc/<pid>/fd/ anymore.
> Granting
> root privileges even to a system process increases the attack surface and
> is highly undesirable. Additionally this is slow as it requires many
> context switches for searching and getting the dma-buf info.
What do you mean by "context switches"? Task switches or kernel/user
transitions (e.g. via syscall)?
> With the addition of per-buffer dmabuf stats in sysfs [1], the DMA buffer
> details can be queried using their unique inode numbers.
>
> This patch proposes adding a /proc/<pid>/task/<tid>/dmabuf_fds interface.
>
> /proc/<pid>/task/<tid>/dmabuf_fds contains a list of inode numbers for
> every DMA buffer FD that the task has. Entries with the same inode
> number can appear more than once, indicating the total FD references
> for the associated DMA buffer.
>
> If a thread shares the same files as the group leader then its
> dmabuf_fds file will be empty, as these dmabufs are reported by the
> group leader.
>
> The interface requires PTRACE_MODE_READ_FSCRED (same as /proc/<pid>/maps)
> and allows the efficient accounting of per-process DMA buffer usage without
> requiring root privileges. (See data below)
I'm not convinced that introducing a new procfs file for this is the
right way to go. And the idea of having to poke into multiple
different files in procfs and in sysfs just to be able to compute a
proper memory usage score for a process seems weird to me. "How much
memory is this process using" seems like the kind of question the
kernel ought to be able to answer (and the kernel needs to be able to
answer somewhat accurately so that its own OOM killer can do its job
properly)?
More information about the dri-devel
mailing list